Find Python or JavaScript libraries to add NLP capabilities like sentiment analysis or text classification to your application.
Locate public datasets to train and evaluate machine translation or question-answering models.
Discover research papers and newsletters to stay current with advances in tokenization, named entity recognition, and other core NLP techniques.
Access language-specific resources for building NLP tools that work with Arabic, Chinese, Spanish, or other non-English languages.
awesome-nlp is a curated directory of resources for Natural Language Processing (NLP), the field of computer science and linguistics concerned with teaching computers to understand, analyze, and generate human language. The list is organized into sections, including:
Research and trends: paper archives, newsletters, and illustrated explanations of influential methods.
Libraries: NLP software libraries for Node.js, Python, C++, Java, Kotlin, Scala, R, Clojure, Ruby, Rust, Julia, and a specialized language called NLP++.
Tasks and methods: tokenization (breaking text into words or subwords), part-of-speech tagging, named entity recognition (identifying people, places, and organizations), text classification, sentiment analysis, topic modeling, summarization, machine translation, and question answering.
Datasets: publicly available text corpora for training and evaluation.
Language-specific sections: resources for Arabic, Chinese, Danish, Dutch, German, Hungarian, Indonesian, Korean, Persian, Polish, Portuguese, Spanish, Thai, Ukrainian, Urdu, Vietnamese, and others.
The list explicitly scopes itself to core NLP tasks. It notes that general-purpose chatbots, agent frameworks, prompt templates, and code generation tools belong in other lists, and it includes large language models only where they directly advance a specific NLP task such as summarization or machine translation.
Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.