explaingit

keon/awesome-nlp

📈 Trending · 18,529 · Audience · developer · Complexity · 1/5 · Active · License · Setup · easy

TLDR

A curated directory of NLP resources: research papers, libraries, datasets, and tools for understanding and processing human language, with libraries in a dozen programming languages and resources for 15+ human languages.

Mindmap

mindmap
  root((awesome-nlp))
    Research and Trends
      Paper archives
      Newsletters
      Method explanations
    Libraries
      Python tools
      JavaScript tools
      Java and Scala
      Rust and Julia
    Core NLP Tasks
      Tokenization
      Named entity recognition
      Sentiment analysis
      Machine translation
      Question answering
    Datasets
      Text corpora
      Training data
      Evaluation benchmarks
    Language Support
      Arabic and Chinese
      European languages
      Asian languages
      Other languages

Things people build with this

USE CASE 1

Find Python or JavaScript libraries to add NLP capabilities like sentiment analysis or text classification to your application.

USE CASE 2

Locate public datasets to train and evaluate machine translation or question-answering models.

USE CASE 3

Discover research papers and newsletters to stay current with advances in tokenization, named entity recognition, and other core NLP techniques.

USE CASE 4

Access language-specific resources for building NLP tools that work with Arabic, Chinese, Spanish, or other non-English languages.

Tech stack

Python · JavaScript · Java · C++ · Rust · Scala · R · Julia

Getting it running

Difficulty · easy
Time to first run · 5 min
Released to the public domain. No attribution required.

In plain English

awesome-nlp is a curated directory of resources dedicated to Natural Language Processing (NLP), the field of computer science and linguistics concerned with teaching computers to understand, analyze, and generate human language. The list is organized into several sections.

Research and trends: links to paper archives, newsletters, and illustrated explanations of influential methods.

Libraries: NLP software libraries across Node.js, Python, C++, Java, Kotlin, Scala, R, Clojure, Ruby, Rust, Julia, and a specialized language called NLP++.

Tasks and methods: tokenization (breaking text into words or subwords), part-of-speech tagging, named entity recognition (identifying people, places, organizations), text classification, sentiment analysis, topic modeling, summarization, machine translation, and question answering.

Datasets: links to publicly available text corpora for training and evaluation.

Language-specific sections: resources for Arabic, Chinese, Danish, Dutch, German, Hungarian, Indonesian, Korean, Persian, Polish, Portuguese, Spanish, Thai, Ukrainian, Urdu, Vietnamese, and others.

The list explicitly scopes itself to core NLP tasks. It notes that general-purpose chatbots, agent frameworks, prompt templates, and code generation tools belong in other lists. Large language models are in scope only where they directly advance a specific NLP task such as summarization or machine translation.
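
To make the tokenization task mentioned above more concrete (breaking text into words or subwords), here is a minimal sketch using only the Python standard library. It is not taken from awesome-nlp or from any library the list links to. The names `tokenize` and `char_ngrams` are hypothetical helpers introduced for illustration, and the character n-grams are only a rough stand-in for the learned subword vocabularies that real tokenizers use.

```python
import re

# Hypothetical pattern for illustration: a word (optionally with one
# apostrophe-joined suffix such as "Don't"), or a single punctuation mark.
TOKEN_PATTERN = re.compile(r"\w+(?:'\w+)?|[^\w\s]")


def tokenize(text: str) -> list[str]:
    """Split text into word tokens and single-character punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


def char_ngrams(word: str, n: int = 3) -> list[str]:
    """Return overlapping character n-grams of a word.

    Assumption: this is a crude approximation of subword units, not a
    trained subword tokenizer.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


if __name__ == "__main__":
    print(tokenize("Don't panic: NLP is fun!"))
    print(char_ngrams("token"))
```

A regular expression like this is enough for quick experiments on English text. Languages without whitespace word boundaries, such as Chinese or Thai, generally need the dedicated tools found in the language-specific sections.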

Copy-paste prompts

Prompt 1
I need to add sentiment analysis to my Python app. What libraries does awesome-nlp recommend?
Prompt 2
Show me the datasets section from awesome-nlp for training a machine translation model.
Prompt 3
Which NLP libraries in awesome-nlp support both JavaScript and Python?
Prompt 4
I'm building a named entity recognition system. What papers and datasets does awesome-nlp link to?
Prompt 5
What language-specific NLP resources does awesome-nlp have for Arabic and Chinese?

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.