explaingit

sebastianruder/nlp-progress

22,968 · Python · Audience · researcher · Complexity · 1/5 · Stale · License · Setup · easy

TLDR

A community-maintained reference tracking the best-known results and benchmarks across Natural Language Processing tasks such as translation, question answering, and sentiment analysis.

Mindmap

mindmap
  root((repo))
    What it does
      Tracks state of art
      Lists benchmarks
      Records best scores
      Covers many tasks
    NLP Tasks
      Translation
      Question answering
      Named entity recognition
      Summarization
      Sentiment analysis
    Languages covered
      English
      Chinese
      Spanish
      French
      Hindi
    Use cases
      Choose baseline model
      Understand task scope
      Find improvement gaps
      Compare approaches

Things people build with this

USE CASE 1

Find the best-performing model for a specific NLP task, such as machine translation or question answering, to use as a baseline.

USE CASE 2

Identify benchmark datasets and evaluation metrics for an NLP problem you're working on.

USE CASE 3

Discover how much room for improvement exists in a particular language processing task.

USE CASE 4

Compare different approaches and models to decide which direction to pursue for a new NLP project.

Tech stack

Python

Getting it running

Difficulty · easy
Time to first run · 5 min
Use freely for any purpose including commercial, as long as you keep the copyright notice.

In plain English

This repository is a community-maintained reference that tracks the best-known results in Natural Language Processing (NLP), the field of AI concerned with understanding and generating human language. NLP covers many specific tasks: translating text between languages, answering questions, detecting who or what is mentioned in text, summarizing documents, recognizing speech, analyzing sentiment, and dozens more.

For each task, the repository describes what the task involves, lists the standard benchmark datasets used to evaluate models, and records the best scores achieved by published research. The best published result on a benchmark is called the "state of the art" (SOTA). Coverage spans multiple languages, including English, Chinese, Vietnamese, Hindi, French, Spanish, Korean, and others.

You would use this if you are an AI researcher or engineer who wants to understand which problems exist in NLP, which datasets are used to test solutions, and how well current methods perform. It serves as a starting point for choosing an approach or model to build on in a new NLP project, or for gauging how much room for improvement remains in a given task. This is a reading and reference resource, not runnable software. Contributions from the community are welcome.
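Because the repository is a reference rather than runnable software, one way to work with it programmatically is to parse a results table and pick the top entry. The sketch below assumes results are laid out as Markdown pipe tables with one row per model. That layout, the column names, the helper functions `parse_markdown_table` and `best_row`, and every model name and score in `SAMPLE_TABLE` are hypothetical placeholders, not data or code from the repository.

```python
# Hypothetical results table: names and scores are placeholders,
# NOT real entries from nlp-progress.
SAMPLE_TABLE = """\
| Model | F1 | Paper / Source |
| ----- | -- | -------------- |
| Model A | 91.2 | Placeholder A |
| Model B | 93.5 | Placeholder B |
| Model C | 92.0 | Placeholder C |
"""


def _split_row(line):
    """Split one pipe-delimited table line into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_markdown_table(text):
    """Parse a pipe-delimited Markdown table into a list of row dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip().startswith("|")]
    if len(lines) < 2:
        return []
    header = _split_row(lines[0])
    rows = []
    for line in lines[2:]:  # skip the header row and the separator row
        cells = _split_row(line)
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def best_row(rows, metric):
    """Return the row with the highest numeric value for `metric`, or None."""
    scored = []
    for row in rows:
        try:
            scored.append((float(row[metric]), row))
        except (KeyError, ValueError):
            continue  # metric column missing or not a plain number
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    print(best_row(parse_markdown_table(SAMPLE_TABLE), "F1"))
```

The sketch treats a higher score as better, which fits metrics such as F1 or accuracy but not error rates or perplexity, so the comparison would need to be flipped for those. As the page footer advises, verify any extracted numbers against the repository itself.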

Copy-paste prompts

Prompt 1
Show me the state-of-the-art results for named entity recognition on English datasets from the nlp-progress repository.
Prompt 2
What are the standard benchmark datasets used to evaluate machine translation models according to nlp-progress?
Prompt 3
Find the best-performing models for sentiment analysis tasks and their scores from the nlp-progress reference.
Prompt 4
List the NLP tasks covered in nlp-progress and identify which ones have the largest gap between current performance and human performance.
Prompt 5
What languages does nlp-progress track, and which language has the most NLP tasks with published benchmarks?

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.