nlp-tutorial is a learning resource for people studying natural language processing, the branch of machine learning that deals with text and language. The README calls it a tutorial for NLP learners using PyTorch, with the deliberate constraint that most of the models are written in fewer than 100 lines of code, comments and blank lines excluded. The point is to make the core idea of each model easy to read.

The repository is organized as a curriculum that walks through the history of NLP architectures. It starts with basic word-embedding models: NNLM, which predicts the next word in a sentence and is tied to Bengio's 2003 paper; Word2Vec skip-gram, which learns word vectors and visualizes them; and FastText, used here for sentence classification. From there it moves to a convolutional approach with TextCNN for binary sentiment classification. The next section covers recurrent networks: TextRNN for predicting the next step in a sequence, TextLSTM for character-level autocomplete, and a bi-directional LSTM for predicting the next word in long sentences.

The attention section adds Seq2Seq for word-level changes, Seq2Seq with attention for translation, and Bi-LSTM with attention for sentiment classification. The last section covers transformer-based models: the original Transformer from the 2017 Attention Is All You Need paper, used for translation, and BERT from 2018, used for next-sentence classification and masked-token prediction.

Each entry links to the source paper and to a Google Colab notebook so a reader can run the code in a browser without installing anything. An older TensorFlow v1 version of the same models is kept in an archive folder, but the README states that only PyTorch 1.0 or higher is supported going forward, on Python 3.5 or newer. The author is Tae Hwan Jung (graykode), with an acknowledgement to mojitok for an NLP research internship.
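
To give a feel for the Word2Vec skip-gram entry in the embedding section above, here is a minimal sketch of how skip-gram training pairs can be built from a tokenized sentence. It is not taken from the repository: the function name `skipgram_pairs`, the `window` parameter, and the toy sentence are all assumptions made for illustration, and the sketch uses plain Python rather than PyTorch so it runs with no dependencies.

```python
def skipgram_pairs(tokens, window=1):
    """Build (center, context) pairs for skip-gram training.

    Hypothetical helper for illustration only. Each token is paired
    with every neighbour at most `window` positions away.
    """
    pairs = []
    for i, center in enumerate(tokens):
        # Clamp the context window to the sentence boundaries.
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is never its own context
                pairs.append((center, tokens[j]))
    return pairs


# Toy sentence, assumed for the example.
sentence = "apple banana fruit".split()
pairs = skipgram_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

In a skip-gram model, each pair would typically become one training example: the center word is the input and the context word is the prediction target. A wider `window` produces more pairs per sentence, which gives each word more context at the cost of more training data to process.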
Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.