google-research/bert

Analysis updated 2026-05-18

★ 40,001 | Python | Audience · researcher | Complexity · 3/5 | License | Setup · moderate

Mindmap

mindmap
  root((repo))
    What it does
      Bidirectional text understanding
      Pre-trained language model
      Fine-tune for tasks
    How it works
      Masked word prediction
      Sentence relationship learning
      Transfer to new tasks
    Use cases
      Question answering
      Sentiment analysis
      Text classification
    Tech stack
      Python
      TensorFlow
    Models included
      BERT-Base
      BERT-Large
      Multilingual variants
    Audience
      NLP researchers
      Practitioners


What do people build with it?

USE CASE 1

Fine-tune BERT on your own text classification dataset to categorize customer feedback or product reviews.

USE CASE 2

Build a question-answering system by fine-tuning BERT on labeled question-answer pairs from your domain.

USE CASE 3

Analyze sentiment in social media posts or customer comments by adapting BERT to your specific sentiment labels.

USE CASE 4

Extract named entities or perform other NLP tasks by fine-tuning the pre-trained model on your labeled text data.
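The use cases above all start by packing text into a fixed-length input. The following is a minimal sketch of that step in plain Python, not the repository's own code. The function name `pack_inputs` and the whitespace-split tokens are hypothetical stand-ins chosen for illustration; a real run would use WordPiece tokens and vocabulary ids. The `[CLS]`, `[SEP]`, and `[PAD]` markers and the segment ids follow the usual BERT input convention.

```python
def pack_inputs(tokens_a, tokens_b=None, max_seq_length=16):
    """Pack one text (or a text pair) into BERT-style inputs.

    Hypothetical helper for illustration: returns padded tokens, an input
    mask (1 = real token, 0 = padding), and segment ids (0 = first text,
    1 = second text).
    """
    tokens = ["[CLS]"] + list(tokens_a) + ["[SEP]"]
    segment_ids = [0] * len(tokens)

    if tokens_b:
        # The second text and its closing [SEP] belong to segment 1.
        tokens += list(tokens_b) + ["[SEP]"]
        segment_ids += [1] * (len(tokens_b) + 1)

    if len(tokens) > max_seq_length:
        raise ValueError("input is longer than max_seq_length; truncate it first")

    # Real tokens are marked 1; padding added below is marked 0.
    input_mask = [1] * len(tokens)
    padding = max_seq_length - len(tokens)
    tokens += ["[PAD]"] * padding
    input_mask += [0] * padding
    segment_ids += [0] * padding
    return tokens, input_mask, segment_ids


# Single text, as in classification or sentiment analysis.
single = pack_inputs("great phone".split(), max_seq_length=6)

# Text pair, as in question answering (question first, passage second).
pair = pack_inputs("who wrote it".split(), "alice did".split(), max_seq_length=8)
```

A single-text task uses only segment 0, while a pair task such as question answering marks the second text with segment 1 so the model can tell the two apart.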

What is it built with?

Python, TensorFlow

How does it compare?

	google-research/bert	agno-agi/agno	vnpy/vnpy
Stars	40,001	39,947	40,156
Language	Python	Python	Python
Setup difficulty	moderate	moderate	moderate
Complexity	3/5	4/5	4/5
Audience	researcher	developer	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate | Time to first run · 30 min

Requires installing TensorFlow and downloading the pre-trained BERT weights. A GPU is optional but recommended for inference speed.

Use freely for any purpose, including commercial use. Keep the license notice and state any changes you make. The license also includes a patent grant.

In plain English

BERT stands for Bidirectional Encoder Representations from Transformers. This repository is Google Research's official release of the TensorFlow code and pre-trained model weights for BERT, a natural language processing model that changed how machines understand text.

The problem BERT solved is that earlier text models read sentences either left to right or right to left, missing the full context of a word in relation to everything around it. BERT reads the entire sentence in both directions simultaneously, giving it a much richer understanding of what each word means in context.

BERT was pre-trained on a massive amount of text using two tasks: predicting randomly masked words in a sentence (which forces the model to use context from both sides), and predicting whether one sentence actually follows another in the source text. After pre-training, BERT can be fine-tuned on a specific task, such as question answering, sentiment analysis, or text classification, by training it further on a smaller labeled dataset for that task. This lets a single large pre-trained model be adapted to many different language understanding tasks with relatively little additional data.

This repository provides the pre-trained BERT-Base and BERT-Large models in both cased and uncased variants, as well as multilingual models, plus the code to fine-tune them on downstream tasks. You would use this repository if you are an NLP researcher or practitioner who wants to fine-tune BERT on your own text classification, question answering, or other language tasks, or if you want to study the original implementation. The tech stack is Python with TensorFlow.
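To make the masked word prediction task more concrete, here is a minimal sketch in plain Python. It is not the repository's implementation. The function name `mask_tokens`, the toy vocabulary, and the fixed seed are hypothetical choices for illustration. The 15% masking rate and the 80/10/10 split between `[MASK]`, a random token, and the unchanged token follow the scheme described in the BERT paper.

```python
import random

MASK_TOKEN = "[MASK]"
SPECIAL_TOKENS = {"[CLS]", "[SEP]"}


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Pick tokens for the model to predict and corrupt them.

    Hypothetical helper for illustration. Returns (masked_tokens, labels),
    where labels[i] holds the original token at each chosen position and
    None everywhere else.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)

    # Special markers are never selected for prediction.
    candidates = [i for i, tok in enumerate(tokens) if tok not in SPECIAL_TOKENS]
    if not candidates:
        return masked, labels

    # Choose about mask_prob of the positions, but at least one.
    num_to_mask = min(len(candidates), max(1, round(len(candidates) * mask_prob)))
    for i in rng.sample(candidates, num_to_mask):
        labels[i] = tokens[i]
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_TOKEN          # 80%: replace with [MASK]
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)   # 10%: replace with a random token
        # remaining 10%: leave the token unchanged

    return masked, labels


tokens = ["[CLS]", "the", "cat", "sat", "on", "the", "mat", "[SEP]"]
toy_vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
masked, labels = mask_tokens(tokens, toy_vocab)
print(masked)
print(labels)
```

During pre-training the model sees `masked` and is scored only on the positions where `labels` is not `None`, so it has to rely on the words to both the left and the right of each hidden position.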

Copy-paste prompts

Prompt 1

Show me how to load a pre-trained BERT model from this repo and fine-tune it on a custom text classification dataset.

Prompt 2

I have a question-answering dataset. Walk me through the code in this repo to fine-tune BERT for QA tasks.

Prompt 3

Explain the masked language modeling task that BERT uses during pre-training and why it helps the model understand context.

Prompt 4

How do I use the multilingual BERT models from this repo to classify text in languages other than English?

Prompt 5

Show me the code to tokenize text and prepare it as input for BERT fine-tuning on a sentiment analysis task.

Frequently asked questions

What is bert?

Google's BERT model reads text in both directions at once to understand word meaning in context, and can then be fine-tuned on specific language tasks like question answering or sentiment analysis.

What language is bert written in?

Mainly Python. The stack also includes TensorFlow.

What license does bert use?

Use freely for any purpose, including commercial use. Keep the license notice and state any changes you make. The license also includes a patent grant.

How hard is bert to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is bert for?

Mainly researchers.


Verify against the repo before relying on details.