flagopen/flagembedding

★ 11,674 | Python | Audience · developer | Complexity · 3/5 | License · MIT | Setup · moderate

Mindmap

mindmap
  root((FlagEmbedding))
    What it does
      Text embeddings
      Semantic search
      Document reranking
    Model family
      BGE multilingual
      Long document models
      Lightweight variants
    Use cases
      RAG pipelines
      Vector search
      Cross-lingual retrieval
    Setup
      Python pip install
      Hugging Face models
      GPU recommended


Things people build with this

USE CASE 1

Add semantic search to a document collection so users can ask questions and retrieve the most relevant passages.
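
A minimal sketch of the ranking step behind this use case, assuming the passages and the query have already been converted to vectors. The three-dimensional vectors and the helper names `cosine_similarity` and `top_k` are hypothetical stand-ins; real BGE embeddings have far more dimensions and would come from the model, not be typed in by hand.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k(query_vec, passage_vecs, k=3):
    """Return (index, score) pairs for the k passages most similar to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors standing in for real embeddings (hypothetical values).
passages = ["resetting a password", "quarterly sales report", "recovering account access"]
passage_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.8, 0.3, 0.1]]
query_vec = [1.0, 0.2, 0.0]  # imagined embedding of "how do I log back in?"

results = top_k(query_vec, passage_vecs, k=2)
for index, score in results:
    print(f"{score:.3f}  {passages[index]}")
```

For large collections a vector index would usually replace the linear scan, but the scoring idea stays the same.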

USE CASE 2

Build a RAG pipeline where an AI assistant first searches a knowledge base with BGE embeddings, then uses the results as context when generating an answer.
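
As an illustration of the "results as context" step, here is a small sketch that assembles retrieved passages and a question into a single prompt. The function name `build_rag_prompt`, the prompt wording, and the sample passages are assumptions made for this example; the retrieval itself and the call to a chat model are left out.

```python
def build_rag_prompt(question, retrieved_passages, max_passages=3):
    """Combine retrieved passages and the user's question into one prompt string."""
    context_lines = [
        f"[{n}] {text}"
        for n, text in enumerate(retrieved_passages[:max_passages], start=1)
    ]
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Passages as they might come back from an embedding search, best match first.
retrieved = [
    "BGE models convert text into numerical representations.",
    "Reranker models reorder search results by relevance.",
]
prompt = build_rag_prompt("What do BGE models do?", retrieved)
print(prompt)  # this string would be sent to whichever chat model you use
```

Numbering the passages makes it easier to ask the model to cite which passage supported its answer.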

USE CASE 3

Rerank initial search results from a vector database using a BGE reranker to surface the most relevant documents at the top.
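
The two-stage idea can be sketched as follows, assuming a first-stage search has already returned a handful of candidates. The `rerank` helper and the word-overlap scorer are hypothetical; the scorer is only a toy placeholder for the relevance score a BGE reranker model would produce for each query and passage pair.

```python
def rerank(query, candidates, score_fn, top_n=None):
    """Sort candidates by score_fn(query, candidate), highest score first."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    ranked = [text for _, text in scored]
    return ranked if top_n is None else ranked[:top_n]

def word_overlap_score(query, passage):
    """Toy relevance score: fraction of query words that appear in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)

# Candidates in the order a first-stage vector search might have returned them.
candidates = [
    "shipping times for international orders",
    "how to reset your account password",
    "password rules for new accounts",
]
reranked = rerank("reset account password", candidates, word_overlap_score, top_n=2)
print(reranked)
```

Passing the scorer in as a function keeps the sorting logic unchanged when the toy scorer is swapped for a real model.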

USE CASE 4

Generate multilingual embeddings across 100+ languages for a cross-lingual search or retrieval system.

Tech stack

Python · PyTorch · Hugging Face · pip

Getting it running

Difficulty · moderate | Time to first run · 30 min

Requires Python with pip. A GPU is strongly recommended for production workloads. Models download automatically from Hugging Face on first use.

Use freely for any purpose, including commercial and research use, as long as you keep the copyright notice (MIT).

In plain English

FlagEmbedding is a Python toolkit for working with text embeddings and building search and retrieval systems. It is developed by BAAI, the Beijing Academy of Artificial Intelligence. The main product line within this toolkit is a family of models called BGE (short for BAAI General Embedding), which are AI models that convert text into numerical representations. Those numerical representations can then be compared to find documents or passages that are similar in meaning to a query.

The practical use case this toolkit targets is retrieval-augmented generation, often called RAG. In a RAG setup, an AI assistant answers questions by first searching a collection of documents for relevant passages, then using those passages as context when generating an answer. FlagEmbedding provides the search and retrieval layer of that pipeline: the embedding models that find the right documents, and reranker models that reorder the results to surface the most relevant ones first.

The BGE model family has grown considerably since the project launched. It now includes models that handle over 100 languages, models that can process long documents (up to 8,192 tokens of input), models that work with both text and images together, and lightweight variants that use fewer computing resources. There are also reranker models that take the initial search results and score them more carefully to improve the final ranking.

Installing the package requires Python and is done through pip, the standard Python package installer. Once installed, generating an embedding for a piece of text takes a few lines of code. The toolkit is also compatible with popular AI infrastructure tools. The BGE models are available for download from Hugging Face, a platform where researchers share AI models. The project maintains a leaderboard and benchmark results showing how the models compare to others on standard retrieval evaluation tests.

The repository is licensed under MIT, which permits free use for both research and commercial purposes.

Copy-paste prompts

Prompt 1

Show me how to use FlagEmbedding's BGE model to embed a list of documents and a query, then find the top 3 most similar documents using cosine similarity in Python.

Prompt 2

Build a simple RAG pipeline using FlagEmbedding for retrieval and any chat LLM for generation. Show the full Python code, from embedding documents to answering a question.

Prompt 3

How do I use the BGE reranker from FlagEmbedding to reorder search results returned by a vector database like FAISS or Chroma?

Prompt 4

Which BGE model should I pick for a production RAG app with limited GPU memory? Compare BGE-M3 and a lightweight BGE variant for embedding long documents.


Verify against the repo before relying on details.