explaingit

paddlepaddle/ernie

7,720
Python
Audience · researcher
Complexity · 4/5
License
Setup · hard

TLDR

Baidu's open-source ERNIE 4.5 family of large AI models for text and image understanding, ranging from a tiny 0.3B model to a 424B model, with tools for fine-tuning on your own data and running cookbooks for common use cases.

Mindmap

mindmap
  root((ERNIE 4.5))
    What it is
      Baidu open LLM family
      Text and vision models
    Model Sizes
      0.3B small
      Mid-range options
      424B largest
    Capabilities
      Text generation
      Image understanding
      128K context window
    Tooling
      ERNIEKit fine-tuning
      FastDeploy serving
      Jupyter cookbooks
    Platforms
      Hugging Face
      Baidu AI Studio
      PaddlePaddle

Things people build with this

USE CASE 1

Build a chatbot powered by ERNIE 4.5 that can answer questions over a private document knowledge base.

USE CASE 2

Fine-tune a smaller ERNIE model on your own dataset to create a specialized text-classification or extraction tool.

USE CASE 3

Add web search capability to a conversational AI system using the provided cookbook as a starting point.

USE CASE 4

Run optical character recognition on document images using ERNIE's multimodal vision-language model variants.
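
Use case 1 depends on a retrieval step: picking the passages most relevant to a question before handing them to the model. The following is a minimal sketch of that step only, assuming plain word-overlap (cosine) scoring rather than the embedding search a production system would likely use. The function names `tokenize`, `cosine`, `retrieve`, and `build_prompt` are hypothetical and are not taken from the ERNIE cookbooks; the call to the model itself is left out.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents ranked by similarity to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Assemble a prompt that grounds the answer in the retrieved passages."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical knowledge base, for illustration only.
docs = [
    "Refunds are processed within 14 days of the return request.",
    "The office is closed on public holidays.",
    "Shipping to Europe takes five business days.",
]
print(build_prompt("How long do refunds take?", docs, top_k=1))
```

The resulting prompt string would then be sent to whichever ERNIE 4.5 model is being served; how that call looks depends on the serving setup and is not shown here.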

Tech stack

Python, PaddlePaddle, Hugging Face, pip

Getting it running

Difficulty · hard Time to first run · 1h+

Large models require significant GPU memory: the 424B model needs multi-GPU infrastructure, though the 0.3B version runs on modest hardware.
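
To put the hardware gap in numbers, here is a back-of-the-envelope estimate of the memory needed just to hold the weights: parameter count times bytes per parameter. It is a rough sketch, assuming every parameter is stored at the stated precision and ignoring activations, the KV cache, and framework overhead, so real requirements will be higher. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    params_billions * 1e9 parameters * bytes_per_param / 1e9 bytes per GB
    simplifies to params_billions * bytes_per_param.
    """
    return params_billions * bytes_per_param


# Model sizes quoted in this summary; precisions are common storage widths.
for name, size in [("0.3B", 0.3), ("424B", 424.0)]:
    for label, nbytes in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
        gb = weight_memory_gb(size, nbytes)
        print(f"{name} at {label}: ~{gb:,.1f} GB")
```

At 16-bit precision this gives roughly 0.6 GB for the 0.3B model and roughly 848 GB for the 424B model, which is why the largest model has to be spread across multiple GPUs.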

Released under an open-source license; model weights are publicly available for download and use. The specific license terms are not named in the description.

In plain English

This repository is the official home for ERNIE 4.5, a family of large AI language and vision models developed by Baidu. These models can understand and generate text, and the vision variants can also process images and video. ERNIE 4.5 competes with GPT-4 and other large language models. Baidu has released the model weights publicly under an open-source license so anyone can download and run them.

The family includes 10 different model sizes and types. Some are text-only and some are multimodal, meaning they accept images or video alongside text. The largest model has 424 billion total parameters, which is an indicator of raw capacity, while a smaller 0.3 billion parameter version is available for use on less powerful hardware. The models support a context window of 128,000 tokens, meaning they can process very long documents in one go.

The repository also includes ERNIEKit, a toolkit for training and fine-tuning ERNIE models on your own data. Fine-tuning means taking a pre-trained model and continuing to train it on a smaller dataset to specialize it for a particular task. ERNIEKit supports several training approaches, including supervised fine-tuning, preference optimization, and quantization-aware training. A separate project called FastDeploy handles running the models in production with high performance.

A set of cookbooks is provided showing practical examples: building a chatbot, adding web search to a conversation, building a question-answering system from a private knowledge base, and recognizing text in documents. These are interactive notebooks designed to walk through each use case step by step. All models are also available on Hugging Face and Baidu's AI Studio platform.
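
As a rough illustration of what a 128,000-token context window means in practice, the sketch below checks whether a document is likely to fit and splits it if not. It assumes about 4 characters per token, which is only a rule of thumb for English text and not a property of ERNIE's tokenizer; a real check would count tokens with the model's own tokenizer. All names (`estimate_tokens`, `fits_in_context`, `split_to_fit`) and the output reserve of 2,000 tokens are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # figure quoted in the summary above
CHARS_PER_TOKEN = 4              # assumption: rough heuristic, not ERNIE's tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 2_000) -> bool:
    """True if the text plus a reserve for the reply fits in the window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


def split_to_fit(text: str, reserved_for_output: int = 2_000) -> list[str]:
    """Cut the text into consecutive pieces that each fit in the window."""
    budget_chars = (CONTEXT_WINDOW_TOKENS - reserved_for_output) * CHARS_PER_TOKEN
    return [text[i:i + budget_chars] for i in range(0, len(text), budget_chars)]


long_document = "x" * 600_000  # stand-in for a very long document
print(fits_in_context(long_document))
print(len(split_to_fit(long_document)))
```

Splitting on raw character offsets can cut a sentence in half; a more careful version would break on paragraph or section boundaries.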

Copy-paste prompts

Prompt 1
Using the ERNIE 4.5 cookbook, show me the Python code to build a retrieval-augmented question-answering system from a folder of PDF files using ERNIEKit.
Prompt 2
I want to fine-tune the ERNIE 4.5 0.3B model on my own text classification dataset. Walk me through the ERNIEKit supervised fine-tuning command and the data format it expects.
Prompt 3
Show me how to load the ERNIE 4.5 multimodal model from Hugging Face and run inference on a local image file with a text question about what's in the image.
Prompt 4
I need to run ERNIE 4.5 in production. What does FastDeploy provide and how do I use it to serve the model as an HTTP API endpoint?

Verify against the repo before relying on details.