explaingit

zai-org/chatglm3

13,699 | Python | Audience · developer | Complexity · 4/5 | License | Setup · hard

TLDR

ChatGLM3 is an open-source AI chat model that understands both Chinese and English, runs on consumer hardware, and supports tool calling, code execution, and long documents up to 128K tokens.

Mindmap

mindmap
  root((ChatGLM3))
    What it does
      Bilingual chat AI
      Tool calling
      Code execution
    Model variants
      Standard 8K context
      Long context 32K
      Extra long 128K
    Tech Stack
      Python
      PyTorch
      HuggingFace
    Audience
      AI researchers
      App developers
      Chinese language users

Things people build with this

USE CASE 1

Run a local Chinese-English chatbot on your own machine without sending data to a cloud service

USE CASE 2

Build an AI agent that can call external APIs and execute code steps automatically using the built-in tool-calling mode

USE CASE 3

Summarize very long Chinese or English documents using the 32K or 128K context window variants
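Choosing between the three context sizes comes down to token arithmetic: the prompt plus the expected reply has to fit inside the window. The sketch below illustrates that check. The limits are the rounded figures quoted on this page, the function name and `reply_budget` parameter are made up for illustration, and the model IDs are assumptions about how the weights are named on HuggingFace, so confirm them against the repo before use.

```python
# Context limits as quoted on this page (rounded token counts).
# The model IDs are assumed Hub names, not confirmed by this summary.
VARIANTS = [
    (8_000, "THUDM/chatglm3-6b"),
    (32_000, "THUDM/chatglm3-6b-32k"),
    (128_000, "THUDM/chatglm3-6b-128k"),
]


def pick_variant(prompt_tokens: int, reply_budget: int = 1_000) -> str:
    """Return the smallest variant whose window fits prompt plus reply."""
    needed = prompt_tokens + reply_budget
    for limit, model_id in VARIANTS:
        if needed <= limit:
            return model_id
    raise ValueError(f"{needed} tokens exceeds the largest listed window")
```

Picking the smallest variant that fits is a design choice in this sketch, not a recommendation from the README; longer-context variants may cost more memory and time per request.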

Tech stack

Python, PyTorch, HuggingFace, Transformers

Getting it running

Difficulty · hard | Time to first run · 1h+

Requires downloading multi-GB model weights from HuggingFace or ModelScope. A GPU is strongly recommended for reasonable response speed.

Free to use for academic research. Commercial use is allowed, but requires filling out a registration form with ZhipuAI before deploying.
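Once the weights are available, loading and a first chat turn might look roughly like the sketch below. It assumes the model is loaded through the Transformers `AutoModel` and `AutoTokenizer` classes with `trust_remote_code=True` and that the loaded model exposes a `chat` method, which is how I recall the repo's quick-start working. The model ID and both helper names (`load_chatglm3`, `ask`) are illustrative assumptions.

```python
def load_chatglm3(model_id: str = "THUDM/chatglm3-6b", use_gpu: bool = True):
    """Load tokenizer and model; the first call downloads multi-GB weights."""
    # Imported lazily so ask() below stays usable without transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
    # Half precision on GPU; full precision on CPU (slow and memory-hungry).
    model = model.half().cuda() if use_gpu else model.float()
    return tokenizer, model.eval()


def ask(model, tokenizer, question: str, history=None):
    """Send one turn and return (reply, updated_history)."""
    reply, history = model.chat(tokenizer, question, history=history or [])
    return reply, history
```

A typical session would call `tokenizer, model = load_chatglm3()` once, then `reply, history = ask(model, tokenizer, "你好")`, passing `history` back in on later turns to keep the conversation going.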

In plain English

ChatGLM3 is an open-source conversational AI model that speaks both Chinese and English. It was built jointly by ZhipuAI and the KEG Lab at Tsinghua University. The main model in the series, ChatGLM3-6B, has 6 billion parameters and is designed to run on consumer hardware rather than requiring large data center infrastructure.

The model goes beyond simple back-and-forth chat. It natively supports tool calling (where the model can invoke external functions or APIs on your behalf), code execution through a built-in interpreter, and multi-step agent tasks where it reasons through a problem in stages. The README describes a revised prompt format that makes these capabilities work without extra configuration.

Four variants are available for download. The standard ChatGLM3-6B handles context windows up to 8,000 tokens, which is enough for most conversations. ChatGLM3-6B-32K extends that to 32,000 tokens, which helps with longer documents, and ChatGLM3-6B-128K pushes further still for very long-form reading tasks. A separate base model (without the chat fine-tuning) is also released for researchers who want to build on top of it.

Benchmark scores show this family substantially outperformed comparably sized models on math, reasoning, and coding tests at the time of release. Long-document tasks showed average gains of over 50 percent compared to the previous generation.

The weights are free to use for academic research. Commercial use is allowed after filling out a registration form. The README notes that the newer GLM-4 series has since been released and improves further on these results, so users who need the best current performance are pointed toward that newer family.

To get started, you clone the repository, install the Python dependencies, and then download the model weights from HuggingFace or ModelScope. A combined demo lets you switch between chat mode, tool-use mode, and code-interpreter mode in one interface. Third-party projects for faster inference on laptops, TPUs, and NVIDIA GPUs are also listed in the README.
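The tool-calling flow described above can be pictured as a small dispatch step on the application side: the model emits a request naming a tool and its arguments, your code runs the matching function, and the result goes back to the model. The sketch below covers only that dispatch step. The `{"name": ..., "parameters": ...}` request shape is an assumption based on my recollection of the repo's tool-use demo, and `get_weather`, `TOOLS`, `REGISTRY`, and `run_tool_call` are hypothetical names invented for this example.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical tool; a real one would query a weather API."""
    return {"city": city, "forecast": "sunny"}


# Tool descriptions to hand to the model (assumed schema, verify in repo).
TOOLS = [{
    "name": "get_weather",
    "description": "Look up the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"description": "City name"}},
        "required": ["city"],
    },
}]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_call(call: dict, registry: dict) -> str:
    """Execute a {'name', 'parameters'} request and return a JSON result."""
    func = registry.get(call.get("name"))
    if func is None:
        return json.dumps({"error": f"unknown tool: {call.get('name')}"})
    return json.dumps(func(**call.get("parameters", {})), ensure_ascii=False)
```

With a loaded model, the tool descriptions would be supplied in a system turn and the JSON string returned by `run_tool_call` would be sent back as the tool's observation. The exact message format for both steps should be taken from the repo's tool-use documentation, not from this sketch.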

Copy-paste prompts

Prompt 1
Write a Python script that loads ChatGLM3-6B from HuggingFace and answers questions about a long uploaded document
Prompt 2
Show me how to set up ChatGLM3's tool-calling mode so the model can look up real-time weather by invoking a function I define
Prompt 3
Write a Python script using ChatGLM3-6B to build a simple bilingual customer support chatbot that replies in the same language the user wrote in
Prompt 4
How do I run ChatGLM3-6B-32K to summarize a long meeting transcript? Show me the full inference code from loading to output

Verify against the repo before relying on details.