Find and download a pretrained Chinese BERT or MacBERT model for text classification or another NLP task
Discover domain-specific Chinese LLMs fine-tuned for finance, medicine, or law
Compare Chinese NLP models by size, architecture, and task type in one place without hunting across sites
Find Chinese instruction datasets and evaluation benchmarks to test model performance
This repository is a curated index of pretrained Chinese natural language processing models. It collects links, descriptions, and download information for publicly released models that work with Chinese text, organized so researchers and developers can find what they need without having to track down each model individually.

The index is split into two broad sections. The first covers large language models, the kind used for chat, reasoning, and general-purpose text generation. These are grouped by type: general-purpose base models with more than 7 billion parameters, domain-specific base models for fields like finance, medicine, and law, general chat models, domain-specific chat models, multimodal models that handle both images and text, and reasoning-focused models for mathematics and logic. Each entry lists the model name, size, release date, supported languages, architecture type, and links to the HuggingFace repository and original project.

The second section covers older pretrained models in the BERT family and related architectures. These are organized into NLU models for understanding tasks like classification and question answering, NLG models for generation tasks, combined NLU-NLG models, and multimodal models. The 29 NLU entries include Chinese versions of BERT, RoBERTa, ALBERT, ERNIE, MacBERT, and ELECTRA. The 18 NLG entries include GPT, T5, BART, and CPM variants.

The repository also links to evaluation benchmarks for comparing models, open-source model platforms, Chinese instruction datasets, and embedding models. The full README is longer than what was shown.
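The per-entry fields described above (name, size, release date, supported languages, architecture type, and two links) can be pictured as a small record that is easy to filter. The following is a minimal sketch only: the repository itself is a README, not a Python package, so `ModelEntry`, `find_models`, the category labels, and every entry in `EXAMPLE_ENTRIES` are hypothetical placeholders rather than real models or a real API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ModelEntry:
    """One row of the index. Field names are illustrative assumptions."""
    name: str
    category: str                 # e.g. "general-base", "domain-chat" (made-up labels)
    size_b: float                 # parameter count, in billions
    release_date: str
    languages: Tuple[str, ...]
    architecture: str
    hf_url: str                   # link to the HuggingFace repository
    project_url: str              # link to the original project
    domain: Optional[str] = None  # e.g. "finance"; None for general-purpose models


def find_models(
    entries: Iterable[ModelEntry],
    category: Optional[str] = None,
    domain: Optional[str] = None,
    max_size_b: Optional[float] = None,
) -> List[ModelEntry]:
    """Return entries matching every filter that was given, smallest first."""
    matches = []
    for entry in entries:
        if category is not None and entry.category != category:
            continue
        if domain is not None and entry.domain != domain:
            continue
        if max_size_b is not None and entry.size_b > max_size_b:
            continue
        matches.append(entry)
    return sorted(matches, key=lambda e: e.size_b)


# Placeholder rows only: these are NOT real models, dates, or links.
EXAMPLE_ENTRIES = [
    ModelEntry("placeholder-base-13b", "general-base", 13.0, "2024-01",
               ("zh", "en"), "decoder-only",
               "https://example.invalid/hf/base", "https://example.invalid/base"),
    ModelEntry("placeholder-med-chat-13b", "domain-chat", 13.0, "2024-02",
               ("zh",), "decoder-only",
               "https://example.invalid/hf/med", "https://example.invalid/med",
               domain="medicine"),
    ModelEntry("placeholder-fin-chat-7b", "domain-chat", 7.0, "2024-03",
               ("zh",), "decoder-only",
               "https://example.invalid/hf/fin", "https://example.invalid/fin",
               domain="finance"),
]

if __name__ == "__main__":
    for model in find_models(EXAMPLE_ENTRIES, category="domain-chat"):
        print(model.name, model.size_b, model.domain)
```

Filtering on optional keyword arguments mirrors how the index is typically browsed: by section first, then by domain or size. Real entries would have to be copied from the README by hand or parsed from its tables.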
Verify against the repo before relying on details.