explaingit

tencent-hunyuan/hy-mt2

76 Python Audience · researcher Complexity · 4/5 Setup · hard

TLDR

Tencent's family of fast-thinking translation models (1.8B, 7B, 30B-A3B MoE) covering 33 languages, with on-device 1.25-bit quantization and the IFMTBench instruction-following benchmark.

Mindmap

mindmap
  root((Hy-MT2))
    Inputs
      Source text
      Target language
      Glossary terms
      Style or preferences
    Outputs
      Translated text
      Structured data translation
      Benchmark scores
    Use Cases
      Multilingual document translation
      On-device translation app
      Glossary-aware localization
      Translation evaluation
    Tech Stack
      Python
      PyTorch
      Hugging Face
      GGUF
      llama.cpp
      AngelSlim

Things people build with this

USE CASE 1

Run a self-hosted translation server across 33 languages with the 7B or 30B-A3B model

USE CASE 2

Ship on-device translation on a phone using the 440MB 1.25-bit GGUF build

USE CASE 3

Translate JSON or structured payloads while preserving keys, tags, and placeholders

USE CASE 4

Benchmark a translation system's instruction following with IFMTBench

Tech stack

Python · PyTorch · GGUF · llama.cpp · AngelSlim

Getting it running

Difficulty · hard Time to first run · 1h+

The larger 7B and 30B-A3B models need substantial GPU memory; the 1.8B GGUF build is the quickest path on a laptop.
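As a rough guide to what "substantial GPU memory" means, the sketch below estimates weight storage from parameter count and bits per weight. It assumes the nominal parameter counts implied by the model names, counts weights only (no KV cache, activations, or runtime overhead), and covers bit widths that may not all be published for every model, so treat it as back-of-envelope arithmetic rather than a measured requirement.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in gigabytes (10^9 bytes)."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Nominal parameter counts read off the model names (an assumption).
# For the MoE model, all 30B parameters are assumed to be resident in memory.
MODELS = {"1.8B": 1.8, "7B": 7.0, "30B-A3B": 30.0}

# Bit widths for FP16, FP8, 2-bit, and 1.25-bit storage.
BIT_WIDTHS = [16, 8, 2, 1.25]

for name, params in MODELS.items():
    sizes = ", ".join(
        f"{bits}-bit: {weight_size_gb(params, bits):.2f} GB" for bits in BIT_WIDTHS
    )
    print(f"{name:>8}  {sizes}")
```

Under these assumptions the 7B model needs about 14 GB for FP16 weights and the 30B-A3B model about 30 GB at FP8. The same arithmetic gives roughly 0.28 GB for 1.8B parameters at 1.25 bits, which is below the reported 440 MB file; the difference presumably comes from parts of the model stored at higher precision, though this summary does not say.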

In plain English

Hy-MT2 is Tencent's open-source release of a family of machine translation models. The README describes them as fast-thinking multilingual translation models, meaning they aim to give a direct answer quickly rather than write out long reasoning before replying. The family comes in three sizes: a 1.8 billion parameter model, a 7 billion parameter model, and a 30 billion parameter mixture-of-experts model labelled 30B-A3B. All three support translation between 33 languages and accept translation instructions in several languages.

A notable detail is the on-device build of the small model. Using a separate Tencent project called AngelSlim, the 1.8B model is compressed to 1.25-bit quantization, which shrinks the file to about 440 megabytes and makes it run roughly 1.5 times faster than the unquantized version. The repository's model list links to several formats on Hugging Face, including FP8 versions for fast servers and GGUF files in 2-bit and 1.25-bit variants for use with llama.cpp on local machines.

The README reports that the 7B and 30B-A3B models score higher than open-source competitors such as DeepSeek-V4-Pro and Kimi K2.6 in fast-thinking mode, and that the 1.8B model beats mainstream commercial translation APIs from Microsoft and Doubao on average. The detailed numbers and analysis are in an attached PDF report. Alongside the models, Tencent released a benchmark called IFMTBench for measuring how well a translation model follows instructions.

The README also lists prompt templates for typical translation scenarios: plain translation, translation with a glossary of preferred terms, translation in a specific style, personalised translation with extra user preferences, translation that must preserve delimiters exactly, and structured-data translation that touches only user-facing text in JSON or similar formats while leaving keys, code tags, and placeholders alone. Chinese and English versions of each prompt are shown side by side.

For people who do not want to call the models directly, the team publishes a Hy-MT2-Translator Skill on ClawHub and SkillHub. The project also announces a partnership with the WMT26 conference: teams using Hy-MT models in the general translation and video subtitle tasks can win special awards sponsored by Hunyuan. The repository contains a Chinese-language README as well.
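The structured-data scenario described above promises that keys, tags, and placeholders survive translation. A minimal sketch of how one might check that on a model's output follows. It is not part of the repository: the function name `structure_preserved` and the `PLACEHOLDER` pattern are hypothetical, and the pattern covers only a few common placeholder styles (`{name}`, `%s`, `%d`, and simple HTML-like tags).

```python
import re

# Hypothetical pattern: brace placeholders, %s / %d, and simple HTML-like tags.
PLACEHOLDER = re.compile(r"\{[^{}]*\}|%[sd]|</?[A-Za-z][^<>]*>")


def structure_preserved(source, translated) -> bool:
    """Return True if `translated` keeps the keys, shape, non-string values,
    and placeholders of `source`; only string contents may differ."""
    if isinstance(source, dict):
        return (
            isinstance(translated, dict)
            and source.keys() == translated.keys()
            and all(structure_preserved(source[k], translated[k]) for k in source)
        )
    if isinstance(source, list):
        return (
            isinstance(translated, list)
            and len(source) == len(translated)
            and all(structure_preserved(s, t) for s, t in zip(source, translated))
        )
    if isinstance(source, str):
        # Text may change, but the multiset of placeholders must not.
        return isinstance(translated, str) and sorted(
            PLACEHOLDER.findall(source)
        ) == sorted(PLACEHOLDER.findall(translated))
    # Numbers, booleans, and null must come back unchanged.
    return type(source) is type(translated) and source == translated


source = {"title": "Hello, {name}!", "count": 3, "tags": ["<b>New</b>"]}
good = {"title": "你好，{name}！", "count": 3, "tags": ["<b>新</b>"]}
bad_placeholder = {"title": "你好，{名字}！", "count": 3, "tags": ["<b>新</b>"]}
bad_key = {"标题": "你好，{name}！", "count": 3, "tags": ["<b>新</b>"]}

print(structure_preserved(source, good))
print(structure_preserved(source, bad_placeholder))
print(structure_preserved(source, bad_key))
```

A check like this would run after parsing both payloads with `json.loads`. It catches a translated key or an altered placeholder, but it says nothing about translation quality.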

Copy-paste prompts

Prompt 1
Set up Hy-MT2-1.8B-GGUF in llama.cpp and run a Chinese to English translation with the default prompt template
Prompt 2
Compare Hy-MT2-7B-FP8 to Hy-MT2-30B-A3B-FP8 on a sample of my own documents using vLLM
Prompt 3
Wire the Hy-MT2 terminology prompt template into a glossary-driven CAT-tool style script for product strings
Prompt 4
Quantize Hy-MT2-1.8B to 1.25-bit using AngelSlim and measure the speedup vs the FP16 build on my hardware
Prompt 5
Run the IFMTBench from Hy-MT2 against an OpenAI translation prompt and report the gap

Verify against the repo before relying on details.