tencent-hunyuan/hy-mt2

★ 76 | Python | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((Hy-MT2))
    Inputs
      Source text
      Target language
      Glossary terms
      Style or preferences
    Outputs
      Translated text
      Structured data translation
      Benchmark scores
    Use Cases
      Multilingual document translation
      On-device translation app
      Glossary-aware localization
      Translation evaluation
    Tech Stack
      Python
      PyTorch
      Hugging Face
      GGUF
      llama.cpp
      AngelSlim

Things people build with this

USE CASE 1

Run a self-hosted translation server across 33 languages with the 7B or 30B-A3B model

USE CASE 2

Ship on-device translation on a phone using the 440MB 1.25-bit GGUF build

USE CASE 3

Translate JSON or structured payloads while preserving keys, tags, and placeholders

USE CASE 4

Benchmark a translation system's instruction following with IFMTBench

Tech stack

Python, PyTorch, GGUF, llama.cpp, AngelSlim

Getting it running

Difficulty · hard | Time to first run · 1h+

The larger 7B and 30B-A3B models need substantial GPU memory; the 1.8B GGUF build is the quickest path on a laptop.

In plain English

Hy-MT2 is Tencent's open-source release of a family of machine translation models. The README describes them as fast-thinking multilingual translation models, meaning they aim to give a direct answer quickly rather than write out long reasoning before replying. The family comes in three sizes: a 1.8 billion parameter model, a 7 billion parameter model, and a 30 billion parameter mixture-of-experts model labelled 30B-A3B. All three support translation between 33 languages and accept translation instructions in several languages.

A notable detail is the on-device build of the small model. Using a separate Tencent project called AngelSlim, the 1.8B model is compressed to 1.25-bit quantization, which shrinks the file to about 440 megabytes and makes it run roughly 1.5 times faster than the unquantized version. The repository's model list links to several formats on Hugging Face, including FP8 versions for fast servers and GGUF files in 2-bit and 1.25-bit variants for use with llama.cpp on local machines.

The README reports that the 7B and 30B-A3B models score higher than open-source competitors such as DeepSeek-V4-Pro and Kimi K2.6 in fast-thinking mode, and that the 1.8B model beats mainstream commercial translation APIs from Microsoft and Doubao on average. The detailed numbers and analysis are in an attached PDF report. Alongside the models, Tencent released a benchmark called IFMTBench for measuring how well a translation model follows instructions.

The README also lists prompt templates for typical translation scenarios:

- plain translation
- translation with a glossary of preferred terms
- translation in a specific style
- personalised translation with extra user preferences
- translation that must preserve delimiters exactly
- structured-data translation that touches only user-facing text in JSON or similar formats while leaving keys, code tags, and placeholders alone

Chinese and English versions of each prompt are shown side by side.

For people who do not want to call the models directly, the team publishes a Hy-MT2-Translator Skill on ClawHub and SkillHub. The project also announces a partnership with the WMT26 conference: teams using Hy-MT models in the general translation and video subtitle tasks can win special awards sponsored by Hunyuan. The repository also contains a Chinese-language README.
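The structured-data scenario described above (translate only user-facing text, leave keys, tags, and placeholders alone) can also be enforced on the caller's side, regardless of which model does the translating. The following is a minimal sketch, not the repository's method: `translate_payload`, `protect`, `restore`, the `PLACEHOLDER` pattern, and the `__PH0__` token format are all hypothetical names chosen for illustration, and `translate` stands in for whatever function actually calls a Hy-MT2 model.

```python
import re
from typing import Any, Callable, List, Tuple

# Assumed placeholder syntax (hypothetical): {{name}}, {name}, %s, %d, and HTML/XML-like tags.
PLACEHOLDER = re.compile(r"\{\{.*?\}\}|\{[^{}]*\}|%[sd]|</?[A-Za-z][^<>]*>")
TOKEN = re.compile(r"__PH(\d+)__")


def protect(text: str) -> Tuple[str, List[str]]:
    """Replace each placeholder with an opaque token and remember the original."""
    found: List[str] = []

    def stash(match: "re.Match[str]") -> str:
        found.append(match.group(0))
        return f"__PH{len(found) - 1}__"

    return PLACEHOLDER.sub(stash, text), found


def restore(text: str, found: List[str]) -> str:
    """Put the original placeholders back in place of their tokens."""
    return TOKEN.sub(lambda m: found[int(m.group(1))], text)


def translate_payload(
    node: Any,
    translate: Callable[[str], str],
    skip_keys: frozenset = frozenset(),
) -> Any:
    """Walk a JSON-like value and translate string values only.

    Keys are never translated, non-string values pass through unchanged,
    and values under any key in `skip_keys` are left untouched.
    """
    if isinstance(node, dict):
        return {
            key: value if key in skip_keys
            else translate_payload(value, translate, skip_keys)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [translate_payload(item, translate, skip_keys) for item in node]
    if isinstance(node, str):
        masked, found = protect(node)
        return restore(translate(masked), found)
    return node


if __name__ == "__main__":
    payload = {
        "id": "msg_welcome",
        "title": "Hello {name}",
        "items": [{"label": "You have <b>%d</b> new messages", "count": 3}],
    }
    # str.upper is a stand-in for a real model call.
    print(translate_payload(payload, str.upper, skip_keys=frozenset({"id"})))
```

A real model may still alter or drop the masking tokens, so a production version would want to check that every token survives the round trip before restoring. The README's own structured-data prompt template asks the model to do this preservation itself; the sketch is a complementary guard, not a replacement for it.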

Copy-paste prompts

Prompt 1

Set up Hy-MT2-1.8B-GGUF in llama.cpp and run a Chinese to English translation with the default prompt template

Prompt 2

Compare Hy-MT2-7B-FP8 to Hy-MT2-30B-A3B-FP8 on a sample of my own documents using vLLM

Prompt 3

Wire the Hy-MT2 terminology prompt template into a glossary-driven CAT-tool style script for product strings

Prompt 4

Quantize Hy-MT2-1.8B to 1.25-bit using AngelSlim and measure the speedup vs the FP16 build on my hardware

Prompt 5

Run the IFMTBench from Hy-MT2 against an OpenAI translation prompt and report the gap

Verify against the repo before relying on details.