moonshotai/kimi-k2

★ 10,765Audience · researcherComplexity · 5/5LicenseSetup · hard

Mindmap

mindmap
  root((Kimi K2))
    Architecture
      Mixture of experts
      1T total params
      32B active params
    Training
      15.5T tokens
      MuonClip optimizer
      128K context
    Versions
      Base model
      Instruct model
    Use cases
      Agentic tasks
      Tool use
      Code generation
    Audience
      AI researchers
      ML engineers

mindmap root((Kimi K2)) Architecture Mixture of experts 1T total params 32B active params Training 15.5T tokens MuonClip optimizer 128K context Versions Base model Instruct model Use cases Agentic tasks Tool use Code generation Audience AI researchers ML engineers

Click or tap to explore — scroll the page freely

Things people build with this

USE CASE 1

Evaluate Kimi K2 against other frontier AI models on coding and tool-use benchmarks for a model selection decision.

USE CASE 2

Fine-tune the Kimi K2 base model weights on domain-specific data to build a specialized AI application.

USE CASE 3

Deploy the instruct model as an AI agent that can plan and use tools across multiple steps to complete complex tasks.

Tech stack

PythonHugging Face

Getting it running

Difficulty · hard Time to first run · 1day+

Requires substantial GPU infrastructure to run locally, most users will access the model via Hugging Face Inference API or a cloud provider.

Available under a modified MIT license, review the specific terms on the Hugging Face model page before commercial deployment.

In plain English

Kimi K2 is a large language model released by Moonshot AI, a Chinese AI research company. This repository is the official release page for the model, containing documentation, benchmark results, and links to download model weights. The model itself is not code you run locally in any typical sense, it is a very large AI system that requires significant computing infrastructure to deploy. The architecture is what is called a mixture-of-experts model, which means it has 1 trillion total parameters but only activates 32 billion of them for any given input. This design lets the model remain computationally tractable despite its scale. It was trained on 15.5 trillion tokens of text using a custom optimizer the team developed called MuonClip. The model supports a context window of 128,000 tokens, meaning it can process very long documents in a single request. Two versions are released: the base model for researchers who want to fine-tune it for specific applications, and an instruct model that has been further trained to follow instructions and is suited for general chat and agentic use. The instruct version is described as a reflex-grade model, meaning it responds directly without an extended reasoning step. The design emphasis is on agentic tasks, which means tasks where the model needs to use tools, plan across steps, and act on its own to reach a goal rather than just answering a single question. Benchmark comparisons in the README show the model performing competitively against other frontier models on coding and tool-use tasks. The model weights are available on Hugging Face under a modified MIT license. This repository is primarily of interest to AI researchers, ML engineers, and teams evaluating large language models for deployment in agentic or coding-focused applications.

Copy-paste prompts

Prompt 1

I want to load the Kimi K2 instruct model from Hugging Face and run a quick inference test. Show me the Python code using the transformers library, assuming I have enough GPU memory.

Prompt 2

How does Kimi K2's mixture-of-experts architecture work, and what does it mean that only 32 billion of its 1 trillion parameters are active at a time? Explain in plain terms.

Prompt 3

I want to use Kimi K2 as an agentic AI that can call tools. Show me how to set up a simple tool-use loop using the transformers library and a custom Python function as a tool.

Prompt 4

What hardware do I need to run Kimi K2 locally? Give me a realistic spec for running the instruct model at reasonable speed.

Prompt 5

Compare Kimi K2 to other mixture-of-experts models like Mixtral or DeepSeek for coding tasks. What does the benchmark data in the model card show?

Open on GitHub → Explain another repo

← moonshotai on gitmyhub — every repo by this author, as a profile.

Verify against the repo before relying on details.