Study Transformer architecture, attention mechanisms, and models like LLaMA and ChatGLM to prepare for an AI engineering interview
Learn how RLHF fine-tuning and distributed training strategies work for large-scale language model development
Understand RAG systems and LLM hallucination causes from both a technical and interview-question perspective
This repository is a study guide and interview question collection for engineers working on large language models (LLMs). The content is written in Chinese and is aimed at people preparing for technical job interviews at AI companies, or at those who want a structured understanding of how modern language models are built and deployed.

The notes cover the full lifecycle of a language model in roughly ten topic areas. The first covers foundational concepts: how text is broken into tokens, how words are represented as numeric vectors, and how classic models like BERT work. The second goes deeper into model architecture, including the Transformer design that most modern LLMs are based on, various attention mechanisms, and specific models like LLaMA and ChatGLM. A section on training covers distributed training strategies, which are needed when a model is too large, or training too slow, for a single device.

Later sections address fine-tuning (adapting a general model to a specific task), inference frameworks (tools for running the model efficiently once training is done), and reinforcement learning from human feedback (RLHF), a technique for aligning model outputs with human preferences. There is also a section on retrieval-augmented generation (RAG), where the model is given passages retrieved from an external knowledge source at generation time to improve accuracy.

The guide includes a section on LLM hallucinations: cases where the model produces confident-sounding but incorrect output. This is covered from both a technical angle and a practical interview-question angle.

The author (wdndev) also links several companion repositories for hands-on practice, including a small Chinese-language LLM built from scratch, a simple RAG system, and an implementation of the LLaMA 3 architecture. These are separate projects for those who want to move from reading notes to running experiments.
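
To make the retrieval-augmented generation idea mentioned above more concrete, here is a minimal sketch of the retrieve-then-prompt step. It is not taken from the repository or its companion RAG project: the function names (`embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical, and the bag-of-words "embedding" is a toy stand-in for the learned encoder and vector database a real system would likely use.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. Assumption: stands in for a real encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing keys
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend retrieved passages to the question before it is sent to a model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    docs = [
        "LLaMA is a family of open language models",
        "RLHF aligns model outputs with human preferences",
        "Tokenizers split text into tokens",
    ]
    print(build_prompt("what does RLHF align", docs))
```

The prompt returned by `build_prompt` would then be passed to a language model; that call is left out here because the choice of model and API is outside what the summary describes.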
Verify against the repo before relying on details.