Quickly find the right open-source library for a specific LLM task such as RAG, fine-tuning, or structured output.
Use as a reference checklist when deciding which components to include in a new LLM application stack.
Discover lesser-known tools in categories like synthetic data generation, LLM safety, or prompt engineering.
This repository is a curated directory of more than 120 open-source libraries for building applications with large language models (LLMs), organized by category. LLMs are the type of AI system behind tools like ChatGPT: they generate, summarize, or respond to text. The directory is for engineers who want to know which library to reach for at each stage of building an LLM-based product.

The categories cover the full development cycle: training and fine-tuning models on custom data, building applications that use LLMs, retrieving information to augment responses (retrieval-augmented generation, or RAG), running models efficiently on hardware, serving them at scale to users, extracting structured data from text, generating synthetic training data, building autonomous AI agents, evaluating model quality, monitoring deployed systems, constructing prompts, enforcing structured output formats, and handling safety and security concerns.

Each entry gives the library's name, a one-line description, and a link to its GitHub repository. The collection does not contain tutorials or code examples of its own; it is purely a reference list.

The repository is maintained by Kalyan KS, who is active on LinkedIn, X (formerly Twitter), and YouTube under the same name. The maintainer also runs a free newsletter called AIxFunda and links to related repositories covering LLM interview questions, prompt engineering techniques, and a collection of survey papers on LLMs and related research areas. The full README is longer than what was shown.
Verify against the repo before relying on details.