Run a large open-source chat model entirely on your own GPU hardware without paying for a commercial API.
Fine-tune one of the included language models on your own conversational dataset for a specialized assistant.
Build a chat assistant that retrieves answers from a custom document collection rather than relying on training data alone.
Requires one or more GPUs with substantial VRAM, no CPU fallback is mentioned and models are too large for consumer hardware.
OpenChatKit is a collection of tools and code for building and running open-source AI chat models. It was created by Together, a company focused on distributed AI, in collaboration with LAION and Ontocord.ai. The goal is to give developers and researchers a starting point for creating their own conversational AI systems, either for general chat or specialized applications. The repository includes code for three different language models. The largest is GPT-NeoXT-Chat-Base-20B, a 20-billion-parameter model trained specifically for conversation. There is also Pythia-Chat-Base-7B, a smaller 7-billion-parameter chat model, and support for fine-tuning Llama-2-7B-32K-beta, a model with an unusually long context window that lets it handle longer documents. All of these were trained on the OIG-43M dataset, a large collection of instructional text assembled through the collaboration. The README walks through how to get started with Pythia-Chat-Base-7B, which involves setting up a Python environment using Conda, downloading the model weights from Hugging Face, and then running a command-line chat interface. Once running, you type messages and the model responds, maintaining conversation history so it can refer back to earlier parts of the exchange. For people who want to train or fine-tune their own version of one of the models, the repository includes training scripts and step-by-step instructions. After training, a conversion script transforms the saved model weights into a format compatible with the Hugging Face ecosystem, which is a widely used standard for sharing and loading language models. An experimental feature allows the chat model to pull in up-to-date information from a custom document index during conversations, so the model's responses can include content from sources beyond its original training data. All of this requires substantial computing hardware, typically one or more GPUs. The project is open source.
← togethercomputer on gitmyhub — every repo by this author, as a profile.
Verify against the repo before relying on details.