Ask finance questions and get answers that include the AI's step-by-step reasoning so you can verify the logic, not just the conclusion.
Build a finance chatbot that plugs into existing OpenAI-compatible front-ends with minimal changes.
Fine-tune a compact AI model on your own finance documents and deploy it locally without cloud API costs.
Create a document search index from financial reports and query them with an AI that retrieves relevant passages before answering.
Requires an NVIDIA RTX 4060 or equivalent GPU with 8 GB of VRAM. Before first use, install the Python dependencies and run the data preparation script.
Finance-DeepSeek is a question-answering system built for finance topics. It runs on a single consumer GPU (an RTX 4060 with 8 GB of VRAM) and returns answers that include the reasoning steps the model took to reach them, not just the final answer. The README is written in Chinese.

The system is built on DeepSeek-R1-Distill-Qwen-1.5B, a compact model derived from the much larger 671-billion-parameter DeepSeek-R1 through knowledge distillation. The idea is that the smaller model inherits some of the larger model's reasoning behavior without requiring the same hardware. The base model is downloaded automatically from Hugging Face on first run.

Two techniques work together to improve answer quality. The first is QLoRA, which trains small adapter weights on top of a 4-bit quantized base model so that fine-tuning on finance-specific data fits in limited GPU memory. The second is RAG (retrieval-augmented generation), where relevant documents are retrieved and inserted into the prompt before the model generates an answer. The document index is built with FAISS and a financial text embedding model. Users can choose among three modes: answering from the model alone, answering with retrieved context, or answering with retrieved context plus structured reasoning output.

The model's responses often contain a thinking section (wrapped in <think> tags) before the final answer. The system parses this automatically and can stream both parts back to the caller in order, so a front-end can display the reasoning as it arrives. The API follows the OpenAI chat completions format, so tools built for OpenAI's API can talk to it with minimal changes.

Setup involves cloning the repository, installing the Python dependencies, and running a data preparation script that generates training data and builds the vector index. Fine-tuning is optional and can be run locally. A Docker Compose configuration is also included for containerized deployment. The project is MIT licensed.
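The parsing of the thinking section described above can be illustrated with a minimal sketch. This is not the repository's code: the names `ParsedReply` and `split_reasoning` are hypothetical, and the sketch assumes the reasoning is wrapped in a single `<think>...</think>` pair, as DeepSeek-R1 distilled models typically emit it.

```python
from typing import NamedTuple

OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


class ParsedReply(NamedTuple):
    """Hypothetical container for the two parts of a model reply."""
    reasoning: str
    answer: str


def split_reasoning(text: str) -> ParsedReply:
    """Split a raw model reply into its reasoning and final answer.

    Assumption: at most one <think>...</think> block per reply.
    """
    start = text.find(OPEN_TAG)
    if start == -1:
        # No thinking section: the whole reply is the answer.
        return ParsedReply("", text.strip())

    body_start = start + len(OPEN_TAG)
    end = text.find(CLOSE_TAG, body_start)
    if end == -1:
        # Tag opened but never closed (e.g. generation was cut off):
        # treat everything after the tag as reasoning, with no answer yet.
        return ParsedReply(text[body_start:].strip(), "")

    reasoning = text[body_start:end].strip()
    # The answer is whatever surrounds the thinking block.
    answer = (text[:start] + text[end + len(CLOSE_TAG):]).strip()
    return ParsedReply(reasoning, answer)
```

A caller could pass the full generated text to `split_reasoning` and render the `reasoning` and `answer` fields separately. A streaming server would need an incremental version that tracks whether the closing tag has been seen yet, which this sketch does not attempt.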
Verify against the repo before relying on details.