explaingit

prakhar2295/ai_fraud_detection_chatbot

0 · Python · Audience · developer · Complexity · 4/5 · Active · Setup · hard

TLDR

Voice AI pipeline for banking fraud detection that transcribes calls, runs them through a local LLM and LangGraph workflow, then replies via Piper TTS.

Mindmap

mindmap
  root((FraudVoiceAI))
    Inputs
      WebSocket PCM16 audio
      Microphone stream
      Session state
    Outputs
      Transcript updates
      Fraud risk score
      Spoken AI reply
    Use Cases
      Detect scam calls
      Voice agent demo
      Persistent fraud memory
    Tech Stack
      Python
      LangGraph
      Ollama
      Piper
      ChromaDB

Things people build with this

USE CASE 1

Prototype a voice agent that flags banking fraud during a call

USE CASE 2

Build a streaming STT to LLM to TTS pipeline with LangGraph

USE CASE 3

Store fraud patterns in ChromaDB for cross-session retrieval

Tech stack

Python · FastAPI · LangGraph · Ollama · Piper · ChromaDB · Uvicorn

Getting it running

Difficulty · hard · Time to first run · 1h+

Needs Python 3.14, a running Ollama instance, and a Piper CLI on PATH for real voice output; falls back to placeholder audio otherwise.
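The Piper fallback described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function names `synthesize` and `placeholder_wav`, the half second of silence, and the 16 kHz sample rate are all assumptions made for the example. It relies only on Piper's documented `--model` and `--output_file` options, with the text passed on stdin.

```python
import io
import os
import shutil
import subprocess
import tempfile
import wave


def placeholder_wav(seconds: float = 0.5, sample_rate: int = 16000) -> bytes:
    """Return a mono PCM16 WAV of silence (hypothetical stand-in audio)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = PCM16
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()


def synthesize(text: str, model_path: str, piper_bin: str = "piper") -> bytes:
    """Return WAV bytes from the Piper CLI, or placeholder audio if it is missing."""
    executable = shutil.which(piper_bin)
    if executable is None:
        # Piper is not on PATH: fall back instead of failing the whole turn.
        return placeholder_wav()

    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "reply.wav")
        # Piper reads the text to speak from stdin and writes a WAV file.
        subprocess.run(
            [executable, "--model", model_path, "--output_file", out_path],
            input=text.encode("utf-8"),
            check=True,
        )
        with open(out_path, "rb") as wav_file:
            return wav_file.read()
```

Checking for the binary once per call keeps the example simple; a long-running server would more likely resolve the path at startup and log which mode it is in.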

In plain English

This repository contains a Banking Fraud Detection Voice AI system written in Python. The aim, as described in the README, is to take a voice conversation, transcribe it, run it through an AI model to look for signs of banking fraud, and reply back to the caller using a synthesized voice. The whole thing is split into multiple phases, each with its own backend README, and the main page focuses on Phase 4 and Phase 7.

In Phase 1, the project began as an offline pipeline: a WAV audio file was passed through speech-to-text, the text was sent to a local large language model running in Ollama, and the model produced a fraud reasoning response. Phase 2 added microphone input and near-realtime streaming. Phase 3 introduced a graph-based workflow using LangGraph, which is a library for orchestrating steps in a deterministic order, covering intent detection, fraud analysis, risk scoring, and memory.

Phase 4, the main focus of this README, adds spoken replies. A new TTS layer based on Piper, an open-source text-to-speech engine, turns the AI's response into audio. There is a queue-safe playback manager, turn management to track who is speaking, and the start of interruption support so the user can talk over the bot. A new conversation coordinator wires together speech-to-text, the LangGraph workflow, and the TTS playback.

Phase 7 adds long-term memory and vector-based retrieval of past fraud patterns using ChromaDB, with a fallback to an in-memory store when ChromaDB is not installed.

The README documents how to install dependencies from requirements.txt, run the server with Uvicorn on port 8000, connect a WebSocket client at /ws/voice/<session_id>, and send PCM16 audio frames along with control messages like flush, stop, and ping.
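To make the WebSocket flow concrete, here is a hedged client-side sketch. The endpoint path and the control message names (flush, stop, ping) come from the summary above; everything else is an assumption for illustration, including the 16 kHz sample rate, the 20 ms frame size, the JSON shape `{"type": ...}` for control messages, and the helper names. It uses the third-party `websockets` package, imported lazily so the framing helpers work without it.

```python
import json
from typing import Iterator

SAMPLE_RATE = 16000    # assumed; check the repo for the rate the server expects
BYTES_PER_SAMPLE = 2   # PCM16 means 16-bit signed samples


def pcm16_frames(pcm: bytes, frame_ms: int = 20,
                 sample_rate: int = SAMPLE_RATE) -> Iterator[bytes]:
    """Split raw PCM16 mono audio into fixed-duration frames."""
    frame_bytes = sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


def control_message(kind: str) -> str:
    """Build a control message; the JSON layout here is hypothetical."""
    if kind not in {"flush", "stop", "ping"}:
        raise ValueError(f"unknown control message: {kind}")
    return json.dumps({"type": kind})


async def stream(session_id: str, pcm: bytes, host: str = "localhost:8000") -> None:
    """Send audio frames, flush, print one server reply, then stop."""
    import websockets  # third-party: pip install websockets

    uri = f"ws://{host}/ws/voice/{session_id}"
    async with websockets.connect(uri) as ws:
        for frame in pcm16_frames(pcm):
            await ws.send(frame)  # bytes go out as a binary WebSocket frame
        await ws.send(control_message("flush"))  # str goes out as a text frame
        print(await ws.recv())
        await ws.send(control_message("stop"))
```

With the server running under Uvicorn on port 8000, something like `asyncio.run(stream("demo", pcm))` would exercise it, where `pcm` holds raw PCM16 bytes read from a WAV file. The actual control message format should be confirmed against the repository before relying on this.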

Copy-paste prompts

Prompt 1
Walk me through setting up Ollama and Piper to run this fraud chatbot locally
Prompt 2
Show me how ConversationCoordinator wires STT, LangGraph and TTS together
Prompt 3
Add a new LangGraph node that checks the caller against a deny list
Prompt 4
Write a small Python WebSocket client that streams a WAV file to /ws/voice and prints responses

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.