Build a real-time event processor in Python that reads from a Kafka topic and reacts to each message as it arrives rather than in batches.
Create a distributed key/value store in your stream app that tracks windowed counts like clicks in the last hour with automatic expiry.
Process live event streams using familiar Python libraries like Pandas or NumPy alongside the stream processing logic.
Replace a scheduled batch job with a continuously running pipeline that updates results in real time.
Requires a running Apache Kafka cluster. This repository is deprecated; use the community faust-streaming fork for new projects.
Faust is a Python library that lets developers build systems that process continuous streams of data, reading events as they arrive rather than working on batches after the fact. It was built by Robinhood and used internally to handle billions of events per day across distributed systems and real-time data pipelines. The library is now deprecated and no longer maintained by Robinhood, but an active community-maintained fork continues at a separate GitHub repository.
The core idea comes from Kafka Streams, a Java-based stream processing tool, but Faust brings that approach to plain Python. You connect it to Apache Kafka, a messaging system that acts as a high-throughput queue, and then write ordinary Python functions that react to each incoming message. Because it uses Python's async features, those functions can also make web requests or run other background work without blocking the stream.
Faust includes a built-in distributed key/value store called Tables. These work like Python dictionaries in your code, but the data is stored on disk using RocksDB (a fast embedded database) and replicated across nodes in your cluster. If one machine fails, another picks up where it left off automatically. Tables also support time-based windowing, so you can track counts like "clicks in the last hour" and let older windows expire on their own.
Because it is just Python, Faust works alongside any library you already use: NumPy, Pandas, Django, Flask, or anything else. Models describe how messages are serialized, using Python type annotations to define the shape of expected data. The library is statically typed and works with the mypy type checker, which can catch errors before you run anything. Faust requires Python 3.6 or later. Given the deprecation notice, new projects should consider the community fork rather than this repository.
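To make the windowing idea concrete, here is a minimal sketch in plain Python using only the standard library. It is not Faust's API and does not talk to Kafka: an asyncio.Queue stands in for a topic, and the WindowedCounts class, its methods, and the run_demo helper are hypothetical names invented for this illustration. It shows one async function reacting to each event as it arrives while counting clicks per one-hour tumbling window and dropping windows once they are old enough.

```python
import asyncio
from collections import defaultdict

WINDOW_SECONDS = 3600    # size of each tumbling window (one hour)
EXPIRES_SECONDS = 3600   # how long a closed window is kept before removal


class WindowedCounts:
    """In-memory stand-in for a windowed table (hypothetical, not Faust's Table)."""

    def __init__(self, size, expires):
        self.size = size
        self.expires = expires
        # Maps window start time -> key -> count.
        self.windows = defaultdict(lambda: defaultdict(int))

    def _window_start(self, timestamp):
        # Tumbling windows: each timestamp belongs to exactly one window.
        return int(timestamp // self.size) * self.size

    def add(self, key, timestamp):
        self.windows[self._window_start(timestamp)][key] += 1
        self._expire(timestamp)

    def count(self, key, timestamp):
        # .get avoids creating empty windows as a side effect of reading.
        return self.windows.get(self._window_start(timestamp), {}).get(key, 0)

    def _expire(self, now):
        # Remove windows that closed at least `expires` seconds before `now`.
        stale = [s for s in self.windows if s + self.size + self.expires <= now]
        for start in stale:
            del self.windows[start]


async def process(events, table):
    """React to each (url, timestamp) event as it arrives; None ends the stream."""
    while True:
        event = await events.get()
        if event is None:
            break
        url, timestamp = event
        table.add(url, timestamp)


async def _demo():
    events = asyncio.Queue()  # stand-in for a Kafka topic
    table = WindowedCounts(WINDOW_SECONDS, EXPIRES_SECONDS)
    for event in [("/home", 10), ("/home", 20), ("/about", 30), ("/home", 3700), None]:
        events.put_nowait(event)
    await process(events, table)
    return table


def run_demo():
    return asyncio.run(_demo())
```

Calling run_demo() returns a table in which the first three events fall into the window starting at 0 and the last into the window starting at 3600. In a real Faust application the table would be persisted and shared across workers rather than held in a single process, so treat this only as a picture of the counting and expiry logic.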
Verify against the repo before relying on details.