explaingit

redpanda-data/connect

8,669 · Go · Audience · ops devops · Complexity · 3/5 · Setup · moderate

TLDR

A data pipeline tool that moves and transforms messages between systems like Kafka, AWS, and databases using a single YAML config file, with no custom code required for most integrations.

Mindmap

mindmap
  root((Redpanda Connect))
    What it does
      Data pipelines
      Stream processing
      Message routing
    Tech Stack
      Go binary
      YAML config
      Bloblang
      Docker
    Supported Sources
      Kafka
      AWS SQS S3
      MQTT Redis
    Features
      At-least-once delivery
      Prometheus metrics
      Custom Go plugins


Things people build with this

USE CASE 1

Stream messages from an SQS queue, transform them with Bloblang, and write results to PostgreSQL.

USE CASE 2

Build a real-time pipeline from Kafka to Elasticsearch without writing Go code.

USE CASE 3

Run a self-contained binary that forwards MQTT messages to a cloud storage bucket with at-least-once delivery.

Tech stack

Go · YAML · Docker · Bloblang

Getting it running

Difficulty · moderate · Time to first run · 30 min

Requires access to at least one external system (Kafka, AWS, database) to run a meaningful pipeline.

In plain English

Redpanda Connect is a data pipeline tool that moves and transforms data between different systems. If you have data arriving from one place, say a message queue or a cloud storage bucket, and you need to clean it up, reformat it, or enrich it before sending it somewhere else, this tool handles that flow. It describes itself as a stream processor: it takes in a continuous stream of messages, applies whatever transformations you configure, and passes the results to an output.

Configuration is done in a single YAML file. You specify an input (where data comes from), a list of processing steps (what to do to each message), and an output (where the result goes). The tool includes a built-in mapping language called Bloblang for writing transformations, which lets you reshape and compute values from each message without writing custom code in a general-purpose programming language.

The list of supported sources and destinations is long: AWS services like S3, SQS, Kinesis, and DynamoDB; Azure and GCP storage and messaging services; Kafka, Redis, RabbitMQ, and MQTT; MongoDB, PostgreSQL, and MySQL; Elasticsearch; and more. This breadth means you can often connect two systems you already use without writing a custom integration from scratch.

Reliability is a stated design goal. Redpanda Connect processes and acknowledges messages using an in-process transaction model that does not require disk-persisted state. When it is connected to sources and sinks that themselves support at-least-once semantics, it can guarantee at-least-once delivery even if the process crashes. It also exposes health check endpoints and emits metrics compatible with Prometheus and StatsD, making it straightforward to monitor in a production environment.

It can run as a static binary on Linux or Mac, via Homebrew, or as a Docker image. Custom plugins can be written in Go for cases where the built-in processors do not cover a specific need. Full documentation is available on the Redpanda documentation site.
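The input, processing steps, and output described above can be sketched as a small config. This is a minimal illustration, not taken from the repo: it assumes each line on stdin is a JSON object with a string field named event_type (a hypothetical field chosen for the example), and it uses stdin and stdout so that no external system is needed.

```yaml
# Minimal sketch of the three-part config shape: input, pipeline, output.
# Assumption: each input line is a JSON object with a string "event_type" field.
input:
  stdin: {}

pipeline:
  processors:
    - mapping: |
        # Bloblang: copy the original message, then adjust fields.
        root = this
        root.event_type = this.event_type.uppercase()
        root.processed_at = now()

output:
  stdout: {}
```

In a real pipeline the stdin and stdout blocks would be swapped for a configured source and destination, such as a Kafka input and a database output. If event_type is missing or not a string, the mapping fails for that message, so a production config would normally add error handling. Check the current documentation for the exact field names of each component before relying on this.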

Copy-paste prompts

Prompt 1
Write a Redpanda Connect YAML config that reads from an AWS SQS queue, extracts a JSON field called 'event_type', and writes matching messages to a PostgreSQL table.
Prompt 2
How do I use Bloblang in Redpanda Connect to rename fields and drop null values before sending to an output?
Prompt 3
Create a Redpanda Connect pipeline that consumes from a Kafka topic, deduplicates messages by ID, and forwards them to Redis.
Prompt 4
How do I run Redpanda Connect as a Docker container with a custom YAML config mounted as a volume and Prometheus metrics enabled?

Verify against the repo before relying on details.