explaingit

okfn-brasil/serenata-de-amor

4,589 · Python · Audience · researcher · Complexity · 3/5 · Setup · moderate

TLDR

A Brazilian civic tech project that uses machine learning to scan politicians' expense claims and automatically flag suspicious spending for public scrutiny.

Mindmap

mindmap
  root((serenata-de-amor))
    What it does
      Scans expense claims
      Flags suspicious spending
      Posts findings publicly
    Components
      Rosie ML analyzer
      Jarbas web browser
      Community notebooks
    Data sources
      Government open data
      Chamber of Deputies
      Federal Senate
    Audience
      Citizens
      Journalists
      Civic developers
    Tech
      Python
      Machine learning
      Cloud hosting

Things people build with this

USE CASE 1

Browse flagged government expense claims through the Jarbas web interface to investigate questionable spending by Brazilian lawmakers.

USE CASE 2

Run Rosie to analyze a fresh batch of public expense data and automatically surface statistically unusual claims.

USE CASE 3

Extend the notebooks with custom analysis to find spending patterns across legislators, regions, or spending categories.

USE CASE 4

Use this project as a reference architecture for building civic transparency tools that combine open government data with ML anomaly detection.

Tech stack

Python · Machine Learning · Jupyter Notebooks

Getting it running

Difficulty · moderate · Time to first run · 1h+

Requires downloading large government datasets and configuring cloud credentials to run Rosie's analysis pipeline.

An open-source civic tech project. Check the repository for the license; contributions are welcome via Discord.

In plain English

Operação Serenata de Amor is a Brazilian civic technology project that uses machine learning to scan public records of government spending and flag suspicious expenses. The main focus is on the expense claims submitted by members of the Brazilian Chamber of Deputies and Federal Senate, the lawmakers who represent the public. The project makes this data accessible and understandable for ordinary citizens.

The heart of the project is a system called Rosie, an automated tool that analyzes expense data and identifies claims that look unusual, such as a meal receipt at an implausible price or a travel expense for a destination that does not match the legislator's schedule. Rosie has a Twitter account where findings get posted publicly. A separate web tool called Jarbas lets anyone browse these expenses and see which ones have been flagged, serving as the interface where citizens can investigate further and, if they choose, contact their representatives about questionable spending.

The technical side is built in Python and hosted on cloud servers. The expense data comes from open government data portals. Rosie is run manually about once a month, and the Jarbas website stays online continuously so the public can always access the data. The codebase is split across several GitHub repositories: this main one holds Rosie and Jarbas, while a separate installable package handles dataset generation and a collection of community notebooks contains exploratory analysis.

The project was founded in 2016 by a small team and grew with contributions from the open-source and civic tech communities in Brazil. It is part of the Data Science for Civic Innovation Programme run by Open Knowledge Brasil, an organization that promotes open data and civic participation.

The README notes that the project is no longer receiving frequent updates, as the team has moved energy toward other initiatives. People interested in active collaboration are directed to a related project called Querido Diário, which works with official government publications. Contributions to fix bugs or make improvements are still welcome through the project's Discord.
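
To make the idea of an "implausible price" concrete, here is a minimal sketch of one simple way to flag unusually high claims within a spending category, using the interquartile range (IQR). This is not Rosie's actual method, which this page does not describe; the function name `flag_outliers`, the dictionary keys, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def flag_outliers(expenses, k=1.5):
    """Return expenses whose amount exceeds Q3 + k * IQR for their category.

    `expenses` is a list of dicts with hypothetical keys
    "legislator", "category", and "amount".
    """
    # Group amounts by category so each claim is compared with its peers.
    by_category = defaultdict(list)
    for row in expenses:
        by_category[row["category"]].append(row["amount"])

    flagged = []
    for row in expenses:
        amounts = by_category[row["category"]]
        if len(amounts) < 4:
            continue  # too few claims to estimate quartiles meaningfully
        q1, _, q3 = quantiles(amounts, n=4)
        upper = q3 + k * (q3 - q1)
        if row["amount"] > upper:
            reason = (
                f"{row['category']} amount {row['amount']:.2f} "
                f"exceeds upper fence {upper:.2f}"
            )
            flagged.append({**row, "reason": reason})
    return flagged


# Made-up sample data: eight ordinary meal claims and one extreme one.
meal_amounts = [38.0, 42.0, 47.0, 55.0, 61.0, 44.0, 50.0, 58.0, 980.0]
sample = [
    {"legislator": name, "category": "meal", "amount": amount}
    for name, amount in zip("ABCDEFGHI", meal_amounts)
]
flagged = flag_outliers(sample)
for row in flagged:
    print(row["legislator"], row["reason"])
```

In this sample only the 980.00 claim is flagged. A per-category fence is a crude baseline: it knows nothing about receipts, dates, or travel schedules, so a flagged row is a prompt for human review, not proof of wrongdoing.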

Copy-paste prompts

Prompt 1
I want to build a tool like Rosie that flags suspicious expense claims. Using Python and scikit-learn, show me how to train an anomaly detection model on a CSV of government expenses with columns: legislator, category, amount, date.
Prompt 2
Given a pandas DataFrame of public expense records with columns for amount and category, write a Python function that identifies statistical outliers using IQR and returns flagged rows with a reason string.
Prompt 3
I'm building a civic transparency web app. Show me how to set up a simple Django or Flask backend that reads from a Postgres database of flagged expenses and serves a paginated JSON API.
Prompt 4
Write a Python script that fetches expense data from a government open data API, cleans column names, and saves it as a CSV file ready for ML analysis.
Prompt 5
Show me how to post an automated tweet using Python Tweepy when a new suspicious expense is detected, including the legislator name, category, and flagged amount.
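
For the column-cleaning step mentioned in Prompt 4, a minimal standard-library sketch might look like the following. It assumes the source CSV has Portuguese headers with accents and spaces; the header names and the helpers `clean_column_name` and `clean_csv` are hypothetical and not taken from the project's datasets or code.

```python
import csv
import io
import re
import unicodedata


def clean_column_name(name):
    """Normalize a header to ASCII snake_case."""
    # Decompose accented characters, then drop the combining marks.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^0-9a-zA-Z]+", "_", ascii_name).strip("_").lower()


def clean_csv(raw_text):
    """Rewrite CSV text with cleaned headers; data rows pass through unchanged."""
    reader = csv.reader(io.StringIO(raw_text))
    header = [clean_column_name(col) for col in next(reader)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(reader)
    return out.getvalue()


# Hypothetical input with accented, space-separated headers.
raw = "Nome do Parlamentar,Descrição,Valor Líquido\nFulano,Alimentação,42.00\n"
cleaned = clean_csv(raw)
print(cleaned)
```

Only the header row is rewritten here; values such as category names keep their original accents so no information is lost before analysis.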

Verify against the repo before relying on details.