explaingit

humancompatibleai/pareto

Analysis updated 2026-05-18

3 stars | Jupyter Notebook | Audience · researcher | Complexity · 2/5 | Setup · easy

TLDR

A research dataset from a large-scale human study measuring whether AI responses to political questions receive balanced approval across political viewpoints, with Jupyter notebooks to reproduce the paper's figures.

Mindmap

mindmap
  root((repo))
    What it is
      Research dataset
      AI neutrality study
      Human evaluation
    Data Included
      Model responses CSV
      Survey ratings
      Participant demographics
    Analysis Code
      Jupyter notebooks
      PCA and correlations
      Paper figures
    Research Context
      Multiple AI models
      Prolific participants
      Political questions

What do people build with it?

USE CASE 1

Reproduce the figures and statistical analysis from the PARETO paper on AI political neutrality

USE CASE 2

Study how different demographic groups rated AI responses to political questions using the survey and demographic CSV data

USE CASE 3

Use the PARETO dataset as a benchmark or comparison baseline in new research on AI political bias or neutrality
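
As a rough illustration of use case 2, the sketch below joins ratings to demographics and compares mean approval between groups. The rows are synthetic, and the column names (`participant`, `response_id`, `approval`, `leaning`) are assumptions made for this example, not the dataset's actual schema. The `gap` column is only a simple stand-in for "balanced approval" and should not be read as the paper's metric.

```python
import pandas as pd

# Synthetic stand-in rows; the real files and column names may differ.
ratings = pd.DataFrame({
    "participant": ["p1", "p2", "p3", "p4"],
    "response_id": ["r1", "r1", "r1", "r1"],
    "approval": [5, 4, 2, 3],
})
demographics = pd.DataFrame({
    "participant": ["p1", "p2", "p3", "p4"],
    "leaning": ["left", "left", "right", "right"],
})


def approval_by_group(ratings, demographics, group_col="leaning"):
    """Mean approval per response, with one column per demographic group."""
    merged = ratings.merge(demographics, on="participant", how="inner")
    means = merged.groupby(["response_id", group_col])["approval"].mean()
    return means.unstack(group_col)


table = approval_by_group(ratings, demographics)
# Absolute difference in mean approval between the two hypothetical groups.
table["gap"] = (table["left"] - table["right"]).abs()
print(table)
```

A smaller gap means the two groups approved of a response to a similar degree. Check the real CSV headers in the repo before adapting the column names.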

What is it built with?

Python, Jupyter Notebook, pandas

How does it compare?

| | humancompatibleai/pareto | abdurrafey237/rag-chatbot | jamisriram/academic-rag-assistant |
| --- | --- | --- | --- |
| Stars | 3 | 3 | 0 |
| Language | Jupyter Notebook | Jupyter Notebook | Jupyter Notebook |
| Setup difficulty | easy | moderate | easy |
| Complexity | 2/5 | 3/5 | 2/5 |
| Audience | researcher | general | developer |

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy | Time to first run · 30 min

Run the Jupyter notebooks to reproduce the figures. They need standard Python data science packages such as pandas, matplotlib, and scikit-learn.
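
Before opening the notebooks, a quick check like the following can confirm that the packages named above are importable. This is a minimal sketch: the package list mirrors the three examples given here and may not be the notebooks' full requirements, and `missing_packages` is a helper name invented for this example.

```python
from importlib.util import find_spec

# pip name -> import name; scikit-learn is imported as "sklearn".
# Assumed list: only the packages mentioned above, not a verified requirements file.
REQUIRED = {
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages that cannot be found for import."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All listed packages are importable.")
```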

In plain English

PARETO is a research dataset released alongside an academic paper studying whether AI systems can respond to politically sensitive questions in a way that feels fair to people across the political spectrum. The central question the paper explores is what political neutrality actually means for an AI: not simply avoiding strong opinions, but producing answers that people on different sides of political issues find roughly equally acceptable.

The dataset contains responses from multiple AI models to a set of politically framed questions. Each response was shown to a large group of human survey participants (recruited through the Prolific research platform) who rated how much they approved of what the AI said. The survey also collected demographic information and written qualitative feedback. Prolific IDs were hashed to a consistent anonymized identifier to protect participant privacy while keeping responses linkable across the data files.

The repository is organized into four main areas: raw AI model responses stored as CSV files, the same response pairings formatted as PNG images for display in the survey interface, the survey results (including numeric ratings and free-text comments from participants), and Jupyter notebooks that reproduce the charts and figures used in the published paper. The analysis code uses principal component analysis and correlation statistics to examine patterns in how different demographic groups rated AI responses.

This is not a tool to run or install. It is a data archive for researchers who want to study AI political neutrality, reproduce the paper's findings, or use the survey methodology as a starting point for related work.
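
The anonymization described above, hashing Prolific IDs to a consistent identifier, could work along the lines of this sketch. The salted SHA-256 scheme, the `SALT` value, and the `anonymize_id` helper are all assumptions made for illustration; the scheme actually used for the dataset is not described here.

```python
import hashlib

# Hypothetical secret salt; it would be kept private and never published.
SALT = "replace-with-a-secret-value"


def anonymize_id(prolific_id: str, salt: str = SALT) -> str:
    """Map a participant ID to a stable pseudonymous identifier."""
    digest = hashlib.sha256((salt + prolific_id).encode("utf-8")).hexdigest()
    # Truncated for readability; keeping the full digest lowers collision risk.
    return digest[:16]


# The same input always yields the same identifier, so rows in different
# files (hypothetical examples below) stay linkable without the raw ID.
ratings_row = {"participant": anonymize_id("abc123"), "rating": 4}
demographics_row = {"participant": anonymize_id("abc123"), "age_group": "25-34"}
print(ratings_row["participant"] == demographics_row["participant"])
```

Because the mapping is deterministic, the identifier works as a join key across files, while the salt makes it harder to recover IDs by hashing guesses.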

Copy-paste prompts

Prompt 1
What is the PARETO dataset measuring, and how does it define political neutrality as balanced approval across political viewpoints?
Prompt 2
How do I run the final_analyses.ipynb notebook to reproduce the PARETO paper's figures? What Python packages does it require?
Prompt 3
What data files are in the survey_data directory of the PARETO dataset, and how are participant responses linked to model responses across files?
Prompt 4
How were AI model responses presented to survey participants in the PARETO study? What format are the stimuli PNG files and how are they organized?

Frequently asked questions

What is pareto?

A research dataset from a large-scale human study measuring whether AI responses to political questions receive balanced approval across political viewpoints, with Jupyter notebooks to reproduce the paper's figures.

What language is pareto written in?

Mainly Jupyter Notebook. The stack also includes Python and pandas.

How hard is pareto to set up?

Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.

Who is pareto for?

Mainly researchers.

Verify against the repo before relying on details.