i-am-manware/dating-app-behavioural-analysis-for-secure-girls

Analysis updated 2026-06-24

★ 12 | Jupyter Notebook | Audience · researcher | Complexity · 3/5 | Setup · moderate

Mindmap

mindmap
  root((dating-analysis))
    Inputs
      Annotated profile dataset
      Rater scores
      Parquet files
    Outputs
      EDA figures
      Factor loadings
      ML model AUC
      Prescription table
    Use Cases
      Reproduce swipe study
      Try factor analysis on ratings data
      Practice SHAP and GAM modelling
    Tech Stack
      Python
      pandas
      scikit-learn
      XGBoost
      SHAP
      UMAP


What do people build with it?

USE CASE 1

Reproduce the dating-app swipe study and inspect the 16 EDA figures

USE CASE 2

Reuse the EFA, PCA, t-SNE, and UMAP latent-variable pipeline on a similar rated dataset

USE CASE 3

Compare eight classifier baselines plus a GAM and SHAP explanations on a small dataset

USE CASE 4

Study a worked example of within-cohort behavioural data analysis with a research-style report

What is it built with?

Python, pandas, scikit-learn, XGBoost, SHAP, UMAP

How does it compare?

	i-am-manware/dating-app-behavioural-analysis-for-secure-girls	2arons/lcel-forge	lfrincond/seismic_imaging26
Stars	12	11	13
Language	Jupyter Notebook	Jupyter Notebook	Jupyter Notebook
Setup difficulty	moderate	easy	hard
Complexity	3/5	2/5	4/5
Audience	researcher	developer	researcher

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate | Time to first run · 30 min

Notebooks must be run in order because each writes parquet files the next consumes, and the stack pulls in XGBoost, LightGBM, CatBoost, PyGAM, SHAP, factor-analyzer, and UMAP.
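The run-in-order requirement can be scripted. The sketch below is a minimal example and not part of the repository: it assumes the notebooks carry a numeric filename prefix (the names shown are hypothetical) and that `jupyter nbconvert` is installed. It sorts the notebooks by that prefix and executes them one at a time, stopping at the first failure so that a later notebook never reads missing or stale parquet files.

```python
import re
import subprocess
from pathlib import Path


def order_notebooks(names: list[str]) -> list[str]:
    """Sort notebook filenames by their leading number (assumed naming scheme)."""
    def leading_number(name: str) -> int:
        match = re.match(r"(\d+)", name)
        # Files without a numeric prefix are pushed to the end.
        return int(match.group(1)) if match else 10**9

    return sorted(names, key=lambda name: (leading_number(name), name))


def build_command(notebook: str) -> list[str]:
    """Command that executes one notebook in place with jupyter nbconvert."""
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", notebook]


def run_all(folder: str = ".") -> None:
    """Execute every notebook in `folder` sequentially, stopping on the first error."""
    names = [path.name for path in Path(folder).glob("*.ipynb")]
    for name in order_notebooks(names):
        # check=True raises CalledProcessError if the notebook fails to execute.
        subprocess.run(build_command(str(Path(folder) / name)), check=True)


# Hypothetical filenames, used only to show the ordering logic.
example = ["10_modelling.ipynb", "2_eda.ipynb", "1_cleaning.ipynb"]
print(order_notebooks(example))
```

Sorting on the parsed number rather than the raw string keeps `10_...` after `2_...`, which a plain alphabetical sort would get wrong. Calling `run_all()` from the repository root would then run the pipeline end to end, provided the real filenames follow a numeric-prefix scheme.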

In plain English

This project is a Jupyter notebook pipeline that takes a hand-annotated dataset of 123 male dating-app profiles and looks at which features predict a right swipe. The profiles were rated by five women described in the README as securely attached, and 23.6% of the profiles received a right swipe. The author treats the result as a within-cohort study, not a population average.

The repository is organised as five notebooks that run in order: data cleaning and parquet export; exploratory data analysis with 16 figures; deeper feature-level analysis; latent-variable analysis using exploratory factor analysis plus PCA, t-SNE, and UMAP; and a final modelling notebook with eight machine-learning models, SHAP, a GAM, and a prescription table.

The headline finding reported in the README is that two latent factors, labelled Psychological Safety and Visual Appeal, account for 99.3% of the swipe decisions in this dataset. The strongest individual predictor is the rater-inferred emotional_stability score. All eight models reach an AUC of 1.0 on the held-out test set, which the author attributes to high rater agreement rather than overfitting.

Several common beliefs are reported as not supported by the data. Height shows no statistically significant correlation with swipe outcome in this sample. Shirtless photos in the sample receive a 0% swipe rate. Status correlates with swipes in the raw data, but the association becomes non-significant after controlling for perceived attractiveness. The README also notes that photo quality and warmth matter more than the number of photos.

To reproduce the work, the README lists Python 3.10 or newer plus pandas, scikit-learn, XGBoost, LightGBM, CatBoost, PyGAM, SHAP, factor-analyzer, UMAP, openpyxl, and pyarrow. Notebooks must be run in sequence because each one writes parquet files the next one reads. A 13-section research-style report covering the methods, findings, and limitations is included as report.md.
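The status finding is an instance of a raw correlation disappearing once a third variable is held fixed. The README summary does not say which method the author used for the control, so the sketch below uses a first-order partial correlation as one common choice. The three lists are toy values constructed for illustration, not data from the repository, and the variable names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Toy data: status and swipe_score each equal attractiveness plus an
# independent component, so attractiveness is their only shared driver.
attractiveness = [-1, -1, -1, -1, 1, 1, 1, 1]
status = [-2, -2, 0, 0, 0, 0, 2, 2]
swipe_score = [-2, 0, -2, 0, 0, 2, 0, 2]

raw = pearson(status, swipe_score)
controlled = partial_corr(status, swipe_score, attractiveness)
print(f"raw r = {raw:.2f}, partial r = {abs(controlled):.2f}")
```

In this constructed case the raw correlation is 0.5 while the partial correlation is zero up to floating-point rounding, because everything status and swipe_score have in common runs through attractiveness. Real data would rarely separate this cleanly.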

Copy-paste prompts

Prompt 1

Walk me through the five notebooks in order and explain what parquet outputs each one produces

Prompt 2

Show me how the Psychological Safety and Visual Appeal factors are derived in the EFA notebook

Prompt 3

Adapt the modelling notebook to use a different held-out split and report AUC drop

Prompt 4

Run the SHAP analysis on the XGBoost model and surface the top five features

Prompt 5

Rewrite the prescription table cell so it exports to CSV instead of inline markdown
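For Prompt 3, it can help to know how the AUC figure behaves when a new split is tried. The sketch below is a minimal, dependency-free illustration that assumes binary labels (1 for a right swipe) and model scores. It is not the repository's code, which presumably relies on scikit-learn, and the labels and scores are made up.

```python
def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Made-up held-out set: two right swipes, three left swipes.
labels = [1, 0, 0, 1, 0]
separated = [0.9, 0.2, 0.1, 0.8, 0.3]     # every positive outscores every negative
overlapping = [0.9, 0.2, 0.1, 0.25, 0.3]  # one positive falls below one negative

print(auc(labels, separated))
print(round(auc(labels, overlapping), 3))
```

The first score list yields an AUC of 1.0 because all six positive/negative pairs are ordered correctly. The second yields 5/6, since a single misordered pair is enough to move the figure on a test set this small. That sensitivity is worth keeping in mind when comparing AUC across different held-out splits of a 123-profile dataset.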

Frequently asked questions

What is dating-app-behavioural-analysis-for-secure-girls?

A Jupyter notebook pipeline that analyses 123 hand-rated male dating profiles to identify which latent factors and features predict a right swipe from five raters.

What language is dating-app-behavioural-analysis-for-secure-girls written in?

Mainly Jupyter Notebook. The stack also includes Python, pandas, and scikit-learn.

How hard is dating-app-behavioural-analysis-for-secure-girls to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is dating-app-behavioural-analysis-for-secure-girls for?

Mainly researchers.


Verify against the repo before relying on details.