dariia-m/r-ladies-rome-text-analysis

Analysis updated 2026-06-24

★ 2 | JavaScript | Audience · data | Complexity · 2/5 | Setup · moderate

Mindmap

mindmap
  root((r-ladies-rome-text-analysis))
    Inputs
      EUvsDisinfo claims CSV
      Quarto slides
    Outputs
      Rendered HTML deck
      Wordclouds and plots
    Use Cases
      Learn tidytext
      Run local LLMs in R
      Topic modeling demo
    Tech Stack
      R
      tidytext
      mall
      Ollama
      Quarto


What do people build with it?

USE CASE 1

Learn the tidytext pipeline from sentences to word frequencies and sentiment

USE CASE 2

Reproduce dictionary sentiment analysis with AFINN, Bing, and NRC lexicons

USE CASE 3

Run local LLMs through Ollama for classification, summarization, and translation in R

USE CASE 4

Build LDA topic models and bigram network plots on a real dataset

What is it built with?

R · tidytext · mall · Ollama · Quarto · LDA

How does it compare?

	dariia-m/r-ladies-rome-text-analysis	901d3/ditherxyr.js	ash310u/awesome-ai-stack
Stars	2	2	2
Language	JavaScript	JavaScript	JavaScript
Last pushed	—	2026-06-20	—
Maintenance	—	Active	—
Setup difficulty	moderate	moderate	easy
Complexity	2/5	2/5	2/5
Audience	data	developer	vibe coder

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate | Time to first run · 30 min

You need to install Ollama, pull the llama3.2 model, and install ten R packages before rendering the slides.
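The setup steps can be sketched from an R session as follows. This is a minimal sketch, not the repo's own script: it assumes the ollama and quarto command-line tools are already installed and on your PATH, and the file name slides.qmd is a hypothetical placeholder for the repo's actual Quarto source.

```r
# The ten R packages listed in the README.
pkgs <- c("tidyverse", "tidytext", "stopwords", "wordcloud", "topicmodels",
          "igraph", "ggraph", "textdata", "mall", "ollamar")

# Install only the packages that are not already present.
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing_pkgs) > 0) install.packages(missing_pkgs)

# Pull the model with the Ollama CLI (assumes `ollama` is on the PATH).
system2("ollama", c("pull", "llama3.2"))

# Render the deck with the Quarto CLI.
# "slides.qmd" is a hypothetical file name; use the repo's actual source file.
system2("quarto", c("render", "slides.qmd"))
```

Running the same two commands (ollama pull llama3.2 and quarto render) directly in a terminal is equivalent.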

In plain English

This repository contains the slides and supporting files for a talk given at R-Ladies Rome in May 2026. The talk is titled "From Dictionaries to LLMs: Text Analysis in R", and was prepared by Dariia Mykhailyshyna of the Kyiv School of Economics. The README describes a 45-minute walkthrough of a full text-analysis pipeline written in R, and links to a published HTML version of the slide deck.

The example dataset for the talk is a public collection of pro-Russian disinformation claims tracked by EUvsDisinfo, downloaded from Kaggle and covering the period from January 2015 to January 2020. The repository keeps a copy of this data as data.csv alongside the source slides and the rendered deck.

The first half of the talk covers what the author calls the tidytext pipeline. This means breaking sentences into individual words using the tidytext package, removing common stopwords plus a custom list, and then producing word frequencies, wordclouds, and bar plots. From there the talk moves to dictionary-based sentiment analysis using the AFINN, Bing, and NRC lexicons, looking at how sentiment shifts over time. It then covers topic modeling with LDA, and bigram and word network plots that show which terms tend to appear together.

The second half moves to large language models through the R package called mall. The talk shows how to run local LLMs through Ollama, which avoids paying for an API, and walks through helper functions such as llm_sentiment, llm_classify, llm_extract, llm_summarize, llm_verify, llm_translate, and llm_custom. The author also discusses when a simple dictionary lookup is the right tool and when reaching for a language model is worth the extra cost.

To reproduce the slides, the README lists the R packages you need to install, including tidyverse, tidytext, stopwords, wordcloud, topicmodels, igraph, ggraph, textdata, mall, and ollamar. You then install Ollama, pull the llama3.2 model, and render the Quarto document with a single quarto render command. The repository also ships a smoketest.R script that runs every R chunk in one pass, which is useful when debugging the pipeline. The README closes with a pointer to Workshops for Ukraine, a charity R workshop series.
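The tidytext steps described above (tokenize, drop stopwords, count words, join a sentiment lexicon) might look roughly like this. It is a sketch on toy data, not the talk's code: the claims tibble, its column names id and text, and the custom stopword list are all hypothetical, and the Bing lexicon is used because it ships with tidytext, whereas AFINN and NRC are downloaded through textdata.

```r
library(dplyr)
library(tidytext)

# Toy stand-in for data.csv; the real column names are not known here.
claims <- tibble(
  id = 1:3,
  text = c("The report spreads false and dangerous claims",
           "A peaceful solution brings stability",
           "The claims are false")
)

# Hypothetical custom stopword list, applied on top of the built-in one.
custom_stop <- tibble(word = c("claims"))

# One row per word, lowercased, with stopwords removed.
tokens <- claims %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  anti_join(custom_stop, by = "word")

# Word frequencies, most common first.
word_freq <- tokens %>%
  count(word, sort = TRUE)

# Dictionary sentiment: count positive and negative words per document.
sentiment_by_doc <- tokens %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(id, sentiment)
```

The word_freq table is the kind of input a wordcloud or bar plot would use, and sentiment_by_doc could be aggregated by date to track sentiment over time.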
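For the LLM half, a call sequence with mall against a local Ollama model could be sketched as below. It assumes Ollama is running locally with llama3.2 already pulled. The toy tibble, the column name text, the label set, and the seed value are illustrative assumptions rather than anything taken from the slides, and model output will vary between runs and models.

```r
library(dplyr)
library(mall)

# Point mall at the local Ollama backend; the seed is an arbitrary choice
# intended to make results more repeatable.
llm_use("ollama", "llama3.2", seed = 100)

# Toy stand-in for the claims data; the column name `text` is hypothetical.
claims <- tibble(
  text = c("The vaccine programme was a secret plot against the nation",
           "Officials opened a new hospital in the capital this week")
)

# Each helper appends a prediction column to the data frame.
labelled <- claims %>%
  llm_sentiment(text) %>%
  llm_classify(text, labels = c("health", "politics", "military")) %>%
  llm_summarize(text, max_words = 8)
```

Because every row triggers a model call, this is far slower than a dictionary join, which is part of the trade-off the talk discusses.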

Copy-paste prompts

Prompt 1

Set up Ollama and the mall package locally and rerun the llm_sentiment example from this repo on my own CSV

Prompt 2

Adapt the tidytext stopwords and wordcloud code to a different news dataset

Prompt 3

Help me debug the smoketest.R script when one of the LDA chunks fails

Prompt 4

Walk me through when to use AFINN versus an Ollama llama3.2 model for sentiment

Frequently asked questions

What is r-ladies-rome-text-analysis?

Slides and code for an R-Ladies Rome talk on text analysis in R, covering the tidytext pipeline, dictionary sentiment, topic modeling, and local LLMs via the mall package.

What language is r-ladies-rome-text-analysis written in?

GitHub's metadata lists JavaScript as the main language, but the analysis code described in the README is written in R, using tidytext and mall.

How hard is r-ladies-rome-text-analysis to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is r-ladies-rome-text-analysis for?

Mainly data practitioners.


Verify against the repo before relying on details.