explaingit

trusted-ai/adversarial-robustness-toolbox

5,983 | Python | Audience · researcher | Complexity · 4/5 | License · MIT | Setup · moderate

TLDR

A Python library for testing and defending machine learning models against security attacks, covering evasion, poisoning, model theft, and privacy leakage, used by ML engineers and security researchers to audit and harden AI systems.

Mindmap

mindmap
  root((ART))
    What it does
      Test AI for weaknesses
      Build AI defenses
    Attack types
      Evasion attacks
      Poisoning attacks
      Model extraction
      Inference attacks
    Supported frameworks
      TensorFlow
      PyTorch
      Keras
      scikit-learn
    Data types
      Images
      Audio
      Tabular data
    Audience
      ML engineers
      Security researchers


Things people build with this

USE CASE 1

Test a trained image classifier to see how well it resists pixel-level adversarial attacks.

USE CASE 2

Apply adversarial training to a PyTorch model to make it harder to fool with crafted inputs.

USE CASE 3

Evaluate whether your model leaks private training data through its predictions using an inference attack.

USE CASE 4

Simulate a model extraction attack to check if your API exposes too much about the underlying model.
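The extraction scenario in use case 4 can be illustrated without any ML framework. The sketch below is a toy in plain Python and does not use ART's API: the "victim" is a hypothetical linear classifier with invented weights, `black_box_predict` stands in for a label-only prediction API, and the attacker fits a logistic-regression surrogate on the labels it collects. The share of fresh inputs on which both models agree (often called fidelity) gives a rough sense of how much the API reveals.

```python
import math
import random

random.seed(0)

# Hypothetical victim model: weights are invented for this demo and are
# never read by the attacker code below.
_SECRET_W, _SECRET_B = (2.0, -1.0), 0.5


def black_box_predict(x):
    """Label-only API: returns 0 or 1, never the weights."""
    score = sum(w * xi for w, xi in zip(_SECRET_W, x)) + _SECRET_B
    return 1 if score > 0 else 0


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_surrogate(xs, ys, epochs=50, lr=0.1):
    """Train a logistic-regression copy with plain SGD."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def random_point():
    return [random.uniform(-1, 1), random.uniform(-1, 1)]


# Attacker: query the API, keep the answers, train a copy on them.
queries = [random_point() for _ in range(500)]
stolen_labels = [black_box_predict(x) for x in queries]
w, b = fit_surrogate(queries, stolen_labels)


def surrogate_predict(x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# Fidelity: agreement between the copy and the victim on unseen inputs.
fresh = [random_point() for _ in range(500)]
fidelity = sum(surrogate_predict(x) == black_box_predict(x) for x in fresh) / len(fresh)
print(f"surrogate agrees with the black box on {fidelity:.1%} of fresh inputs")
```

In a real audit the victim would be your deployed model and the query budget would be a key variable; fewer allowed queries generally means a less faithful copy.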

Tech stack

Python, TensorFlow, PyTorch, Keras, scikit-learn

Getting it running

Difficulty · moderate | Time to first run · 30 min

Requires a working ML framework such as TensorFlow or PyTorch plus specific optional dependencies for the attack or defense you want to use.
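Before starting, it can help to see which of these pieces are already present in your environment. The check below is a small sketch using only the standard library. The import names are assumptions about a typical install (`art` for the adversarial-robustness-toolbox package, plus the usual import names for each framework), and it only reports whether a module can be found, not whether its version is compatible.

```python
import importlib.util

# Display name -> import name. These import names are assumptions about a
# typical setup; adjust them if your environment differs.
CANDIDATES = {
    "ART": "art",
    "TensorFlow": "tensorflow",
    "PyTorch": "torch",
    "Keras": "keras",
    "scikit-learn": "sklearn",
}


def check_available(candidates):
    """Return {display name: bool} by locating each module without importing it."""
    return {
        label: importlib.util.find_spec(module) is not None
        for label, module in candidates.items()
    }


availability = check_available(CANDIDATES)
for label, found in availability.items():
    print(f"{label:<14}{'installed' if found else 'missing'}")
```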

Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

This is a Python library for testing and defending machine learning models against security threats. The core idea is that AI models can be fooled or manipulated in specific ways, and this toolkit gives researchers and developers the methods to both carry out those attacks (to find weaknesses) and build defenses against them.

The library covers four main categories of threat. Evasion is when an attacker subtly alters input data, like tweaking pixels in an image, to trick a model into making wrong predictions. Poisoning is when an attacker corrupts the training data before a model is trained, causing the resulting model to behave badly in specific situations. Extraction is when an attacker probes a model repeatedly to steal a copy of it without direct access to its weights. Inference is when an attacker uses a model's outputs to learn private information about the data it was trained on.

The toolbox works with most popular machine learning frameworks including TensorFlow, PyTorch, Keras, and scikit-learn. It also supports different types of input data: images, audio, video, and tabular data. Common tasks like image classification, object detection, and speech recognition are all covered.

This project is hosted by the Linux Foundation AI and Data Foundation and was partially funded through a US Defense Advanced Research Projects Agency contract. It is open source under the MIT license and accepts contributions. There is a Slack workspace for community discussion.

This is a technical research tool intended for machine learning engineers and security researchers who want to audit or harden AI systems. If you are not working directly with machine learning models in code, this library is not something you would use directly.
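To make the evasion category concrete, here is a minimal sketch of the fast gradient sign method (FGSM) against a tiny logistic-regression classifier. It is written in plain Python and does not use ART's API. The data, model, and `eps` value are invented for illustration, and `eps` is deliberately large because the toy inputs have only two features, whereas image attacks spread a much smaller change over many pixels.

```python
import math
import random

random.seed(0)


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train(xs, ys, epochs=50, lr=0.1):
    """Fit logistic regression with plain SGD."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = predict_proba(w, b, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, x, y, eps):
    """One-step FGSM: move each feature by eps in the loss-increasing direction.

    The cross-entropy gradient w.r.t. the input is (p - y) * w. Since
    0 < p < 1, (p - y) is positive for y = 0 and negative for y = 1, so
    its sign is taken from the label directly.
    """
    label_sign = 1 if y == 0 else -1
    return [xi + eps * label_sign * sign(wi) for xi, wi in zip(x, w)]


def make_blobs(n):
    """Two Gaussian clusters: class 0 around (-1, -1), class 1 around (1, 1)."""
    xs, ys = [], []
    for _ in range(n):
        y = random.randint(0, 1)
        center = 1.0 if y == 1 else -1.0
        xs.append([random.gauss(center, 0.5), random.gauss(center, 0.5)])
        ys.append(y)
    return xs, ys


def accuracy(w, b, xs, ys):
    hits = sum((predict_proba(w, b, x) > 0.5) == (y == 1) for x, y in zip(xs, ys))
    return hits / len(xs)


train_x, train_y = make_blobs(200)
test_x, test_y = make_blobs(200)
w, b = train(train_x, train_y)

clean_acc = accuracy(w, b, test_x, test_y)
adv_x = [fgsm(w, x, y, eps=1.0) for x, y in zip(test_x, test_y)]
adv_acc = accuracy(w, b, adv_x, test_y)
print(f"clean accuracy: {clean_acc:.1%}, accuracy under FGSM: {adv_acc:.1%}")
```

The gap between the two printed numbers is the kind of measurement a robustness audit reports: same model, same labels, only the inputs have been nudged.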

Copy-paste prompts

Prompt 1
Using the Adversarial Robustness Toolbox in Python, show me how to generate an FGSM evasion attack against a PyTorch image classifier and measure how accuracy drops.
Prompt 2
Using ART, apply adversarial training to a scikit-learn classifier to make it more robust against evasion attacks.
Prompt 3
Show me how to use adversarial-robustness-toolbox to run a membership inference attack on a Keras model and interpret the output.
Prompt 4
Using ART model extraction tools in Python, simulate an attack that copies a black-box model by querying it repeatedly.
Prompt 5
Set up ART to run an evasion attack against an object detection model and compare results with and without a defense applied.

Verify against the repo before relying on details.