Test a trained image classifier to see how well it resists pixel-level adversarial attacks.
Apply adversarial training to a PyTorch model to make it harder to fool with crafted inputs.
Evaluate whether your model leaks private training data through its predictions using an inference attack.
Simulate a model extraction attack to check if your API exposes too much about the underlying model.
Requires a working ML framework such as TensorFlow or PyTorch plus specific optional dependencies for the attack or defense you want to use.
This is a Python library for testing and defending machine learning models against security threats. The core idea is that AI models can be fooled or manipulated in specific ways, and this toolkit gives researchers and developers the methods to both carry out those attacks (to find weaknesses) and build defenses against them.

The library covers four main categories of threat. Evasion is when an attacker subtly alters input data, like tweaking pixels in an image, to trick a model into making wrong predictions. Poisoning is when an attacker corrupts the training data before a model is trained, causing the resulting model to behave badly in specific situations. Extraction is when an attacker probes a model repeatedly to steal a copy of it without direct access to its weights. Inference is when an attacker uses a model's outputs to learn private information about the data it was trained on.

The toolbox works with most popular machine learning frameworks, including TensorFlow, PyTorch, Keras, and scikit-learn. It also supports different types of input data: images, audio, video, and tabular data. Common tasks like image classification, object detection, and speech recognition are all covered.

This project is hosted by the Linux Foundation AI and Data Foundation and was partially funded through a US Defense Advanced Research Projects Agency contract. It is open source under the MIT license and accepts contributions. There is a Slack workspace for community discussion.

This is a technical research tool intended for machine learning engineers and security researchers who want to audit or harden AI systems. If you are not working directly with machine learning models in code, this library is not something you would use directly.
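To make the evasion category more concrete, here is a minimal sketch of a one-step sign-gradient perturbation (in the style of the fast gradient sign method) against a tiny logistic-regression model. It uses only the Python standard library and does not call this library's API. The weights, bias, input values, and the function names `predict_proba` and `fgsm_perturb` are all hypothetical, chosen for illustration only.

```python
import math


def predict_proba(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def fgsm_perturb(weights, bias, x, label, eps):
    """Move each feature by eps in the direction that increases the loss.

    For binary cross-entropy on a logistic model, the gradient of the
    loss with respect to feature x_i is (p - label) * w_i.
    """
    p = predict_proba(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    signs = [(g > 0) - (g < 0) for g in grad]
    # Clip to [0, 1], treating features as normalized pixel intensities.
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, signs)]


# Hypothetical model parameters and a hypothetical 4-"pixel" input.
weights = [2.0, -1.5, 1.0, -0.5]
bias = -0.5
x = [0.6, 0.2, 0.5, 0.4]
label = 1  # true class of x

x_adv = fgsm_perturb(weights, bias, x, label, eps=0.2)

clean_p = predict_proba(weights, bias, x)
adv_p = predict_proba(weights, bias, x_adv)

print("clean prediction:", int(clean_p > 0.5))
print("adversarial prediction:", int(adv_p > 0.5))
print("largest per-feature change:", max(abs(a - b) for a, b in zip(x_adv, x)))
```

In this toy setup, no feature moves by more than `eps`, yet the predicted class changes. A real audit would run the toolbox's own attack implementations against a trained model rather than a hand-written gradient like this one.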
Verify against the repo before relying on details.