gy15901580825/argus

★ 16 · Python · Audience · ops, devops · Complexity · 3/5 · License · Apache 2.0 · Setup · moderate

Mindmap

mindmap
  root((Argus))
    What it does
      Tests AI apps for vulnerabilities
      Sends adversarial inputs
      Grades results with AI judge
    Attack types
      Prompt injection
      Jailbreaks
      Hidden content attacks
      Multi-turn exploits
    Standards covered
      OWASP LLM Top 10
      MITRE ATLAS
    Output formats
      SARIF for GitHub
      JUnit XML
      HTML reports


Things people build with this

USE CASE 1

Run a security scan against your AI agent to find prompt injection vulnerabilities before shipping to production.

USE CASE 2

Integrate Argus into your CI/CD pipeline via the bundled GitHub Action to automatically block builds when critical AI security issues are found.

USE CASE 3

Generate a SARIF report from an Argus scan and upload it directly to GitHub Code Scanning for visibility in pull requests.

USE CASE 4

Try out Argus against the bundled demo agent to learn AI security testing without risking your real system.
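The SARIF report mentioned in the use cases above is plain JSON, so it can be inspected before uploading. The following is a minimal sketch, assuming only the general SARIF 2.1.0 layout (`runs[].results[]`, each result with an optional `level`). The sample document, tool name, and rule IDs are invented for illustration and are not real Argus output.

```python
import json
from collections import Counter

# Hand-written SARIF 2.1.0 sample for illustration only.
# The tool name, rule IDs, and messages are hypothetical, not Argus output.
SAMPLE_SARIF = """
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "example-scanner"}},
      "results": [
        {"ruleId": "EXAMPLE-PI-001", "level": "error",
         "message": {"text": "Prompt injection succeeded"}},
        {"ruleId": "EXAMPLE-JB-002", "level": "warning",
         "message": {"text": "Partial jailbreak"}},
        {"ruleId": "EXAMPLE-HC-003",
         "message": {"text": "Hidden content probe flagged"}}
      ]
    }
  ]
}
"""


def summarize_sarif(text: str) -> Counter:
    """Count results per SARIF level across all runs in the document."""
    doc = json.loads(text)
    counts = Counter()
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            # Assumption: a result without "level" and without a rule-level
            # override is treated as "warning", the usual SARIF fallback.
            counts[result.get("level", "warning")] += 1
    return counts


if __name__ == "__main__":
    counts = summarize_sarif(SAMPLE_SARIF)
    for level in ("error", "warning", "note", "none"):
        print(f"{level}: {counts.get(level, 0)}")
```

To use it on a real report, read the file's contents and pass the text to `summarize_sarif`. Check the actual report for how the tool maps its severities onto SARIF levels.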

Tech stack

Python · gRPC · HTTP · GitHub Actions

Getting it running

Difficulty · moderate · Time to first run · 30 min

Requires writing a config file that describes your agent's address and authentication. A demo agent is bundled for testing without a live target.

Apache 2.0 license: use it freely for any purpose, including commercial use, and modify and distribute it, provided you include the license and notice files.

In plain English

Argus is a tool that tests AI-powered applications for security weaknesses by attacking them the way a real adversary would. You point it at any AI agent that accepts HTTP requests, gRPC calls, or controls a web browser, and it sends hundreds of specially crafted inputs designed to make the agent misbehave, reveal secrets, or bypass its intended restrictions.

The tests are grouped around well-known AI security frameworks, including the OWASP LLM Top 10 (a list of the most common ways AI language models fail) and MITRE ATLAS (a catalog of real-world AI attacks). Argus runs probes that attempt prompt injection, malicious instructions hidden inside documents the agent reads, jailbreaks that build up over multiple turns of conversation, and attacks that conceal content in invisible characters. There are over 160 individual probes in the bundled library.

After each test run, Argus grades the results using a second AI model that acts as a judge and evaluates whether each attack succeeded. Results come out in standard report formats that security and engineering teams already use: SARIF (which drops directly into GitHub Code Scanning), JUnit XML (which can block a build if critical issues are found), and HTML for human reading.

Setting up a scan requires writing a small configuration file that describes your agent's address and authentication method, then running the argus-probe command-line tool. A bundled GitHub Action makes it straightforward to run scans automatically whenever code changes. The repo also includes a deliberately insecure demo agent so you can try everything without connecting to a real system first.

Argus is designed for security testing, not for running inline as a production filter. It is released under the Apache 2.0 License.
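To make the "JUnit XML can block a build" idea concrete, here is a minimal sketch of a gate that reads a JUnit-style report and decides whether a CI step should fail. It assumes only the common JUnit XML convention that a failing `<testcase>` contains a `<failure>` or `<error>` child. The sample report, the suite and case names, and the `failed_cases` and `gate` helpers are hypothetical and do not come from Argus.

```python
import xml.etree.ElementTree as ET

# Hand-written JUnit-style sample for illustration only.
# Suite and test case names are hypothetical, not real Argus output.
SAMPLE_JUNIT = """<testsuites>
  <testsuite name="example-probes" tests="3" failures="1" errors="0">
    <testcase classname="example.prompt_injection" name="probe_1">
      <failure message="attack succeeded">judge verdict: vulnerable</failure>
    </testcase>
    <testcase classname="example.jailbreak" name="probe_2"/>
    <testcase classname="example.hidden_content" name="probe_3"/>
  </testsuite>
</testsuites>
"""


def failed_cases(xml_text: str) -> list[str]:
    """Return 'classname.name' for every test case with a failure or error."""
    root = ET.fromstring(xml_text)
    failed = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed.append(f'{case.get("classname")}.{case.get("name")}')
    return failed


def gate(xml_text: str) -> int:
    """Return 1 if any test case failed, otherwise 0 (a CI-style exit code)."""
    return 1 if failed_cases(xml_text) else 0


if __name__ == "__main__":
    for name in failed_cases(SAMPLE_JUNIT):
        print(f"FAILED: {name}")
    code = gate(SAMPLE_JUNIT)
    print(f"exit code a CI step would return: {code}")
    # In a real pipeline step, finish with: raise SystemExit(code)
```

Most CI systems already interpret JUnit XML natively, so a script like this is only needed when you want custom rules, for example failing only on certain probe categories.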

Copy-paste prompts

Prompt 1

Using Argus, how do I write a configuration file and run a security scan against my AI agent that accepts HTTP requests? Show me the argus-probe command.

Prompt 2

How do I set up the Argus GitHub Action so it automatically runs AI security tests every time I push code changes?

Prompt 3

How do I interpret an Argus HTML report to understand which attacks succeeded against my AI agent and how severe they are?

Prompt 4

How do I run Argus against its bundled demo agent to try out prompt injection and jailbreak probes without connecting to a real system?



Verify against the repo before relying on details.