rion0709/agentshield

Analysis updated 2026-06-24

★ 1PythonAudience · developerComplexity · 3/5LicenseSetup · easy

Mindmap

mindmap
  root((agentshield))
    Inputs
      LLM prompts
      Tool call arguments
      User identifiers
    Outputs
      Blocked or sanitised calls
      Masked secrets
      Encrypted memory store
    Use Cases
      Block prompt injection
      Mask API keys in outputs
      Rate limit abusive users
      Encrypt local chat history
    Tech Stack
      Python
      scikit-learn
      TF-IDF
      AES-256
      PBKDF2
      Fernet

mindmap root((agentshield)) Inputs LLM prompts Tool call arguments User identifiers Outputs Blocked or sanitised calls Masked secrets Encrypted memory store Use Cases Block prompt injection Mask API keys in outputs Rate limit abusive users Encrypt local chat history Tech Stack Python scikit-learn TF-IDF AES-256 PBKDF2 Fernet

Click or tap to explore — scroll the page freely

What do people build with it?

USE CASE 1

Drop into an existing OpenAI Python app with two lines to block jailbreaks and homoglyph injection attacks.

USE CASE 2

Wrap a custom tool calling function with secure_agent to gate subprocess and eval calls before they execute.

USE CASE 3

Use the encrypted memory store to keep conversation history off disk in plain text for a desktop assistant.

USE CASE 4

Trial attack presets in the local browser dashboard to evaluate firewall coverage before a launch.

What is it built with?

Pythonscikit-learnTF-IDFFernetAES

How does it compare?

	rion0709/agentshield	a-bissell/unleash-lite	abhiinnovates/whatsapp-hr-assistant
Stars	1	1	1
Language	Python	Python	Python
Setup difficulty	easy	hard	hard
Complexity	3/5	4/5	3/5
Audience	developer	researcher	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy Time to first run · 30min

Just pip install plus a small setup script to set the security question, then two lines of code to enable.

Apache 2.0 license, free to use commercially with patent protection, as long as you keep the license notice and state any changes.

In plain English

AgentShield is a Python library that sits between an application and the AI model it talks to, watching the prompts and responses for signs of attack. The README describes it as a firewall for AI agents, meant to catch jailbreaks, prompt injections, and similar tricks before they reach the model, without forcing the developer to rewrite their existing code. The project lists several defense layers. There are pattern matchers that look for known jailbreak phrasings, base64 or hex evasion tricks, and zero-width characters. A homoglyph normalizer converts visually similar letters from other alphabets back to plain Latin, so an attacker cannot hide the word ignore by swapping in Greek or Cyrillic lookalikes. A small machine learning classifier, built from a TF-IDF vectorizer and a logistic regression model, is used to flag injection attempts it has not seen before. A time-based tracker watches request patterns per user to spot brute-force probing. There are also pieces that protect the host application itself. A tool-calling guard checks arguments before letting code call things like subprocess or eval. A data masking layer redacts API keys and other secrets in outgoing text. An encrypted local memory store, using AES-256 through Fernet, keeps saved conversations and credentials from sitting on disk in plain text, with the encryption key derived from a security question through PBKDF2. Installation is a pip install of the agentshield-firewall package. After running a small setup script to configure the security question, the developer adds two lines, an import and a call to agentshield.init, and the library monkey-patches the OpenAI client and outgoing HTTP requests so calls to AI endpoints are automatically inspected. A decorator named secure_agent is offered for wrapping specific functions instead. The README also describes a local browser dashboard for trying attack presets and several test scripts for the firewall, the auth layer, and the auto-protect hooks. The project is released under the Apache 2.0 license.

Copy-paste prompts

Prompt 1

Install agentshield-firewall, run the setup script, and add agentshield.init to my existing OpenAI Python app.

Prompt 2

Wrap my run_query function with the secure_agent decorator from AgentShield so subprocess calls are gated.

Prompt 3

Tune the TF-IDF plus logistic regression classifier in AgentShield by adding 20 of my own jailbreak attempts as training data.

Prompt 4

Configure the data masking layer in AgentShield to redact AWS access keys and Stripe secret keys in outbound responses.

Prompt 5

Use the AgentShield dashboard to run the homoglyph attack preset against my Claude integration and report any prompts that get through.

Frequently asked questions

What is agentshield?

Python firewall library for AI agents that monkey-patches the OpenAI client to inspect prompts and responses for jailbreaks, prompt injection, secret leaks, and unsafe tool calls.

What language is agentshield written in?

Mainly Python. The stack also includes Python, scikit-learn, TF-IDF.

What license does agentshield use?

Apache 2.0 license, free to use commercially with patent protection, as long as you keep the license notice and state any changes.

How hard is agentshield to set up?

Setup difficulty is rated easy, with roughly 30min to a first successful run.

Who is agentshield for?

Mainly developer.

Open on GitHub → Explain another repo

This repo across BitVibe Labs

Verify against the repo before relying on details.