Analysis updated 2026-05-18
Automatically clean uploaded files in an S3-backed application to remove macros and scripts before storing them.
Run a local HTTP file-sanitizing sidecar next to a web app that accepts uploads, in any programming language.
Quarantine uploaded files that cannot be proven safe using a fail-closed AWS pipeline.
Sanitize PDF and Office uploads in CI or on-premises without requiring AWS credentials.
| | douglasmun/aws-cdr-gateway | adeliox/klein-head-swap | ats4321/ragit |
|---|---|---|---|
| Stars | 4 | 4 | 4 |
| Language | Python | Python | Python |
| Setup difficulty | hard | moderate | moderate |
| Complexity | 4/5 | 3/5 | 2/5 |
| Audience | ops devops | designer | developer |
Figures from each repo's GitHub metadata at analysis time.
Deploying the cloud pipeline requires live AWS credentials and either the SAM CLI or Terraform. The local FastAPI service runs without AWS and starts in minutes.
This tool automatically cleans uploaded files before your application trusts them. It strips potentially dangerous active content from Office documents, PDFs, and images, then routes the cleaned version to a safe storage location, or sends the original to a quarantine bucket if it cannot be made safe. The process is called Content Disarm and Reconstruction, or CDR.
The cloud version runs as a serverless AWS function. When someone uploads a file to an S3 storage bucket, the upload triggers this pipeline automatically. Word documents, Excel spreadsheets, PDFs, and common image formats each pass through a format-specific cleaning routine. For Office files, it removes macros, embedded scripts, external data connections, and other components that could execute code when someone opens the document. For PDFs, it strips JavaScript, auto-open actions, embedded files, and form actions. For images, it re-encodes the file from scratch to remove metadata that could carry malicious content. Files in the older .doc, .xls, and .ppt formats are sent straight to quarantine because their internal structure makes safe reconstruction too risky.
The tool follows a fail-closed principle: anything it cannot confirm as safe goes to quarantine. It never labels a file as sanitized unless the cleaning has been positively verified.
A second mode runs the same cleaning engine as a plain HTTP service on your own computer or server, with no AWS account required. You send it a file via a web request and it returns the cleaned version. This makes it useful as a sidecar next to any web application that accepts uploads, regardless of what programming language the main application uses.
The repository includes 227 tests, deployment guides for AWS using either SAM or Terraform, Docker and Kubernetes deployment instructions for the local service, and documentation comparing its coverage against other known file security tools.
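The fail-closed routing described above can be illustrated with a minimal sketch. This is not the repository's code: `route_upload`, `SanitizeResult`, and the `sanitize` callback are hypothetical names chosen for illustration, and the only behavior taken from the description is that legacy formats, failures, and unverified results all end up in quarantine.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Legacy binary Office formats that the description says go straight to quarantine.
LEGACY_EXTENSIONS = {".doc", ".xls", ".ppt"}


@dataclass
class SanitizeResult:
    """Hypothetical result type: cleaned bytes plus a positive verification flag."""
    data: bytes
    verified: bool


def route_upload(key, data, sanitize):
    """Return ("clean", cleaned_bytes) or ("quarantine", original_bytes).

    `sanitize` is an assumed callback taking (key, data) and returning a
    SanitizeResult. Every path that is not positively verified falls
    through to quarantine, which is the fail-closed part.
    """
    extension = PurePosixPath(key).suffix.lower()
    if extension in LEGACY_EXTENSIONS:
        return ("quarantine", data)
    try:
        result = sanitize(key, data)
    except Exception:
        # Any sanitizer error is treated as "cannot prove safe".
        return ("quarantine", data)
    if result is None or not result.verified:
        return ("quarantine", data)
    return ("clean", result.data)
```

The design point is that "clean" is reachable by exactly one path, so a new failure mode defaults to quarantine rather than to trusted storage.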
A serverless AWS pipeline that strips macros, scripts, and active content from Office files, PDFs, and images before they reach your application. Also runs as a local HTTP service with no cloud account needed.
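To make "macros in Office files" concrete: modern Office files (.docx, .xlsm, .pptm, and similar) are ZIP packages, and VBA macros are stored in a part named `vbaProject.bin`. The sketch below only detects such parts using the standard library. `find_macro_parts` is a hypothetical helper, not the repository's API, and real sanitization would also need to handle relationships, content types, and the other active content listed above.

```python
import io
import zipfile


def find_macro_parts(data: bytes) -> list[str]:
    """List package members that look like VBA projects.

    Assumption: matching on the `vbaProject.bin` file name is a simple
    heuristic for illustration, not a complete macro check.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as package:
            return [
                name
                for name in package.namelist()
                if name.lower().endswith("vbaproject.bin")
            ]
    except zipfile.BadZipFile:
        # Not a ZIP-based Office file; a fail-closed caller would quarantine it.
        raise ValueError("not a ZIP-based Office file") from None
```

A caller following the fail-closed principle would treat both a non-empty result and the `ValueError` as reasons not to mark the file as sanitized.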
Mainly Python. The stack also includes AWS Lambda and FastAPI.
Setup difficulty is rated hard, with roughly 1h+ to a first successful run.
Mainly ops and DevOps teams.
Verify against the repo before relying on details.