explaingit

ajay-bhargava/watermark-remover

0 | Python | Audience · developer | Complexity · 3/5 | Active | License | Setup · moderate

TLDR

A Python CLI and FastAPI service that removes watermarks from images you own, using the Qwen-Image-Edit model running on a Modal H100 GPU.

Mindmap

mindmap
  root((watermark-remover))
    Inputs
      PNG JPEG WebP image
      Cleanup prompt
      Seed and steps
    Outputs
      Edited base64 image
      Model metadata
      Latency
    Use Cases
      Remove stamps you own
      Clean overlay text
      Batch image cleanup
    Tech Stack
      Python
      FastAPI
      Modal
      Diffusers
      Qwen-Image-Edit

Things people build with this

USE CASE 1

Self-host a serverless image watermark cleanup API on Modal

USE CASE 2

Remove stamped text from images you have permission to edit

USE CASE 3

Call a deployed cleanup endpoint from a browser image picker

USE CASE 4

Run a deterministic image edit pipeline with custom prompts and seeds

Tech stack

Python | FastAPI | Modal | Diffusers | HuggingFace

Getting it running

Difficulty · moderate | Time to first run · 30 min

Requires a Hugging Face token in a .env file and a Modal account, and the first GPU call takes 30 to 60 seconds while weights download.
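As a rough illustration of the token requirement, the sketch below reads a .env-style file and looks for the token under the two variable names the README is said to accept, HF_TOKEN and HUGGING_FACE_HUB_TOKEN. The helper names parse_env_file and find_hf_token are hypothetical and are not taken from the repository, which may load its configuration differently.

```python
# Variable names mentioned in the summary; the project's primary .env key is
# not stated there, so it is deliberately left out of this tuple.
TOKEN_NAMES = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")


def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_hf_token(env):
    """Return the first non-empty token stored under a known name, else None."""
    for name in TOKEN_NAMES:
        token = env.get(name)
        if token:
            return token
    return None
```

In practice the text would come from reading the .env file, for example `find_hf_token(parse_env_file(open(".env").read()))`, and a None result would mean the token still needs to be added before deploying.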

MIT license, so you can use, modify, and redistribute the code freely as long as you keep the copyright notice.

In plain English

This repository is a small Python tool for removing visible overlays such as watermarks and stamped text from images. The author frames it carefully as an authorized cleanup tool, meaning it is meant for images you own or have permission to edit, not for stripping watermarks off other people's work. The README repeats this several times, and the command-line interface even requires you to pass an --authorized flag before it will run a cleanup. The license is MIT.

Under the hood the image editing is done by a model from Alibaba called Qwen-Image-Edit-2511, accessed through the Hugging Face Diffusers library. The interesting part of the project is how it is deployed. Rather than asking the user to set up a GPU machine, the code is wrapped in a FastAPI service that runs on Modal, a serverless platform that gives the function an H100 GPU on demand. A configuration setting keeps one container warm so the model stays loaded between calls and responses come back quickly.

The project offers two ways to use it. The first is a command-line tool installed as watermark. It has subcommands to sync a Hugging Face token into a Modal secret, run the API in development mode, deploy it as a warm service, and call a deployed endpoint with either a local image path or an image URL. A separate browser-images subcommand looks at an open browser surface and points out candidate images for cleanup.

The second way is to call the deployed FastAPI service directly. The service exposes a health endpoint, a cleanup endpoint, and an edit endpoint that share the same handler. Requests pass the image as a base64 string along with a prompt, a seed, the number of inference steps, two guidance values, and an output format. The response returns the edited image as base64 plus model metadata and the latency.

A few configuration details are spelled out in the README. The project expects a .env file containing a Hugging Face API token, and the same token can also be named HF_TOKEN or HUGGING_FACE_HUB_TOKEN if that is more familiar. Modal authentication uses modal setup. On the first GPU call the Qwen weights are downloaded into a Modal volume so later calls reuse the cache. The README quotes a first-call latency of thirty to sixty seconds and follow-up latency of five to fifteen seconds.

Features listed include PNG, JPEG, and WebP input and output, deterministic results through a configurable seed, custom prompts beyond the default overlay cleanup wording, and the option to point at remote URLs as well as local files. The README closes with an ethical use note that restates the authorization requirement: do not use this to remove watermarks from copyrighted or third-party content without permission.
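To make the request and response shape described above more concrete, here is a minimal sketch of how a client might assemble the JSON body and unpack the reply. Every field name (image_base64, num_inference_steps, guidance_scale, true_cfg_scale, output_format) and both helper functions are assumptions made for illustration; the actual schema should be checked against the repository before use.

```python
import base64
import json

# Formats the summary lists as supported for input and output.
SUPPORTED_FORMATS = {"png", "jpeg", "webp"}


def build_cleanup_request(image_bytes, prompt, seed, steps,
                          guidance_scale, true_cfg_scale,
                          output_format="png"):
    """Build a JSON-serializable request body (field names are hypothetical)."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    return {
        # The image travels as a base64 string rather than raw bytes.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "prompt": prompt,
        "seed": seed,                      # fixed seed for repeatable results
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,  # the two guidance values; these
        "true_cfg_scale": true_cfg_scale,  # names are assumed, not confirmed
        "output_format": output_format,
    }


def decode_cleanup_response(body):
    """Split a response dict into image bytes and the remaining metadata."""
    image = base64.b64decode(body["image_base64"])
    metadata = {k: v for k, v in body.items() if k != "image_base64"}
    return image, metadata
```

A caller would serialize the dict with `json.dumps`, POST it to the deployed cleanup endpoint with whatever HTTP client is at hand, then pass the parsed JSON reply to `decode_cleanup_response` and write the returned bytes to disk.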

Copy-paste prompts

Prompt 1
Deploy watermark-remover to Modal step by step including the Hugging Face token setup
Prompt 2
Write a Python client that sends a local PNG to the deployed cleanup endpoint and saves the result
Prompt 3
Explain the --authorized flag and how to wire it into a batch script that processes a folder of images
Prompt 4
How do I change the default cleanup prompt to remove a specific kind of overlay like a date stamp
Prompt 5
Reduce the first-call latency on the Modal deployment by tuning the warm container config

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.