Discover which cryptographic algorithm, such as SM3, a protected Android app function is secretly using.
Analyze obfuscated ARM64 native libraries from Android apps to recover their original logic.
Drive long-running reverse-engineering analysis sessions using an AI agent with limited context memory.
Audit the reasoning trail for algorithm identification via the hypothesis ledger and pipeline state.
Requires Python and the Triton symbolic execution library. Documentation is primarily in Chinese. Optional LLM API integration needs separate configuration. Designed for security researchers with reverse-engineering experience.
clark-utov is a tool for reverse-engineering algorithms that have been deliberately hidden inside Android apps. It targets native code libraries compiled for ARM64 processors and protected with the obfuscation schemes VMP (virtual-machine-based protection) and OLLVM (Obfuscator-LLVM), which scramble a program's instructions to make them very hard to read. The goal is to work out what a protected function actually does, for example to discover that a particular routine implements a specific cryptographic algorithm such as SM3.
The tool consumes an instruction trace, a recording of every low-level operation the target function performed during execution. That trace is fed through a multi-stage analysis pipeline labeled S1 through S5. Each stage narrows down the possibilities, and the results are stored in what the project calls a hypothesis ledger: an auditable log of conclusions about what the algorithm is and how it works, along with the evidence supporting each conclusion.
A significant part of the design is aimed at making the tool work well when driven by an AI language model agent, particularly one that can hold only a limited amount of information in context at once. Instead of requiring the agent to remember dozens of steps, clark-utov externalizes the tracking into the ledger and structured pipeline state, so the agent only needs to make one bounded decision at a time based on what the tool surfaces to it. The README describes this as a way to let narrow-context agents do long-running analytical work without losing track of where they are.
The pipeline also includes a symbolic execution component built on the Triton library, a blue-team review step, parity checks to catch incorrect conclusions, and an optional mode that calls a language model API to generate and test hypotheses. The documentation is written primarily in Chinese; the codebase is in Python.
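As a rough illustration of the ledger idea described above, the following minimal sketch scans a trace for the published SM3 constants and records a hypothesis with its evidence. It is not clark-utov's actual code or schema: the `Hypothesis` and `Ledger` classes, the `scan_for_sm3` function, the `"S1"` stage label as used here, and the trace format (a plain list of integer values observed during execution) are all assumptions made for this example. Only the SM3 initial value and round constants are taken from the SM3 specification.

```python
from dataclasses import dataclass, field

# Published SM3 constants: the eight-word initial value and the two round constants T_j.
SM3_IV = (0x7380166F, 0x4914B2B9, 0x172442D7, 0xDA8A0600,
          0xA96F30BC, 0x163138AA, 0xE38DEE4D, 0xB0FB0E4E)
SM3_T = (0x79CC4519, 0x7A879D8A)


@dataclass
class Hypothesis:
    """One ledger entry: a claim plus supporting evidence (hypothetical schema)."""
    claim: str
    stage: str                                   # assumed label, e.g. "S1"
    evidence: list = field(default_factory=list)
    status: str = "open"                         # "open", "supported", or "rejected"


@dataclass
class Ledger:
    """Append-only list of hypotheses (hypothetical, not the project's format)."""
    entries: list = field(default_factory=list)

    def record(self, hyp):
        self.entries.append(hyp)
        return hyp

    def next_decision(self):
        """Surface only the first open hypothesis: one bounded decision at a time."""
        for hyp in self.entries:
            if hyp.status == "open":
                return hyp
        return None


def scan_for_sm3(trace, ledger):
    """Record an SM3 hypothesis if known SM3 constants appear in the trace values."""
    known = set(SM3_IV) | set(SM3_T)
    hits = [(idx, val) for idx, val in enumerate(trace) if val in known]
    if not hits:
        return None
    hyp = Hypothesis(claim="function implements SM3", stage="S1")
    hyp.evidence = [f"trace[{i}] = {v:#010x}" for i, v in hits]
    return ledger.record(hyp)


if __name__ == "__main__":
    ledger = Ledger()
    # Assumed trace format: integer values seen in registers, in execution order.
    scan_for_sm3([0x10, 0x7380166F, 0x20, 0x79CC4519], ledger)
    pending = ledger.next_decision()
    print(pending.claim, pending.evidence)
```

Constant matching of this kind is only weak evidence, since an obfuscator can split or encode constants. The value of keeping the evidence alongside the claim is that a later review or parity check can reject the hypothesis without the agent needing to remember how it was reached.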
Verify against the repo before relying on details.