explaingit

saileaxh/iida-dfs

15 Python Audience · researcher Complexity · 3/5 Setup · moderate

TLDR

An IDA Pro plugin that uses machine learning to find matching functions between two binary files, letting you carry over function labels from an old version of a program to a newer one automatically.

Mindmap

mindmap
  root((iida-dfs))
    What it does
      Match functions across binaries
      Transfer function labels
      Proof of concept ML tool
    How it works
      Numeric function embeddings
      Clipboard data export
      Candidate match ranking
      CPU-only inference
    Tech Stack
      Python
      IDA Pro plugin
      ONNX Runtime
      NumPy
    Limitations
      1100 sample training set
      No active maintenance
      Proof of concept only

Things people build with this

USE CASE 1

Carry over function labels from an older binary to a newer version of the same software without re-analyzing everything by hand.

USE CASE 2

Find which functions in a patched binary correspond to functions you already analyzed in the original build.

USE CASE 3

Speed up reverse engineering of a new binary version when you have an annotated older version as a reference.

Tech stack

Python, IDA Pro, ONNX Runtime, NumPy

Getting it running

Difficulty · moderate Time to first run · 30 min

Requires IDA Pro (a paid tool). Install numpy and onnxruntime in IDA's Python environment, then copy the plugin file to the IDA plugins folder.
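
Before copying the plugin over, it can help to confirm that both packages are visible to the interpreter IDA actually uses. The snippet below is a minimal sketch, not part of iida-dfs: the helper name `missing_packages` is hypothetical, and it assumes you run it from IDA's Python console (or the same interpreter IDA is configured to use).

```python
import importlib.util

# The two packages named in the setup notes above.
REQUIRED = ("numpy", "onnxruntime")


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be imported here.

    Hypothetical helper for illustration; it only checks whether each
    top-level package can be located, it does not import or install it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("numpy and onnxruntime are both available.")
```

If the list is non-empty, the packages were probably installed into a different Python than the one IDA loads.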

No license information was mentioned in the explanation.

In plain English

This is a plugin for IDA Pro, a tool that security researchers and reverse engineers use to examine compiled programs (binaries) without access to their source code. The plugin's job is to help you find functions in one binary that match functions in another binary, even when the two binaries are different versions of the same software.

The README is written in Chinese and describes a machine-learning approach. A pre-trained model converts each function in a binary into a numeric representation, then compares those representations to find close matches across files. The typical scenario is: you have already analyzed an older version of a program and labeled its functions, and now you want to carry those labels over to a newer version automatically.

The workflow has two steps. In the first binary you export a function's data to the clipboard via the plugin menu. In the second binary you run a match operation using that clipboard data, set how many candidate matches to show, and wait for results.

The plugin runs entirely on CPU, peaking at around 1100 MB of memory on files with 40,000 functions. Installation means copying the plugin file into IDA's plugins folder and installing two Python packages (numpy and onnxruntime).

The README notes the model was trained on roughly 1,100 samples and is presented as a proof of concept rather than a production-grade tool. The author does not plan active maintenance.
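
The matching step described above (compare one function's numeric representation against every function in the second binary, then show the best few candidates) can be sketched with NumPy alone. This is an illustration under stated assumptions, not the plugin's code: the source does not say which similarity measure iida-dfs uses, so cosine similarity is assumed here, and `rank_candidates`, `query`, `candidates`, and `top_k` are hypothetical names.

```python
import numpy as np


def rank_candidates(query, candidates, top_k=5):
    """Return (row_index, score) pairs for the best-matching candidates.

    Assumption: similarity is cosine similarity between embedding vectors.
    `query` is one embedding of shape (d,); `candidates` has shape (n, d),
    one row per function in the second binary.
    """
    query = np.asarray(query, dtype=np.float32)
    candidates = np.asarray(candidates, dtype=np.float32)

    # Scale every vector to unit length so a dot product equals the cosine.
    eps = 1e-12  # guards against division by zero for all-zero vectors
    q = query / (np.linalg.norm(query) + eps)
    c = candidates / (np.linalg.norm(candidates, axis=1, keepdims=True) + eps)

    scores = c @ q  # one similarity score per candidate row

    # Sort by descending score and keep at most top_k entries.
    k = min(top_k, len(scores))
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, made up for the example.
    exported = [1.0, 0.0]
    second_binary = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    for index, score in rank_candidates(exported, second_binary, top_k=2):
        print(index, round(score, 3))
```

Here `top_k` plays the role of the "how many candidate matches to show" setting; a higher score means a closer match, but how far to trust a given score depends on the model, which the README presents as a proof of concept.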

Copy-paste prompts

Prompt 1
I installed iida-dfs in IDA Pro. Walk me through exporting a function from binary A to the clipboard and then running the match operation in binary B to find the corresponding function.
Prompt 2
The iida-dfs model was trained on 1100 samples. How would I retrain it on my own binary corpus to improve match accuracy for a specific software family?
Prompt 3
I'm reverse engineering two versions of the same Windows executable with IDA Pro and iida-dfs. How do I interpret the candidate match scores to decide which result to trust?
Prompt 4
Help me set up the iida-dfs plugin: where do I place the plugin file in the IDA Pro directory and how do I install numpy and onnxruntime in IDA's bundled Python?
Verify against the repo before relying on details.