droidsaw

Name: droidsaw
Author: droidsaw

[Image: scientist and worker dissecting an Android robot with a chainsaw]

Art by pmjv_prahou. A gift to the last mage of the Open clan.

droidsaw takes a DEX file or Hermes bundle apart and puts it back together byte-for-byte: 5,767 DEX files from F-Droid recovered bit-identically under preservation mode, and Hermes bytecode round-trip verified on v84, v96, v98, and v99. A round-trip test fails loudly when the format model has a hole. If a string-table offset is wrong by one, an alignment requirement is missed, or a padding byte is forgotten, the re-emitted bytes diverge and the test says where. Most Android RE tooling works per-layer; droidsaw traces a JS value through the React Native bridge into Java as a single taint path. Hand it an APK and it unpacks the container, decompiles every layer, and can pipe the output through Semgrep and TruffleHog in the same pass (audit --mode=full). CLI and MCP server share one command surface. Pure Rust, BSD-3-Clause.

Install

cargo install droidsaw                 # the CLI
cargo install droidsaw --features mcp  # also installs the droidsaw-mcp server

cargo fetches the droidsaw-* library crates from crates.io and compiles everything locally. The
only prerequisites are a Rust toolchain (rustup) and a C compiler (cc/clang).
semgrep and trufflehog are optional (for audit --mode=full, --mode=semgrep, or --mode=trufflehog); YARA is built in. No
Java/Android SDK is needed.

Installing from automation or CI? Use cargo install --locked droidsaw — it builds against the
dependency versions pinned in the publish-time lockfile, so the same release installs reproducibly.

For fleet or CI installs, prefer the cargo auditable
variant — it embeds the full crate dependency list inside the binary itself (a compressed JSON
record in a dedicated linker section), so the artifact stays auditable long after the build
environment is gone:

cargo install cargo-auditable cargo-audit   # one-time tooling
cargo auditable install --locked droidsaw   # binary carries its own dependency inventory
cargo audit bin "$(which droidsaw)"         # audit the installed artifact, no manifest needed (quoted so a path with spaces survives)

cargo audit bin reads the embedded inventory and checks it against the RustSec advisory
database — the binary answers "what exactly is in you?" years later, with nothing but the file.

Working on droidsaw itself? Check out the droidsaw repo and the five droidsaw-* library repos as
siblings, then build droidsaw/ — its dev-loop [patch.crates-io] wires in the local ../droidsaw-*
crates instead of crates.io.

Quickstart

droidsaw info app.apk                                 # layer summary: bytecode + manifest + signing
droidsaw audit app.apk --mode=basic                   # fast hermetic audit (no subprocesses)
droidsaw audit app.apk --mode=basic --format sarif    # SARIF 2.1.0 for code scanning
droidsaw manifest app.apk | jq '.manifest.exported_components'
droidsaw decompile app.apk com.example.Foo            # DEX class → Java
droidsaw decompile app.apk 42                         # Hermes function index → JS
droidsaw xrefs app.apk --search 'api_secret|Bearer'   # who references this string?

Every command that emits JSON composes with jq (the plain-text exemptions are listed under Inputs & output); worked per-audience flows are in Playbooks below.

Inputs & output

Inputs: APK, XAPK, .hbc, .dex. Hermes bundles and DEX files are extracted from an APK automatically; co-located split_config.*.apk siblings are auto-merged (--no-auto-splits to disable).

Output: stdout is one JSON object, array, or NDJSON stream, and nothing else. The documented plain-text exemptions are --version and --help (clap convention), hbc disassemble (instruction stream for differential tooling), scan trufflehog (raw string feed), and decompile --all --js (concatenated JS). Progress goes to stderr, with each line prefixed "droidsaw: ".

Exit codes: 0 success · 1 reserved for the opt-in audit --fail-on=<severity> gate (the audit completed, stdout carries the normal audit output, and at least one emitted finding is at or above the threshold — pure exit-code CI gating, no jq required) · 2 failure. Every failure — including a mistyped flag — is a typed JSON error envelope on stdout:

{
  "error": {
    "code": "USER_INPUT | PERMISSION | TRANSIENT | CONFIGURATION | INTERNAL",
    "operation": "audit",
    "message": "no such file: app.apk",
    "hint": "verify the path points to a readable APK/DEX/HBC file"
  }
}

Repeated runs on the same input produce bit-identical output.
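
The exit-code contract above is small enough to consume without parsing findings. The following is a minimal sketch of how a CI wrapper might act on it; the `Action` enum, the `next_action` name, and the retry-on-TRANSIENT policy are illustrative assumptions, not part of droidsaw.

```rust
/// What a hypothetical CI wrapper does after one droidsaw run.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Pass,       // exit 0: the run succeeded
    BlockMerge, // exit 1: the --fail-on gate tripped
    Retry,      // exit 2 with a TRANSIENT error envelope (assumed policy)
    Abort,      // any other failure
}

/// `exit_code` is the process exit status; `error_code` is the
/// `error.code` field of the JSON envelope, if one was printed.
pub fn next_action(exit_code: i32, error_code: Option<&str>) -> Action {
    match (exit_code, error_code) {
        (0, _) => Action::Pass,
        (1, _) => Action::BlockMerge,
        (_, Some("TRANSIENT")) => Action::Retry,
        _ => Action::Abort,
    }
}
```

Because exit 1 still carries the normal audit output on stdout, a wrapper like this can block the merge and archive the findings in the same step.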

What it finds

Patterns surfaced across a corpus of production apps spanning dating, social, fintech, banking, and health:

Key material in the JS heap — private-key operations running in Hermes with no native boundary, no memory zeroing. Visible because droidsaw reads the Hermes layer.
Signing-chain weaknesses at the cryptographic level — ROCA fingerprint (CVE-2017-15361), Fermat-factorable close primes (returns (p, q)), batch-GCD shared-prime recovery across a corpus (Bernstein quasilinear), Wiener-regime exponent fingerprint (e > N^0.75; full recovery).
Data crossing the React Native bridge into sinks it shouldn't reach — JS network input landing in Runtime.exec(), analytics SDKs receiving health or financial signals. Reported as a single taint path from JS source to Java sink.
Exported components without permission guards — activities and services reachable from any app, leaking auth tokens, OAuth codes, refresh tokens.
Staging infrastructure surviving release — internal endpoints, debug flags, hardcoded credentials across the manifest, DEX string pool, and Hermes global string table.
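
To make the close-primes item concrete: Fermat's method searches for a and b with N = a^2 - b^2, which succeeds quickly when p and q are near sqrt(N). The sketch below works on toy 64-bit moduli only; real RSA moduli need a big-integer type, and the function names and step budget are assumptions rather than droidsaw's code.

```rust
/// Integer square root by Newton iteration (floor of sqrt(n)).
fn isqrt(n: u128) -> u128 {
    if n < 2 {
        return n;
    }
    let mut x = n;
    let mut y = (x + 1) / 2;
    while y < x {
        x = y;
        y = (x + n / x) / 2;
    }
    x
}

/// Tries at most `max_steps` values of `a`, starting at ceil(sqrt(n)).
/// Returns Some((p, q)) with p <= q and p * q == n when the factors are close.
pub fn fermat_factor(n: u64, max_steps: u32) -> Option<(u64, u64)> {
    if n < 3 || n % 2 == 0 {
        return None; // Fermat's method applies to odd n
    }
    let n = n as u128;
    let mut a = isqrt(n);
    if a * a < n {
        a += 1;
    }
    for _ in 0..max_steps {
        let b2 = a * a - n;
        let b = isqrt(b2);
        if b * b == b2 {
            let (p, q) = (a - b, a + b);
            // p == 1 means only the trivial split 1 * n exists: n is prime.
            return if p > 1 { Some((p as u64, q as u64)) } else { None };
        }
        a += 1;
    }
    None
}
```

A small step budget is the point of the check: a well-generated key has primes far enough apart that the search gives up, while a weak one falls in the first few steps.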

Cross-referencing is for understanding what an app does: trace a suspicious string to every function that references it, then decompile those functions to map the dispatch surface. A worked example on a stalkerware sample: Cerberus Anti-theft: Stalkerware RE.

Commands

Command	Output
`audit`	Security audit (`--mode=<basic|full|semgrep|trufflehog>`)
`decompile`	Decompile DEX classes to Java or Hermes functions to JavaScript
`xrefs`	Cross-reference a string to every function that references it (DEX + HBC). "Who references this key?" is a query, not a grep.
`manifest`	AndroidManifest.xml analysis
`signing`	v1 / v2 / v3 / v4 signing block analysis
`strings`	Search strings across all layers (`--layer dex|hbc|native`, `--search <re>`)
`frida`	Generate Frida hook stubs for functions matching a string pattern
`diff`	Structural diff of two Hermes bundles (accepts APK or `.hbc`)
`deobf-strings`	Recover obfuscated strings by emulating a DEX method over argument sets
`info`	Container + bytecode layer summary: layers + manifest + signing
`inspect`	Container internals: `entries` (ZIP + anomalies), `elf`, `resources`, `webview`
`hbc`	Hermes subcommands (`info`, `functions`, `strings`, `decompile`, `disassemble`)
`dex`	DEX subcommands (`classes`, `methods`, `strings`)

Cross-layer taint paths (JS → bridge → Java → sink) are produced by audit, not a standalone command — see Cross-layer taint.

Further commands are grouped under umbrellas: scan <yara|sbom|trufflehog|semgrep|export>, inspect <entries|elf|resources|webview>, corpus <ingest|scan>, triage promote. Global flags cap resources per parse: --budget-mem, --budget-time, --single-thread (deterministic), --permissive-recovery (tolerant AXML). Run droidsaw --help or droidsaw <cmd> --help for the full surface.

Playbooks

Concrete flows for the four people droidsaw is built for. All commands are verified against the binary; substitute your own APK path.

App developer — audit your build, gate CI

droidsaw info app-release.apk                                  # what's in the build
droidsaw audit app-release.apk --mode=basic                    # fast hermetic audit (~10-30s, no subprocesses)
droidsaw audit app-release.apk --mode=basic --fail-on=high     # CI gate: exit 1 if any High+ finding (no jq)
droidsaw audit app-release.apk --mode=basic --format sarif > droidsaw.sarif   # GitHub code scanning
droidsaw scan sbom app-release.apk > sbom.json                 # CycloneDX 1.5 SBOM

Bug-bounty hunter — recon → decompile → trace

droidsaw info target.apk
droidsaw manifest target.apk | jq '.manifest.exported_components'   # attack surface
droidsaw signing target.apk                                        # weak keys / cert posture
droidsaw decompile target.apk com.example.AuthManager              # pull a class to source
# Cross-reference a secret-looking string to every function that references it (DEX + HBC).
# Output: {xrefs:[{layer, string, functions:["name(#id)", ...]}]}; then decompile a referencing function.
droidsaw xrefs target.apk --search 'authorization|Bearer|api_secret'
droidsaw audit target.apk --mode=basic | jq '[.findings[] | select(.id | test("TAINT"))]'   # JS→bridge→Java taint
droidsaw frida target.apk --search 'password|token|decrypt'        # Frida hook stubs

Product Security team — fleet scan & triage

droidsaw corpus ingest /path/to/apks/ --output corpus.db --tag 2026-q2          # signing/metadata DB (idempotent)
droidsaw corpus scan   /path/to/apks/ --min-severity high > fleet.ndjson        # batch NDJSON for a SIEM/store
DROIDSAW_SEMGREP_RULES=./rules/android.yml droidsaw audit app.apk --mode=full   # deep audit, your Semgrep rules
droidsaw audit app.apk --mode=basic --format sarif > app.sarif                  # SARIF into code scanning
sqlite3 corpus.db "SELECT rsa_modulus_hex, COUNT(*) c FROM signers GROUP BY rsa_modulus_hex HAVING c > 1"  # reused-modulus hunt: the same RSA key under more than one signer
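
A GROUP BY on the modulus only surfaces identical keys. Two different moduli that share a single prime need a GCD, which is what the batch-GCD check in the signing analysis is for. The sketch below is the naive pairwise version on toy 64-bit moduli, shown only to illustrate the idea; Bernstein's batch GCD replaces the double loop with product and remainder trees, and `shared_primes` is a hypothetical name.

```rust
/// Euclid's algorithm.
fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let t = a % b;
        a = b;
        b = t;
    }
    a
}

/// Returns (i, j, prime) for every pair of distinct moduli sharing a factor.
/// Identical moduli are skipped: their GCD is the modulus itself, which
/// reveals key reuse but not a factor.
pub fn shared_primes(moduli: &[u64]) -> Vec<(usize, usize, u64)> {
    let mut hits = Vec::new();
    for i in 0..moduli.len() {
        for j in (i + 1)..moduli.len() {
            let g = gcd(moduli[i], moduli[j]);
            if g > 1 && g < moduli[i] && g < moduli[j] {
                hits.push((i, j, g));
            }
        }
    }
    hits
}
```

With 10403 = 101 * 103 and 10807 = 101 * 107, the pair shares the prime 101, and dividing each modulus by it yields the other factor.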

Human-rights / at-risk-user defender — examine a suspected spyware sample safely

Static-only — droidsaw never runs the sample and never touches the network. It parses and decompiles in memory; that is exactly the property you want for a sample that may target a person. (Still handle the APK on an isolated host per your operational-security practice.)

droidsaw info suspect.apk
droidsaw manifest suspect.apk | jq '{pkg:.manifest.package, perms:.manifest.permissions, exported:.manifest.exported_components}'
droidsaw scan yara suspect.apk                                     # hooking frameworks, root/debugger evasion
droidsaw audit suspect.apk --mode=basic --stix-feed known-iocs.json   # match a local STIX 2.1 IOC feed
droidsaw strings suspect.apk --search 'location|microphone|camera|sms|contacts|keylog'
droidsaw xrefs   suspect.apk --search 'android.location|telephony.SmsManager|startForeground'

Look at: exported Services/Receivers without a permission guard (a remote-control surface); high-risk permission clusters (ACCESS_FINE_LOCATION, READ_SMS, RECORD_AUDIO, READ_CONTACTS, BIND_DEVICE_ADMIN); anti-analysis / hooking matches that signal evasion.

Cross-layer taint

audit --mode=basic and --mode=full run three taint passes. Results land in the taint_flows SQLite table.

HBC pass. User-controlled inputs seeded and propagated through Hermes functions. Detects DirectEval (CWE-95) and tainted args to NativeModule Call* ops. Records which arg positions carried taint.
DEX pass. Follows invoke-direct, invoke-static, and monomorphic invoke-virtual / invoke-interface across DEX boundaries via a cross-DEX class hierarchy index (CHA). Interprocedural depth: 4.
Bridge pass. @ReactMethod params seeded as taint sources, then run through the DEX pass. Seeding is restricted to the arg positions the HBC pass found tainted — only the JS-side values that actually carry taint become Java-side seeds. Falls back to all params when no HBC bundle is present.
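
The seeding rule in the bridge pass can be sketched as a small set operation. This is an illustration under stated assumptions, not the droidsaw-common API: `bridge_seeds` is a hypothetical name, and dropping out-of-range positions is a choice made for the sketch.

```rust
use std::collections::BTreeSet;

/// Which parameter positions of one @ReactMethod become Java-side taint seeds.
///
/// `hbc_tainted` is Some(positions) when the HBC pass ran and recorded the
/// argument positions that carried taint, and None when no HBC bundle exists.
pub fn bridge_seeds(
    param_count: usize,
    hbc_tainted: Option<&BTreeSet<usize>>,
) -> BTreeSet<usize> {
    match hbc_tainted {
        // Restrict seeding to positions the JS side actually tainted.
        Some(positions) => positions
            .iter()
            .copied()
            .filter(|&p| p < param_count)
            .collect(),
        // No HBC bundle: fall back to seeding every parameter.
        None => (0..param_count).collect(),
    }
}
```

Note the difference between `None` and an empty set: an HBC pass that found no tainted arguments seeds nothing, while a missing bundle seeds everything.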

15 taint sources × 15 sinks defined in droidsaw-common, including bridge edges that span the layers — sources like JS network input, React Native bridge params, and IntentExtra; sinks like Runtime.exec(), WebView.loadUrl, SQL execution, and file writes. A tainted value crossing into a JNI-native method is flagged (JNI_TAINTED_NATIVE_CALL) but not traced into native code — a documented stop, not a gap.

Per-layer scope

Layer	Status
APK container	Signing v1–v4, crypto (ROCA / Fermat / Wiener / batch-GCD), YARA (credential / packer / crypto / anti-analysis), SBOM, ELF metadata.
DEX → Java	Decompiler. Byte-identical roundtrip on F-Droid corpus under preservation mode (default emit differs on a 5.4% legacy-`dx` subset — see Correctness).
Hermes → JS	Decompiler. Parses v40–v100; byte-exact roundtrip verified on v84/v96/v98/v99. OXC round-trip validated on decompile output.
Native ELF	Hardening flags, JNI exports, relocation counts. No disassembly.
Dart AOT	Not supported.
IL2CPP	Not supported.

MCP server

droidsaw exposes its analysis surface over MCP. An agent can load an APK, run a cross-layer audit, query findings, decompile a class, and diff two Hermes bundles — all in one session, one schema.

The server is a second binary, droidsaw-mcp (cargo install droidsaw --features mcp). Transport is stdio only — an MCP client such as Claude Code spawns it as a child process over stdin/stdout; there is no network listener. Wire it in via .mcp.json:

{
  "mcpServers": {
    "droidsaw": {
      "type": "stdio",
      "command": "droidsaw-mcp",
      "args": ["--allowed-tool-classes=all", "--tool-tier=full"],
      "env": { "DROIDSAW_MCP_ROOT": "/path/to/your/analysis/root" }
    }
  }
}

Two dials plus a sandbox set the security posture: DROIDSAW_MCP_ROOT confines every path the server reads or writes; --allowed-tool-classes (default read-only,writes-tempfile) gates what it may do; --tool-tier (basic = 12-tool core workflow, full (default) = every tool) gates what the model sees. The class gate is per-tool: audit is classed spawns-subprocess even in basic mode, and triage needs manages-state. There is no built-in authentication, and large outputs stream to a tempfile rather than the context window. The args above (--allowed-tool-classes=all, --tool-tier=full) are a full-trust local setup; on a shared or untrusted host, drop them to keep the safe default classes and add --tool-tier=basic.

A session goes load(path=…) (required first) → audit(mode='basic') → query(sql='SELECT …') → investigate(rowid=…) → decompile(…) → triage(…). Every tool errors until load has been called.

Integrations

Tool	Surface
SQLite	Each writer produces its own schema: `scan export` → parsed-layer tables (`strings`, `functions`, `classes`, `edges`, `strings_fts`); `corpus ingest` → `apks` + `signers`; `audit --format unsigned-evidence` → the `findings` + `taint_flows` schema (rev 6). The `query` MCP tool runs read-only `SELECT` against the audit findings DB.
Semgrep	`audit --mode=semgrep` / `scan semgrep --persist`. Decompiled DEX source extracted to disk and fed to Semgrep. No rules ship bundled — supply your own via `--rules <path>` or `DROIDSAW_SEMGREP_RULES`.
TruffleHog	`audit --mode=trufflehog` / `scan trufflehog`. Extracted strings from every layer piped to TruffleHog. All hits land in the `credentials` view, each carrying TruffleHog's own `verified` flag.
YARA-X	`scan yara` / bundled in `audit`. YARA-X (Rust port). Bundled rule packs; custom rules accepted. Provenance-aware suppression.
STIX 2.1	`audit --stix-feed <path>`. Loads any STIX 2.1 bundle (file path; no network I/O). IOC matches against parsed APK content.
Frida	`frida` subcommand. Auto-generated hook stubs against functions that touch matched strings.

Architecture

Both decompilers follow the same pipeline. The middle stages live in droidsaw-common and are generic over an instruction type I: Instr; each bundle crate supplies its own Insn type and language-specific sugar.

Stage	Module	Input → Output
decode	`<bundle>/decode.rs`, `<bundle>/parser/`	`&[u8]` → `Vec<Insn>`
CFG	`<bundle>/cfg.rs`, oracle in `common/graph/`	`Vec<Insn>` → basic blocks + edges
dominators	`common/graph/dominators.rs`	basic blocks → idom map
SSA (Braun)	`common/ssa/`, `<bundle>/ssa.rs`	basic blocks → `SsaFunction`
Expr IR	`<bundle>/expr.rs` (Hermes), `common/region/`	`SsaFunction` → expression tree
structure	`common/region/`, `<bundle>/structure.rs`	expression tree → `RegionTree`
sugar	`<bundle>/sugar.rs`, `hermes/decompile/`	`RegionTree` → `RegionTree`
emit	`dex/emit_dex.rs`, `hermes/emit.rs`	`RegionTree` → source bytes
validate	`tests/byte_identity_smoke.rs`, `tests/hbc_corpus_roundtrip.rs`	source bytes ≡ input (round-trip)

Deterministic IR — BTreeMap throughout, so output is stable across runs. Typed opcode enum. Cross-validated against reference disassemblers — DEX against dexdump, Hermes against hbcdump.
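
As an illustration of the dominators stage (basic blocks in, idom map out), here is a minimal sketch using the Cooper-Harvey-Kennedy iterative scheme over BTreeMap, so iteration order is stable. The source does not say which algorithm common/graph/dominators.rs uses; the algorithm choice, the `Cfg` alias, and the function names are assumptions.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Block id -> successor block ids.
type Cfg = BTreeMap<u32, Vec<u32>>;

fn reverse_postorder(cfg: &Cfg, entry: u32) -> Vec<u32> {
    fn visit(n: u32, cfg: &Cfg, seen: &mut BTreeSet<u32>, post: &mut Vec<u32>) {
        if !seen.insert(n) {
            return;
        }
        if let Some(succs) = cfg.get(&n) {
            for &s in succs {
                visit(s, cfg, seen, post);
            }
        }
        post.push(n);
    }
    let (mut seen, mut post) = (BTreeSet::new(), Vec::new());
    visit(entry, cfg, &mut seen, &mut post);
    post.reverse();
    post
}

/// Walks both candidates up the partial idom tree until they meet.
fn intersect(idom: &BTreeMap<u32, u32>, rank: &BTreeMap<u32, usize>, mut a: u32, mut b: u32) -> u32 {
    while a != b {
        while rank[&a] > rank[&b] {
            a = idom[&a];
        }
        while rank[&b] > rank[&a] {
            b = idom[&b];
        }
    }
    a
}

/// Immediate dominator of every reachable block; the entry maps to itself.
pub fn immediate_dominators(cfg: &Cfg, entry: u32) -> BTreeMap<u32, u32> {
    let order = reverse_postorder(cfg, entry);
    let rank: BTreeMap<u32, usize> = order.iter().enumerate().map(|(i, &n)| (n, i)).collect();
    let mut preds: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    for &n in &order {
        if let Some(succs) = cfg.get(&n) {
            for &s in succs {
                preds.entry(s).or_default().push(n);
            }
        }
    }
    let mut idom = BTreeMap::new();
    idom.insert(entry, entry);
    let mut changed = true;
    while changed {
        changed = false;
        for &n in order.iter().skip(1) {
            let mut best: Option<u32> = None;
            for &p in preds.get(&n).into_iter().flatten() {
                if !idom.contains_key(&p) {
                    continue; // predecessor not processed yet
                }
                best = Some(match best {
                    None => p,
                    Some(cur) => intersect(&idom, &rank, cur, p),
                });
            }
            if let Some(d) = best {
                if idom.get(&n) != Some(&d) {
                    idom.insert(n, d);
                    changed = true;
                }
            }
        }
    }
    idom
}
```

On a diamond CFG (0 branches to 1 and 2, both join at 3), neither arm dominates the join, so block 3's immediate dominator is block 0.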

Correctness

Five gates. Each catches what the layer above it can't.

1. Round-trip disassembly

The strongest claim about a format parser is that it understands every byte. One way to test that: parse the file, regenerate the bytes from what was parsed, and check that they are identical to what you started with. A wrong rule anywhere — a string-table offset off by one, a missed alignment requirement, a padding byte misclassified — produces a divergence the test catches precisely.
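
The property is easy to state on a toy format. The sketch below uses an invented layout (one length byte, a payload, then padding up to a 4-byte boundary) that has nothing to do with DEX or Hermes; all names are hypothetical. It shows why the parsed model must keep padding bytes verbatim, and how a divergence is reported by offset.

```rust
/// One record of the toy format, with padding kept verbatim.
#[derive(Debug, PartialEq)]
pub struct Record {
    pub payload: Vec<u8>,
    pub padding: Vec<u8>,
}

/// Parses records until the input is exhausted; None on a truncated payload.
pub fn parse(bytes: &[u8]) -> Option<Vec<Record>> {
    let mut out = Vec::new();
    let mut pos = 0usize;
    while pos < bytes.len() {
        let len = bytes[pos] as usize;
        let start = pos + 1;
        let end = start.checked_add(len)?;
        let payload = bytes.get(start..end)?.to_vec();
        // Padding runs to the next 4-byte boundary (or end of input).
        let pad_end = ((end + 3) / 4 * 4).min(bytes.len());
        let padding = bytes[end..pad_end].to_vec();
        out.push(Record { payload, padding });
        pos = pad_end;
    }
    Some(out)
}

/// Regenerates bytes from the parsed model only.
pub fn emit(records: &[Record]) -> Vec<u8> {
    let mut out = Vec::new();
    for r in records {
        out.push(r.payload.len() as u8);
        out.extend_from_slice(&r.payload);
        out.extend_from_slice(&r.padding);
    }
    out
}

/// Offset of the first differing byte, or None when the buffers are identical.
pub fn first_divergence(a: &[u8], b: &[u8]) -> Option<usize> {
    let common = a.len().min(b.len());
    (0..common)
        .find(|&i| a[i] != b[i])
        .or(if a.len() != b.len() { Some(common) } else { None })
}
```

If the emitter assumed padding is always zero, an input with non-zero padding bytes would round-trip to different bytes, and `first_divergence` would point at the first padding byte that changed.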

DEX: 100% byte-identical on the F-Droid corpus (5,767 DEX files across 3,782 apps) under preservation mode. The 5.4% subset (309 files) that differs under default emit does so in 24 header bytes only, and every such file comes from the legacy dx toolchain with non-canonical SHA-1 inputs; droidsaw recomputes correct checksums on default emit, or preserves the original bytes verbatim under audit mode.

Hermes: byte-exact reconstruction on bytecode versions v84, v96, v98, v99 — header, global string table, function table, alignment, and metadata layout. Verified clean on public v96 corpus samples.

Verify locally:

cargo test -p droidsaw-dex byte_identity_smoke
cargo test -p droidsaw-hermes hbc_corpus_roundtrip

2. Fixture ratchets

DEX: 68 in-repo Java + Kotlin + R8 sources. COMPILE_FAIL = 1. SEMANTIC_FAIL = 0.

Hermes: v96 fixture matrix. COMPILE_FAIL = 0. SEMANTIC_FAIL = 0.

UNRECOGNIZED_REGION ratchet pinned per-fixture in tests/unrecognized_ratchet.rs. Any new region a recognizer fails to handle is a build break.

The ratchet only decreases. A fixture flip blocks merge.
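
The ratchet rule can be sketched as a three-way comparison between a pinned count and the current one. This is an illustration only; the `Ratchet` enum and `check_ratchet` are hypothetical and not taken from tests/unrecognized_ratchet.rs.

```rust
/// Outcome of comparing a fixture's current failure count to its pin.
#[derive(Debug, PartialEq, Eq)]
pub enum Ratchet {
    /// Count unchanged: nothing to do.
    Hold,
    /// Count went down: lower the pin so the gain cannot be lost.
    Tighten { new_pin: u32 },
    /// Count went up: a regression, which blocks the merge.
    Regressed { pinned: u32, current: u32 },
}

pub fn check_ratchet(pinned: u32, current: u32) -> Ratchet {
    use std::cmp::Ordering;
    match current.cmp(&pinned) {
        Ordering::Equal => Ratchet::Hold,
        Ordering::Less => Ratchet::Tighten { new_pin: current },
        Ordering::Greater => Ratchet::Regressed { pinned, current },
    }
}
```

With the DEX pin at COMPILE_FAIL = 1, fixing that fixture would move the pin to 0, after which any failure at all is a regression.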

3. Adversarial fuzz

libFuzzer targets: fuzz_parser, fuzz_opcode_decode, fuzz_cfg, fuzz_ssa, fuzz_emit_roundtrip (DEX), fuzz_emit_roundtrip_hbc (Hermes), fuzz_protector_recognizer, fuzz_enum_cross_class.

The parser and decoder targets for both DEX and Hermes ran for extended campaigns with zero panics, zero artifacts.

fuzz_emit_roundtrip{_hbc} runs the round-trip property under libFuzzer instrumentation on the full input space, not just the fixture set.

4. Cross-tool differential

DEX vs dexdump (Android SDK's DEX disassembler), used as a code-unit coverage oracle. Every class descriptor and every method (class, name, proto) triple dexdump -d enumerates must also appear in droidsaw-dex output. A missed class or method is a build break.

Hermes vs hbcdump (Meta's official disassembler):

Parse-side: header, global string table, function table compared byte-for-byte.
Instruction-level: 12,000 sampled (opname, operand_count) tuples compared across functions. Zero opcode disagreements.
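
The DEX coverage check described above is a set-inclusion test: everything the oracle enumerates must appear in the candidate output, while extra entries in the candidate are allowed. A minimal sketch, assuming both tools' output has already been reduced to (class, name, proto) triples; `MethodKey` and `missing_from_candidate` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// (class descriptor, method name, proto) as enumerated by a disassembler.
pub type MethodKey = (String, String, String);

/// Every key the oracle lists that the candidate output lacks.
/// An empty result means the candidate covers the oracle.
pub fn missing_from_candidate(
    oracle: &BTreeSet<MethodKey>,
    candidate: &BTreeSet<MethodKey>,
) -> Vec<MethodKey> {
    oracle.difference(candidate).cloned().collect()
}
```

Returning the missing keys rather than a boolean lets the failing build name the exact class or method that was dropped.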

5. Formal proofs

Kani (bounded model checking). 101 harnesses across the workspace on statements with decidable input domains: MUTF-8 codec totality, signing-block padding gate, LEB128 read/write round-trip, bit-field bounds, Hermes function_get u64-overflow guard (against a u128 oracle), MANIFEST.MF base64 positional gate, recursion-depth caps, per-tag truncation guards, base64 capacity arithmetic.
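
For a sense of what a decidable-domain statement looks like, here is a minimal unsigned LEB128 codec for 32-bit values, with the round-trip property as the thing a harness would assert. It is a sketch written for this README, not the droidsaw codec or one of its Kani harnesses; the function names are assumptions.

```rust
/// Appends `v` as unsigned LEB128: 7 data bits per byte, low bits first,
/// high bit set on every byte except the last.
pub fn write_uleb128(mut v: u32, out: &mut Vec<u8>) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Reads one value; returns (value, bytes consumed).
/// None on truncation, on more than 5 bytes, or on overflow past 32 bits.
pub fn read_uleb128(bytes: &[u8]) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    for (i, &b) in bytes.iter().enumerate().take(5) {
        let chunk = (b & 0x7f) as u32;
        if i == 4 && chunk > 0x0f {
            return None; // the fifth byte may carry only 4 more bits
        }
        result |= chunk << (7 * i as u32);
        if b & 0x80 == 0 {
            return Some((result, i + 1));
        }
    }
    None
}
```

The property is that for every u32 value, reading back what was written returns the same value and consumes exactly the bytes emitted. The input domain is finite, which is what puts a statement like this within reach of bounded model checking.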

Lean 4. 20 proved theorems on statements that quantify over arbitrary input length or arbitrary CFG shape — out of Kani's bounded reach. AXML parser totality and acyclicity. Dominator antisymmetry, transitivity, and unique idom. Lattice monotonicity of the dataflow fixed point. Hermes try-catch RPO ordering. No sorry, no axiom. Each .lean file names the Rust function it backs via a RUST: comment; the correspondence is asserted in source and maintained by hand, not mechanically verified. The Lean 4 proofs are maintained in a separate droidsaw-lean project (Lake build, not a Cargo crate).

OXC round-trip. Every Hermes decompile output is parsed back by OXC (a Rust-native JavaScript parser and codegen). Output OXC rejects is annotated and returned, never silently dropped.

The compile-time floor on every non-test module:

#![deny(
    clippy::unwrap_used,
    clippy::expect_used,
    clippy::panic,
    clippy::unreachable,
    clippy::todo,
    clippy::arithmetic_side_effects,
)]

Suppressions on the panic family require a written PROOF: obligation. panic = "abort" is set workspace-wide — a stale PROOF obligation terminates the process at runtime, not just at lint.

Reporting issues

File an issue on GitHub with a reproduction. Search first, since it may already be tracked. What to include, by issue type:

A crash or panic — the input that triggers it (minimized if you can), or a bundle from droidsaw triage promote.
A decompilation gap or wrong output — the smallest input that reproduces, plus the output you got vs. expected.
A wrong or missing finding — the finding ID and the input, plus (false positive) why the flag looks incorrect, or (false negative) a reference showing the pattern droidsaw should catch.

Issue reports with reproductions are the most useful contribution; code PRs aren't currently solicited.

License

BSD-3-Clause.