MegabitChip — Session 2026-04-21 Part 2 (Simulation & MMIO-Diff)

Continues the 2026-04-21 reloc-splice session. Goal: close the 47/54 reach gap with simulation tooling.

Result

  • Three new validation tools built: lockstep.py, mmio_diff.py, check_asm.sh.
  • Four more bugs found and fixed.
  • MMIO-write parity: 82 of ~1000 writes match identically before the first divergence, concrete proof that the early boot path is source-rebuilt.
  • Superseded register-level lockstep with MMIO-trace diff (clang reg-alloc noise obscured the real signal).

Tools added

check_asm.sh

Structural asm-diff gate. Classifies each benchmark dir as ASM_EXACT / ORDER_DIFF / REG_DIFF / BRANCH_DIFF / STRUCTURAL_DIFF. Current: 100 tested, 3 EXACT, 18 REG_DIFF (same mnemonics + offsets, different regs — the useful audit signal), 57 BRANCH_DIFF (inflated by partial-port stubs), 22 STRUCTURAL_DIFF.
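A minimal Python sketch of how such a classification could work, for illustration only (check_asm.sh itself is a shell script and its actual rules are not reproduced here). Only the REG_DIFF definition comes from the notes above. The ORDER_DIFF and BRANCH_DIFF rules, the precedence between classes, and the helper names are assumptions.

```python
import re
from collections import Counter

# AArch64 general-purpose register names: x0..x30, w0..w30, sp, xzr, wzr.
REG = re.compile(r"\b(?:[xw]\d{1,2}|sp|[xw]zr)\b")
# Direct branches whose last operand is a target address (assumed set).
BRANCHES = {"b", "bl", "cbz", "cbnz", "tbz", "tbnz"}


def mask_regs(insn):
    """Replace every register name with a placeholder."""
    return REG.sub("R", insn)


def mask_branch_target(insn):
    """Replace the target operand of a direct branch with a placeholder."""
    mnemonic = insn.split()[0]
    if mnemonic in BRANCHES or mnemonic.startswith("b."):
        head, _, _ = insn.rpartition(" ")
        return head + " T"
    return insn


def classify(vendor, rebuilt):
    """Classify two instruction lists (one 'mnemonic operands' string each)."""
    if vendor == rebuilt:
        return "ASM_EXACT"
    # Assumption: ORDER_DIFF means the same instructions in a different order.
    if Counter(vendor) == Counter(rebuilt):
        return "ORDER_DIFF"
    if len(vendor) == len(rebuilt):
        v = [mask_regs(i) for i in vendor]
        r = [mask_regs(i) for i in rebuilt]
        # Same mnemonics and offsets, different registers.
        if v == r:
            return "REG_DIFF"
        # Assumption: BRANCH_DIFF means only branch targets still differ.
        if [mask_branch_target(i) for i in v] == [mask_branch_target(i) for i in r]:
            return "BRANCH_DIFF"
    return "STRUCTURAL_DIFF"
```

Under these assumptions, a pair that differs in both registers and branch targets falls into BRANCH_DIFF, which is one plausible reason stubbed-out callees would inflate that bucket.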

lockstep.py

Two Unicorn instances (vendor + rebuilt), step one insn at a time, diff all x-regs + sp + pc + nzcv. Supports --fn-entry-only to suppress inside-function noise.

Verdict: works mechanically but fires false alarms on clang register allocation (“vendor puts intermediate in x0, our port puts it in x8”). Even filtered to function entries, caller-saved-but-stale regs trigger. Use for targeted investigation, not as a gate.
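The false-alarm pattern can be shown without an emulator. The sketch below is not lockstep.py; it only diffs two hypothetical register snapshots taken at the same step, with made-up values, to show why a register-level comparison fires even when both sides hold the same intermediate value.

```python
def diff_regs(vendor, rebuilt):
    """Return {reg: (vendor_val, rebuilt_val)} for every register that differs."""
    return {
        reg: (vendor[reg], rebuilt.get(reg))
        for reg in vendor
        if vendor[reg] != rebuilt.get(reg)
    }


# Hypothetical snapshots: the same intermediate (0xfe000000) is live in both
# runs, but the vendor build keeps it in x0 and the clang rebuild in x8.
vendor_regs = {"x0": 0xFE000000, "x8": 0x0, "sp": 0xFF001000}
rebuilt_regs = {"x0": 0x0, "x8": 0xFE000000, "sp": 0xFF001000}

mismatches = diff_regs(vendor_regs, rebuilt_regs)
for reg, (v, r) in sorted(mismatches.items()):
    print(f"{reg}: vendor={v:#x} rebuilt={r:#x}")
```

Both x0 and x8 are reported as divergent although nothing observable differs, which is the noise described above.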

mmio_diff.py

Log MMIO writes (addr, size, val, caller_pc) from each run; diff sequences in order. MMIO is the silicon-observable behavior — if the write sequence matches, rebuild is behaviorally equivalent regardless of how clang chose registers.

First divergent write pinpoints the exact bug in one line of output. This replaces reach-bisection + lockstep as the primary validation gate going forward.
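The core of the diff can be sketched as follows. This is an illustration under stated assumptions, not the actual mmio_diff.py: the Write record mirrors the logged fields named above, while the function names, the 1-based write numbering, and the choice to compare only (addr, size, val) and keep caller_pc as context are assumptions.

```python
from itertools import zip_longest
from typing import NamedTuple


class Write(NamedTuple):
    addr: int
    size: int
    val: int
    caller_pc: int


def first_divergence(vendor, rebuilt):
    """Return (n, vendor_write, rebuilt_write) for the first mismatch, else None.

    n is 1-based. A missing write (one trace ended early) is reported as None.
    caller_pc is ignored in the comparison (assumption: code layout may differ).
    """
    for n, (v, r) in enumerate(zip_longest(vendor, rebuilt), start=1):
        if v is None or r is None or v[:3] != r[:3]:
            return n, v, r
    return None


def describe(divergence):
    """Render the result as a single line of output."""
    if divergence is None:
        return "traces match"

    def fmt(w):
        if w is None:
            return "<end of trace>"
        return f"[{w.addr:#x}] <- {w.val:#x} size={w.size} pc={w.caller_pc:#x}"

    n, v, r = divergence
    return f"write {n}: vendor {fmt(v)} | rebuilt {fmt(r)}"
```

With two traces that agree on the first write and differ on the second, describe(first_divergence(vendor, rebuilt)) yields one line starting with "write 2:", and the two caller_pc values show where to look in each binary.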

Bugs fixed

  • 54_FUN_9a68: dst/src swap. The copy between *arg and *(DAT) ran in the wrong direction: our port had dst and src reversed relative to the vendor. Surfaced by lockstep (first divergence at step 35, mid-copy).
  • 46_FUN_2e88 (MR read helper): args mr_addr/byte_index were swapped in the port's C signature. Vendor's asm uses w2 for the shift amount (byte_index) and w3<<8 for MRCTRL1 (mr_addr). Our port had them reversed.
  • 17_FUN_2340 (MR submit): vendor ends with mov w0, #0; ret, explicitly returning 0. Our void port preserved whatever clang left in x0 (often the ch_base ptr = 0xfe000000). Same class as fn_27e0. Fixed by changing the port to int mr_submit(...) { ...; return 0; }.
  • 113_FUN_4f8 case-2 sub-0: Ghidra mis-decompiled the BUS_GRF register addresses. Vendor writes 0xFD5F4000 and 0xFD5F800C for (grp=2, sub=0); our port wrote BUS_GRF_BASE_CFG (0xFD5F0000) and BUS_GRF_DDR_ROUTE (0xFD5F0004). Fixed by hardcoding the actual vendor addresses.

Status

  • Reach gate: still 47/54. Each of the 4 fixes only reveals the next MMIO-level divergence, so reach count is no longer a discriminating signal.
  • MMIO parity gate (new primary): writes 1..82 are byte-identical; write 83 reveals a fn_62d8 variant=1-vs-0 discrepancy (likely a caller passing the wrong arg). Keeps peeling bug by bug.
  • Vendor total MMIO writes: 1007 in a 500k-insn budget. Rebuilt total: 253 (the run stops short because of the divergence, not because 253 is the full write count).

Next steps

  • Continue mmio_diff-driven debugging from write 83 onward. Each divergent write surfaces one Ghidra-decompile error or one ABI mismatch.
  • Consider dumping the full 1007-write vendor trace and using it as a spec file: every future rebuild must reproduce this exact sequence byte-for-byte.
  • When 1007/1007 match: move to Phase 4 (bare-metal on ampere via meitner rkdeveloptool). That's the final ground-truth check.

Files added

  • ~/src/rk3588-ddr-decompiled/lockstep.py
  • ~/src/rk3588-ddr-decompiled/mmio_diff.py
  • ~/projects/AMPere/benchmark/check_asm.sh
  • ~/projects/AMPere/benchmark/reloc_bisect.sh

Memories updated

  • feedback_megabitchip_reloc_splicer.md: added a section on MMIO-diff as the superior gate, plus the two new bug-class examples (fn_9a68 direction, fn_62d8 address confusion).

Observation

The iteration pace felt like peeling an onion: each bug fixed revealed the next. But that IS the correct shape for matching-decomp with semantic tests. The MMIO sequence is the contract, each mismatch is a localized bug, and the tools converge toward the vendor spec. This is much more principled than register-level lockstep, which is too noisy for compiler-portable C ports.

A fully verified MMIO trace becomes a permanent regression oracle — useful both for this project and for any future Rockchip DDR reverse-engineering work. The .mmio-trace file is the real deliverable.
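As a sketch of how a stored trace could serve as a regression oracle: the text format below (one "addr size val" line per write, hex address and value, "#" comments) and the function names are assumptions for illustration, not the actual .mmio-trace layout.

```python
def format_trace(writes):
    """Serialize (addr, size, val) tuples, one write per line."""
    return "".join(f"{addr:#010x} {size} {val:#x}\n" for addr, size, val in writes)


def parse_trace(text):
    """Parse the format above, skipping blank lines and '#' comments."""
    writes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        addr, size, val = line.split()
        writes.append((int(addr, 16), int(size), int(val, 16)))
    return writes


def matched_prefix(spec, run):
    """Count how many leading writes of `run` reproduce `spec` exactly."""
    n = 0
    for s, r in zip(spec, run):
        if s != r:
            break
        n += 1
    return n


def gate(spec, run):
    """Pass only if the run reproduces the whole spec and adds nothing extra."""
    return len(run) == len(spec) and matched_prefix(spec, run) == len(spec)
```

A rebuild would then be reported as matched_prefix(spec, run) out of len(spec) writes, and the gate passes only when the two numbers are equal and the run emits no extra writes.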

megabitchip/2026-04-21_simulation.txt · Last modified: by markus_fritsche