Extended session on top of the reloc-splice + mmio-diff work landed earlier the same day. Focus: close the gap between write-sequence equality (mmio_diff green) and actually running on silicon without bricking. Tools, audits, and three new monster ports shipped; six silicon-hostile bugs caught pre-flash across three bug classes.
| # | Class | Case |
|---|---|---|
| 1 | ld unresolved → 0 NULL deref | fn_9a68 DAT_00012B70 case-mismatch |
| 2 | same | fn_7730 DAT_00010ba8 missing from DATA_SYMS |
| 3 | same | fn_7730 DAT_00010c2c missing from DATA_SYMS |
| 4 | same | fn_7730 DAT_00012b50 missing from DATA_SYMS |
| 5 | C early-return skips shared tail | fn_3268 0x208 RMW pair skipped when bit-31 set |
| 6 | Port is read-only where vendor writes | fn_1c14 rebuilt as no-op; vendor save/restore DDRPHY training bank |
Class 1: ld --unresolved-symbols=ignore-all silently zeros undefined
externs. A case-mismatched or missing DATA_SYMS entry becomes an
adrp resolving to page 0x0, and ldr returns whatever junk lives
at zero. mmio_diff is blind to this because downstream MMIO writes
still match vendor.
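The page-0 failure mode can be checked mechanically. Below is a minimal sketch, assuming little-endian AArch64 `.text` bytes and a known load address, that decodes each ADRP and flags any whose resolved page is 0x0. The names `decode_adrp`, `encode_adrp`, and `find_null_adrp` are hypothetical and this is not the guard as implemented in reloc_splice.py.

```python
import struct

ADRP_MASK = 0x9F000000   # op (bit 31) plus the fixed bits 28:24
ADRP_BITS = 0x90000000   # op=1, bits 28:24 = 0b10000
U64 = (1 << 64) - 1

def decode_adrp(insn, pc):
    """Return (rd, target_page) for an ADRP word, or None for anything else."""
    if (insn & ADRP_MASK) != ADRP_BITS:
        return None
    imm = (((insn >> 5) & 0x7FFFF) << 2) | ((insn >> 29) & 0x3)  # immhi:immlo
    if imm & (1 << 20):          # sign-extend the 21-bit page delta
        imm -= 1 << 21
    page = ((pc & ~0xFFF) + (imm << 12)) & U64
    return insn & 0x1F, page

def encode_adrp(rd, pc, target_page):
    """Inverse of decode_adrp; only here to build test inputs."""
    imm = ((target_page >> 12) - (pc >> 12)) & 0x1FFFFF
    return ADRP_BITS | ((imm & 0x3) << 29) | ((imm >> 2) << 5) | rd

def find_null_adrp(text, base):
    """Offsets of every ADRP in `text` (loaded at `base`) that resolves to page 0."""
    hits = []
    for off in range(0, len(text) - 3, 4):
        (insn,) = struct.unpack_from("<I", text, off)
        decoded = decode_adrp(insn, base + off)
        if decoded is not None and decoded[1] == 0:
            hits.append(off)
    return hits
```

A legitimate ADRP to page 0x0 would also be flagged, so a check like this is only useful when nothing in the image is expected to live at address zero.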
Class 2: the C port uses an early return where the vendor's asm has the conditional branch jump into a shared tail. Two 0x208 read-modify-writes that the vendor always executes were skipped on one control-flow path in the rebuild. The emulator didn't exercise the bit-31-set entry state, so the missing writes never showed up in the trace. On silicon where that bit is live, the skipped writes are silicon-hostile.
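The shape of this bug is easy to model. The toy below is a sketch only: the register at 0x100, the bit values written to 0x208, and the return constants are placeholders, not the real fn_3268 semantics. It shows why a write trace taken with bit 31 clear cannot tell the two shapes apart.

```python
BIT31 = 1 << 31
REG_208 = 0x208

def rmw(regs, log, addr, clear, setbits):
    """Read-modify-write one register and record the resulting write."""
    val = (regs.get(addr, 0) & ~clear) | setbits
    regs[addr] = val
    log.append((addr, val))

def shared_tail(regs, log):
    # The two 0x208 read-modify-writes that run on every path in the vendor code.
    rmw(regs, log, REG_208, clear=0, setbits=0x1)   # placeholder bit values
    rmw(regs, log, REG_208, clear=0x1, setbits=0)

def vendor_shape(status, regs, log):
    """Conditional branch picks a constant, then joins the shared tail."""
    if status & BIT31:
        result = 1                                   # placeholder constant
    else:
        rmw(regs, log, 0x100, clear=0, setbits=0x4)  # placeholder main-path work
        result = 0
    shared_tail(regs, log)
    return result

def buggy_port_shape(status, regs, log):
    """Early return on bit 31 never reaches the shared tail."""
    if status & BIT31:
        return 1
    rmw(regs, log, 0x100, clear=0, setbits=0x4)
    shared_tail(regs, log)
    return 0

def write_trace(fn, status):
    """Run one shape from a clean register file and return its write log."""
    regs, log = {}, []
    fn(status, regs, log)
    return log
```

With `status = 0` both shapes produce identical traces; with bit 31 set, the vendor shape still logs the two 0x208 writes while the early-return shape logs nothing.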
Class 3: port implemented a DDRPHY training-bank save/restore routine
as read-and-discard. Vendor writes via str wzr; our port only
issued ldr. The caller (fn_9a90) is never reached under the happy-path
LP5-2400 cold-boot trace, so mmio_diff didn't fire. On silicon with
the caller active, training coefficients leak between phases.
All six would have bricked or mis-trained silicon. All six were invisible to write-sequence diff.
See Simulation stack for the full reference. New or hardened:
- sim_tripwire.py: Bin-style per-access tracer on Unicorn; (seq, tick, pc, addr, size, rw, val, region, fn_name) records with PC→fn resolution
- tripwire_diff.py: PC-bucketed SequenceMatcher diff; bucket by fn_name to survive bitflip-path control-flow divergence
- training_sim.py: two-mode DDR training simulator (pass / bitflip-first-N-reads)
- bitflip_sweep.py: per-address retry convergence test over all training-status addresses
- mmio_regions.py: shared address → region tag classifier (DDRCTL, DDRPHY, OTP, SRAM, CRU, …); fixed SCRAMBLE→OTP at 0xFECC0000 after TRM cross-check
- audit_data_syms.py: scans every candidate.c for DAT_/s_/BLOB_DATA_ externs, cross-checks against DATA_SYMS | PORT_OVERRIDES | MMIO_SYMS (case-insensitive)
- audit_early_return_tail.py: static ARM64 asm scanner for cond_br → short block with mov #const → b INTO_TAIL_WITH_STR patterns; flagged 15 candidates, 1 real bug (fn_3268), 1 different-class bug (fn_1c14), 13 false positives
- reloc_splice.py gained a post-link ADRP-to-NULL guard: scans each linked .text for any ADRP whose resolved page is 0x0 and emits WARN <port>+<off>. Closes bug-class 1 at build time.

All wired into make audit.
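For orientation, a data-symbol audit of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not audit_data_syms.py itself: it assumes externs are declared as `extern <type> NAME...;`, takes the known symbols as a plain set of names, and reports case-only mismatches separately from missing entries. `EXTERN_RE` and `audit_externs` are hypothetical names.

```python
import re

# Assumption: externs of interest look like `extern <type> DAT_0001abcd;`
EXTERN_RE = re.compile(r"\bextern\b[^;]*?\b((?:DAT_|s_|BLOB_DATA_)\w+)")

def audit_externs(source, known_syms):
    """Classify each data extern in `source` against the known symbol names."""
    exact = set(known_syms)
    folded = {name.lower(): name for name in known_syms}
    missing, case_mismatch = [], []
    for sym in EXTERN_RE.findall(source):
        if sym in exact:
            continue                      # resolves as written
        if sym.lower() in folded:
            # Present only under a different spelling; the linker is case-sensitive.
            case_mismatch.append((sym, folded[sym.lower()]))
        else:
            missing.append(sym)
    return missing, case_mismatch
```

Folding case for the lookup while still comparing exact spellings lets one pass surface both kinds of Class 1 bug listed in the table.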
See Port matrix for the full table.
- str wzr; port now does the same.
- ddr_annotated.c:9695–10640 (LPDDR5 frequency-band timing programmer). 27 callees resolved via fun_table. 24 new DAT_00011ff0..DAT_000127c0 defsyms added to DATA_SYMS. Currently parked in splicer_skip.txt pending investigation of a 1-bit divergence at tp[0x4f]; see internal task #198.

23 training-status addresses flipped one-at-a-time on vendor LP5-2400:
fn_2340 writes MRCTRL0 = 0x60 instead of 0x10 — vendor's intended mr_type retry strategy, replicated correctly by the rebuild.
The sweep is the pre-silicon evidence that the rebuild's retry logic
converges across all plausible transient status faults. Bitflip mode
doesn't degrade tripwire_diff because the buckets key on fn_name
not seq_idx, so control-flow divergence just reshapes buckets.
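The bucketing idea can be sketched with the standard library alone. The record layout here, `(fn_name, rw, addr, val)`, is a simplified assumption (the real tracer records carry more fields), and `bucket_by_fn` and `bucketed_diff` are hypothetical names rather than the tripwire_diff.py API.

```python
from collections import defaultdict
from difflib import SequenceMatcher

def bucket_by_fn(records):
    """Group (fn_name, rw, addr, val) accesses per function, preserving order."""
    buckets = defaultdict(list)
    for fn_name, rw, addr, val in records:
        buckets[fn_name].append((rw, addr, val))
    return buckets

def bucketed_diff(vendor, rebuild):
    """Return {fn_name: [non-equal opcodes]} for every bucket that differs."""
    v, r = bucket_by_fn(vendor), bucket_by_fn(rebuild)
    report = {}
    for fn_name in sorted(set(v) | set(r)):
        a, b = v.get(fn_name, []), r.get(fn_name, [])
        ops = SequenceMatcher(a=a, b=b, autojunk=False).get_opcodes()
        diffs = [op for op in ops if op[0] != "equal"]
        if diffs:
            report[fn_name] = diffs
    return report
```

Because each function's accesses are compared only against the same function's accesses on the other side, an extra retry loop in one function shifts global sequence numbers without producing spurious diffs in every function that runs after it.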
- mmio_diff 3173/3173 green
- make audit green on data-symbol coverage + early-return-tail
- splicer_skip.txt: one entry (154_FUN_de40 until #198 closes)
- tripwire_diff finds 1 SUSPECT (fn_ac8 vendor early memcpy, unrelated) and 3 minor-diffs, all explained (SWSTAT toggle, SCRAMBLE→OTP off-by-one, fn_8b40 extra polls)

cd ~/projects/AMPere/benchmark && make verify # expects 3173/3173 green
If green, pick task #198 or any pending. Task #198 investigates the
1-bit tp[0x4f] divergence in fn_de40's install trial — details in
the internal task board.
“Markus' insistence on simulation before flashing paid off. Big time. Again.” — 2026-04-21.
The tripwire + PC-bucketed diff caught 3 silent NULL-derefs that were
hiding under mmio_diff 3173/3173 green. ld
--unresolved-symbols=ignore-all zeroed undefined DATA_SYMS
externs into page 0x0; emulator reads from that page happily returned 0,
which hid the bug from the write-sequence equality check. Silicon would have bricked.
mmio_diff was the gate we trusted. The gate was passing. The simulator layer — with a tripwire-style per-access capture, not just write-order comparison — is not optional, even late in a campaign that feels “done”.