MegabitChip — Session 2026-04-21 (Extended)
Extended session on top of the reloc-splice + mmio-diff work landed earlier the same day. Focus: close the gap between write-sequence equality (mmio_diff green) and actually running on silicon without bricking. Tools, audits, and three new monster ports shipped; six silicon-hostile bugs caught pre-flash across three bug classes.
TL;DR
- mmio_diff baseline held at 3173 / 3173 across the whole session.
- Three bug classes, six concrete bugs, all found and fixed without touching silicon.
- Three remaining “monster” functions ported (fn_fcc4, fn_1c14, fn_de40).
- Bitflip sweep: pre-silicon evidence the rebuild's retry logic converges under all plausible transient status faults.
Six silicon-hostile bugs caught pre-flash
| # | Class | Case |
|---|---|---|
| 1 | ld unresolved → 0 NULL deref | fn_9a68 DAT_00012B70 case-mismatch |
| 2 | same | fn_7730 DAT_00010ba8 missing from DATA_SYMS |
| 3 | same | fn_7730 DAT_00010c2c missing from DATA_SYMS |
| 4 | same | fn_7730 DAT_00012b50 missing from DATA_SYMS |
| 5 | C early-return skips shared tail | fn_3268 0x208 RMW pair skipped when bit-31 set |
| 6 | Port is read-only where vendor writes | fn_1c14 rebuilt as a no-op; vendor saves/restores the DDRPHY training bank |
Class 1: `ld --unresolved-symbols=ignore-all` silently resolves undefined
externs to address 0. A case-mismatched or missing DATA_SYMS entry becomes an
`adrp` resolving to page 0x0, and the following `ldr` returns whatever junk lives
at zero. mmio_diff is blind to this because the downstream MMIO writes
still match vendor.
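The Class 1 failure mode can be illustrated with a minimal sketch of a symbol audit. It assumes externs appear as plain `extern ... NAME;` declarations and that the known-symbol set is available as a Python set; the regex, function name, and return shape are hypothetical and not taken from `audit_data_syms.py`. It separates a case mismatch from a symbol that is missing outright, since both end up as address 0 at link time.

```python
import re

# Hypothetical pattern: an extern declaration naming a DAT_/s_/BLOB_DATA_ symbol.
EXTERN_RE = re.compile(r"\bextern\b[^;]*?\b((?:DAT_|s_|BLOB_DATA_)\w+)")

def audit_externs(c_source, known_syms):
    """Return (name, kind, suggestion) for every extern the linker would not resolve."""
    by_lower = {s.lower(): s for s in known_syms}
    findings = []
    for name in EXTERN_RE.findall(c_source):
        if name in known_syms:
            continue  # exact match: the linker resolves it
        if name.lower() in by_lower:
            # Present, but only under a different spelling; ld is case-sensitive.
            findings.append((name, "case-mismatch", by_lower[name.lower()]))
        else:
            findings.append((name, "missing", None))
    return findings
```

Run over a port's source with the union of the symbol tables, an empty result means every extern has an exact-case definition.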
Class 2: the C port uses an early return where vendor's asm has the conditional branch jump into a shared tail. Two 0x208 read-modify-writes that vendor always executes were skipped on one control-flow path in the rebuild. The emulator didn't exercise the bit-31-set entry state, so the missing writes never showed up in the trace. On silicon, where that bit can be live, the skipped writes are silicon-hostile.
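The control-flow shape behind Class 2 can be modeled in a few lines. This is a toy Python model, not the real port: the bit values written to 0x208, the return codes, and the function names are all assumptions made for illustration. Only the structure (early return versus a tail shared by both paths) mirrors the bug.

```python
REG_208 = 0x208  # register offset from the notes; the bits written below are made up

def rmw_pair(regs, log):
    """The two read-modify-writes that vendor performs in the shared tail."""
    for bit in (0x1, 0x2):  # hypothetical bits
        regs[REG_208] = regs.get(REG_208, 0) | bit
        log.append((REG_208, regs[REG_208]))

def port_early_return(regs, log, status):
    if status & (1 << 31):
        return 1  # bug shape: returns before the shared tail runs
    rmw_pair(regs, log)
    return 0

def port_shared_tail(regs, log, status):
    result = 1 if status & (1 << 31) else 0  # only the result differs per path
    rmw_pair(regs, log)                      # tail runs on both paths, as in vendor
    return result
```

With bit 31 clear, the two variants log identical writes, which is why a trace that never sets bit 31 cannot tell them apart.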
Class 3: the port implemented a DDRPHY training-bank save/restore routine
as read-and-discard. Vendor writes via `str wzr`; our port only
issued `ldr`s. The caller (fn_9a90) is never reached under the happy-path
LP5-2400 cold-boot trace, so mmio_diff didn't fire. On silicon with
the caller active, training coefficients leak between phases.
All six would have bricked or mis-trained silicon. All six were invisible to write-sequence diff.
Tooling shipped this session
See Simulation stack for the full reference. New or hardened:
- `sim_tripwire.py`: Bin-style per-access tracer on Unicorn; emits `(seq, tick, pc, addr, size, rw, val, region, fn_name)` records with PC→fn resolution.
- `tripwire_diff.py`: PC-bucketed `SequenceMatcher` diff; buckets by fn_name to survive bitflip-path control-flow divergence.
- `training_sim.py`: two-mode DDR training simulator (pass / bitflip-first-N-reads).
- `bitflip_sweep.py`: per-address retry convergence test over all training-status addresses.
- `mmio_regions.py`: shared address → region tag classifier (DDRCTL, DDRPHY, OTP, SRAM, CRU, …); fixed SCRAMBLE→OTP at 0xFECC0000 after TRM cross-check.
- `audit_data_syms.py`: scans every `candidate.c` for `DAT_`/`s_`/`BLOB_DATA_` externs, cross-checks against `DATA_SYMS | PORT_OVERRIDES | MMIO_SYMS` (case-insensitive).
- `audit_early_return_tail.py`: static ARM64 asm scanner for `cond_br → short block with mov #const → b INTO_TAIL_WITH_STR` patterns; flagged 15 candidates, 1 real bug (fn_3268), 1 different-class bug (fn_1c14), 13 false positives.
- `reloc_splice.py` gained a post-link ADRP-to-NULL guard: scans each linked `.text` for any ADRP whose resolved page is 0x0 and emits `WARN <port>+<off>`. Closes bug-class 1 at build time.
All wired into make audit.
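A post-link ADRP-to-NULL scan of the kind described for `reloc_splice.py` can be sketched as follows. The decoding follows the A64 ADRP encoding (bit 31 set, bits 28:24 equal to `10000`, a 21-bit signed page immediate split into immlo/immhi); the function names, the `base` parameter, and the return shape are assumptions for illustration, not the tool's actual interface.

```python
import struct

def adrp_target_page(insn, pc):
    """Return the page an ADRP at `pc` resolves to, or None if `insn` is not ADRP."""
    if (insn & 0x9F000000) != 0x90000000:
        return None
    immlo = (insn >> 29) & 0x3
    immhi = (insn >> 5) & 0x7FFFF
    imm = (immhi << 2) | immlo          # 21-bit signed count of 4 KiB pages
    if imm & (1 << 20):
        imm -= 1 << 21                  # sign-extend
    return ((pc & ~0xFFF) + (imm << 12)) & 0xFFFFFFFFFFFFFFFF

def find_null_adrp(text, base):
    """Offsets of every little-endian ADRP in `text` whose resolved page is 0x0."""
    hits = []
    for off in range(0, len(text) - 3, 4):
        (insn,) = struct.unpack_from("<I", text, off)
        if adrp_target_page(insn, base + off) == 0:
            hits.append(off)
    return hits
```

A linear sweep like this treats every aligned word as an instruction, so literal pools embedded in `.text` can produce false hits; a real guard would likely need to skip known data ranges.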
Monster ports
See Port matrix for the full table.
- fn_fcc4: source-complete full port, 1684 B. Natural skip-larger. Documented source.
- fn_1c14: full port, 656 B ≤ 740 B vendor. Replaces the broken read-only stub. Vendor writes via `str wzr`; port now does the same.
- fn_3268: bug fix. C restructured so the 0x208 RMW pair runs on both control-flow paths, matching vendor's branch-into-tail shape.
- fn_de40: source-scaffold, 4888 B ≤ 4912 B vendor budget. Faithful ~700-line port from `ddr_annotated.c:9695–10640` (LPDDR5 frequency-band timing programmer). 27 callees resolved via `fun_table`. 24 new `DAT_00011ff0..DAT_000127c0` defsyms added to `DATA_SYMS`. Currently parked in `splicer_skip.txt` pending investigation of a 1-bit divergence at `tp[0x4f]`; see internal task #198.
Bitflip sweep
23 training-status addresses flipped one-at-a-time on vendor LP5-2400:
- 18 of 23: single-read retry, all downstream writes unchanged — clean convergence.
- 3 of 23 (STAT CH1/CH2/CH3): `fn_2340` writes `MRCTRL0 = 0x60` instead of `0x10`. This is vendor's intended mr_type retry strategy, replicated correctly by the rebuild.
- 2 of 23 (MicroReset, MicroContMux): no retry fires on the LP5-2400 happy path because the flip window isn't polled.
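The sweep procedure itself (flip one status address at a time for its first N reads, then check that the write sequence still matches the unflipped baseline) can be sketched with a toy model. Everything here is hypothetical: the polling loop, the "done" bit, the ack write at `addr + 4`, and the addresses stand in for the real training code and do not reflect `training_sim.py` or `bitflip_sweep.py` internals.

```python
def make_reader(good, flip_addr=None, flip_bit=0, flip_reads=1):
    """Build read(addr) that corrupts only the first `flip_reads` reads of `flip_addr`."""
    remaining = {"n": flip_reads}
    def read(addr):
        val = good[addr]
        if addr == flip_addr and remaining["n"] > 0:
            remaining["n"] -= 1
            val ^= 1 << flip_bit  # transient single-bit fault
        return val
    return read

def train(read, status_addrs, done_bit=0, max_retries=4):
    """Toy training loop: poll each status until its done bit is set, then log a write."""
    writes, retries = [], 0
    for addr in status_addrs:
        for _ in range(max_retries + 1):
            if read(addr) & (1 << done_bit):
                break
            retries += 1
        else:
            raise RuntimeError(f"no convergence at {addr:#x}")
        writes.append((addr + 4, 1))  # hypothetical ack write
    return writes, retries

def sweep(good, status_addrs):
    """Per address: (retries taken, whether writes match the unflipped baseline)."""
    baseline, _ = train(make_reader(good), status_addrs)
    report = {}
    for addr in status_addrs:
        writes, retries = train(make_reader(good, flip_addr=addr), status_addrs)
        report[addr] = (retries, writes == baseline)
    return report
```

In this model a clean result is `(1, True)` for an address: one retry, downstream writes unchanged.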
The sweep is the pre-silicon evidence that the rebuild's retry logic
converges for every single-bit transient status fault the LP5-2400 path
actually polls (21 of 23 addresses; the other two are not polled inside the
flip window). Bitflip mode doesn't degrade tripwire_diff because the buckets
key on fn_name, not seq_idx, so control-flow divergence just reshapes buckets.
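Keying buckets on fn_name rather than sequence index can be sketched with the standard library's `difflib.SequenceMatcher`. The record fields `fn_name`, `rw`, `addr`, and `val` come from the tracer's record layout; the dict representation, the function names, and the output shape are assumptions for illustration rather than the behavior of `tripwire_diff.py`.

```python
from collections import defaultdict
from difflib import SequenceMatcher

def bucket(records):
    """Group accesses by function, keeping their order within each function."""
    buckets = defaultdict(list)
    for r in records:
        buckets[r["fn_name"]].append((r["rw"], r["addr"], r["val"]))
    return buckets

def bucketed_diff(vendor, rebuild):
    """Map fn_name to its non-equal SequenceMatcher opcodes; matching buckets are omitted."""
    v, r = bucket(vendor), bucket(rebuild)
    out = {}
    for fn in sorted(set(v) | set(r)):
        sm = SequenceMatcher(None, v.get(fn, []), r.get(fn, []), autojunk=False)
        ops = [op for op in sm.get_opcodes() if op[0] != "equal"]
        if ops:
            out[fn] = ops
    return out
```

Because the comparison is per function, a retry that re-enters a function earlier or later in the global trace changes only that function's bucket. The cost is that a pure reordering of calls between functions goes unreported.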
Baseline state at session end
- `mmio_diff` 3173/3173 green
- `make audit` green on data-symbol coverage + early-return-tail
- Splicer: 104 candidates / 85 spliced / 19 skip-larger / 0 failed
- `splicer_skip.txt`: one entry (`154_FUN_de40` until #198 closes)
- `tripwire_diff` finds 1 SUSPECT (`fn_ac8` vendor early memcpy, unrelated) and 3 minor-diffs all explained (SWSTAT toggle, SCRAMBLE→OTP off-by-one, `fn_8b40` extra polls)
Next-session quick-start
cd ~/projects/AMPere/benchmark && make verify # expects 3173/3173 green
If green, pick task #198 or any pending. Task #198 investigates the
1-bit tp[0x4f] divergence in fn_de40's install trial — details in
the internal task board.
Observations
“Markus' insistence on simulation before flashing paid off. Big time. Again.” — 2026-04-21.
The tripwire + PC-bucketed diff caught 3 silent NULL-derefs that were
hiding under mmio_diff 3173/3173 green. `ld --unresolved-symbols=ignore-all`
resolved undefined DATA_SYMS externs into page 0x0; the emulator happily
returned 0 for reads there, masking the bug in write-sequence equality.
Silicon would have bricked.
mmio_diff was the gate we trusted. The gate was passing. The simulator layer — with a tripwire-style per-access capture, not just write-order comparison — is not optional, even late in a campaign that feels “done”.
