bin_42c3_timeline

Differences

bin_42c3_timeline [2026/04/18 21:48] – created via HIS marfrit
bin_42c3_timeline [2026/04/18 21:54] (current) – external edit 127.0.0.1
Line 1: Line 1:
 +====== Bin — 42C3 timeline ======
 +
 +Narrative session-by-session timeline of the Bin campaign, shaped for a 42C3 talk.  Companion to ''bin'' (the live technical page) and ''bin_debug_strategy'' (the strategy dossier).
 +
 +===== Title candidate =====
 +
 +"Teaching a Rockchip laptop to draw pixels — or how we caught the kernel cfg_done dance red-handed with a 2 GB debug ring-buffer."
 +
 +===== Arc (one beat per slide) =====
 +
 +==== Act I — the problem ====
 +
 +  - **The board**: CoolPi CM5 GenBook, RK3588, €200 Chinese laptop, eDP panel, mainline-hostile vendor BSP.
 +  - **Goal**: boot upstream u-boot → upstream Linux → see pixels.  Not heroic; just the baseline one should be able to expect.
 +  - **Wall**: link trains at HBR×2, backlight on, panel stays black.  Register state matches the kernel's post-modeset state byte for byte, yet no pixels.  Months of piecemeal reverse engineering from /dev/mem snapshots could not explain it.
 +
 +==== Act II — building a debug primitive ====
 +
 +  - **The ReCAP protocol**: memory files + DokuWiki + WIP branches that survive context compaction.  Preserve hypothesis state so a multi-session campaign stays coherent.
 +  - **Tripwire**: a shared 2 GB no-map DDR reservation to which u-boot and the kernel both append a (tick, caller PC, phys addr, value) record for every writel/readl.  The kernel walks the page tables to resolve virt→phys inline.  67M-record capacity, i.e. roughly 32 bytes per record.  The dumper reads the ring via /dev/mem and resolves symbols by bisecting /proc/kallsyms.
 +  - **Phase 1**: u-boot with CONFIG_BIN_PHASE1_NOINIT (zero VP2 writes) + tripwire kernel.  SDDM comes up on eDP regardless.  Kernel DRM cold-inits the whole display without u-boot help.  **Disarms hypothesis 3**: kernel is NOT dependent on u-boot half-init.
 +  - **Phase 2**: kernel-only capture via panel_edp unload/reload cycle.  77K records.  Decoded the atomic_commit sequence: VP0 and VP2 post-config blocks, two-phase cfg_done, VP output mux writes.  The reference trajectory we had been unable to get from snapshot RE.
 +  - **Phase 3**: u-boot with tripwire armed from first vop2_probe + full VP2 init.  4.3 M records (2.08 M u-boot + 2.24 M kernel).  Zero lost.  First time the u-boot side of the handover is captured at the same fidelity.
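
To make the tripwire record format and the dumper's symbol lookup concrete, here is a minimal Python sketch. The record layout (four little-endian 64-bit fields, 32 bytes in total) is an assumption chosen only because it matches the stated 2 GB / 67M-record capacity; the real layout may differ. The function names are hypothetical, and the input is expected in the usual ''address type name'' form of /proc/kallsyms.

```python
import bisect
import struct

# ASSUMPTION: each record is four little-endian u64 fields
# (tick, caller PC, physical address, value) = 32 bytes.
RECORD = struct.Struct("<QQQQ")
RING_BYTES = 2 * 1024**3  # the 2 GB no-map reservation


def capacity(ring_bytes=RING_BYTES):
    """Number of whole records that fit in the ring."""
    return ring_bytes // RECORD.size


def decode(buf):
    """Yield (tick, pc, phys, value) tuples from raw ring bytes.

    Trailing bytes that do not form a whole record are ignored.
    """
    usable = len(buf) - len(buf) % RECORD.size
    yield from RECORD.iter_unpack(buf[:usable])


def load_kallsyms(lines):
    """Parse 'address type name' lines into sorted (addrs, names) lists."""
    syms = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            syms.append((int(parts[0], 16), parts[2]))
    syms.sort()
    return [a for a, _ in syms], [n for _, n in syms]


def resolve(symtab, pc):
    """Map a caller PC to 'symbol+offset' by bisecting the sorted addresses."""
    addrs, names = symtab
    i = bisect.bisect_right(addrs, pc) - 1
    if i < 0:
        return hex(pc)  # below the first known symbol
    return f"{names[i]}+{pc - addrs[i]:#x}"
```

In use, the raw bytes would come from reading the reserved region through /dev/mem and the symbol lines from /proc/kallsyms; both normally require root, so the sketch keeps I/O out of the functions and works on plain bytes and lines.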
 +
 +==== Act III — the honest failure arc ====
 +
 +  - **Bit-level diffs found**: cfg_done (we latch VP2 alone, kernel latches VP0+VP2 two-phase); 0x0600 bit 31 set by us (STANDBY), cleared by kernel; 0x06f0 byte-lane pattern missing two lanes.
 +  - **Confirmed NOT bugs**: 0x0e3c=0x10001000 (POST_SCL_CTRL), previously flagged as cargo-cult in rkr5 review — tripwire proves kernel writes the same value.
 +  - **Phase 4**: apply the three fixes.  Panel still black.  Not the bug.
 +  - **Phase 5**: the Analogix DP core's internal PLL (PLL_REG_2..5) is never touched by our u-boot.  Port the kernel init.  The port over-reached by also writing ANALOG_CTL_1=0x3f, which powered all TX lanes down: a regression.
 +  - **Phase 6**: trim to kernel-verified writes only.  Panel still black.
 +  - **Phase 7+8**: timing hypothesis — maybe PLL lock is slow, u-boot hands over before first frame.  mdelay(10000) after cfg_done + backlight + stripe-paint.  Panel black for the full 10 s.  **Timing definitively ruled out.**
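
The bit-level comparison in this act can be sketched as a small Python helper: reduce each capture to the last value written per register, then XOR the two sides. This is only an illustration of the idea, not the campaign's actual tooling; the function names are hypothetical, and any values other than 0x0e3c=0x10001000 are made up for the example.

```python
def last_writes(records):
    """Map register offset -> last value written, in capture order."""
    final = {}
    for offset, value in records:
        final[offset] = value
    return final


def diff_bits(ours, theirs):
    """Return {offset: [bit positions]} for registers whose final values differ."""
    out = {}
    for offset in sorted(ours.keys() & theirs.keys()):
        delta = ours[offset] ^ theirs[offset]
        if delta:
            out[offset] = [b for b in range(32) if (delta >> b) & 1]
    return out


def only_in(a, b):
    """Offsets written on side a but never on side b."""
    return sorted(a.keys() - b.keys())


# Illustrative data only: (offset, value) pairs in write order.
uboot = last_writes([(0x0600, 0x80000000), (0x0E3C, 0x10001000)])
kernel = last_writes([(0x0600, 0x80000000), (0x0600, 0x00000000),
                      (0x0E3C, 0x10001000), (0x0A00, 0x00000001)])

print(diff_bits(uboot, kernel))   # registers that end in different states
print(only_in(kernel, uboot))     # registers only the kernel ever wrote
```

The second helper covers the Phase 5 kind of finding, where a register is missing from one side entirely rather than differing in value.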
 +
 +==== Act IV — the new frontier ====
 +
 +  - **The revelation**: kernel dmesg shows 8× POST_BUF_EMPTY irq err at vp2 on FIRST modeset attempt, then a SECOND attempt succeeds and panel comes alive.  Alongside: [drm] Missing drm_bridge_add() before attach.
 +  - **Reframe**: the bug is likely not in the register values but in **bridge registration ordering**.  The Analogix DP bridge attach happens before the bridge is registered.  The kernel retries; u-boot cannot (no bridge framework).
 +  - **Transferable wins**: tripwire as a generic primitive.  The "capture u-boot + kernel in one DDR ring across handover" pattern works for any arm64 SoC.  Upstreamable tool.
 +  - **Honest outro**: campaign continues.  No cheering, no victory lap.  The bug was not what we expected, the fixes we made were correct but not curative, and the tool survives the investigation.
 +
 +===== Narrative hooks to develop =====
 +
 +  * **Cargo cult decoded** — how a single register value (0x10001000 at POST_SCL_CTRL) accreted as "probably magic, worth investigating" across three sessions, until tripwire proved kernel writes it verbatim.
 +  * **The backlight tell** — Phase 7 had the backlight off during the entire timing-test window, making the result unreadable.  One-slide cautionary tale.
 +  * **Who cleared the buffer?** — "brown text flash" was plymouth painting.  Our stripe paint happened but was overwritten before anyone could see.  Invisible work is still work.
 +  * **Hypothesis ladder** — diagram of every theory disarmed across the campaign (warm-PHY bisection, DSP_LUT_EN bit 28, Cluster1 offsets that were a Python bug, IOMMU-inherited-state, POST_SCL_CTRL cargo cult, timing, bit-level register diffs).
 +
 +===== Submission =====
 +
 +  * 42C3 CFP deadline typically mid-September.
 +  * Target track: Hardware and Making, or Security.
 +  * Duration: 45 min + 15 min Q&A.
 +  * Format: slides + live demo of tripwire dumping a capture from ampere.
 +  * Co-pilot credit: sibling Claudes across the sessions.  Flag explicitly — the honest failure arc **is** the ReCAP story.