Drafted overnight 2026-04-16 by two independent AI agents (Sonnet 4.6 + Opus 4.7) given identical context. Both rank top-3 suspects differently; the overlap is the high-confidence signal, the disagreement is where you'll learn the most tomorrow. This page exists because the user wanted something readable from their phone during commute.
  * dclk_vp2 clock parent / CRU mux is a top-3 suspect in both strategies. The first test of tomorrow's session should be ''cat /sys/kernel/debug/clk/clk_summary | grep dclk_vp2'' on the running kernel: it costs 30 seconds and is potentially decisive.
  * vop2trace.ko only hooks regmap_write; both agents independently flag this as the biggest blind spot. Extend it to regmap_update_bits_base + clk_set_parent + arm_smccc_smc before the next deep dive.

| | Sonnet's #1 bet | Opus's #1 bet |
|---|---|---|
| Suspect | dclk_vp2 wrong CRU mux parent | HDPTX PHY not producing a recovered pixel clock (partially owned by BL31 via SMC) |
| Why | Register diff is byte-identical everywhere EXCEPT the mux trees that don't appear in MMIO reads; classic “frequency right, source wrong” | /dev/mem zeros at 0xFED70000 suggest secure-world / syscon indirection; BL31 may own PHY init |
| First experiment | clk_summary diff, then mw.l the right CRU mux word from u-boot console for zero-cost test | Upgrade vop2trace.ko to catch regmap_update_bits + arm_smccc_smc, capture full PHY init trace |
| Time cost | ~30 minutes | ~2–3 hours |
Resolution strategy: they're not actually incompatible. Sonnet's path is cheap-and-definitive if the mux IS wrong; if it isn't, Opus's deeper instrumentation becomes necessary. Run Sonnet's test first, escalate to Opus's plan if it comes back matching.
Appended after tonight's session: resolution of the open questions from the top of this page.
Sessions 1–4 were heuristic register matching. Tonight was empirical disambiguation — targeted experiments that either closed a hypothesis or reopened it more sharply.
drive the panel correctly: DP TX, PHY, cable, panel, and backlight are all healthy.
  * **Does the panel see our stream?** Closed, yes. DPCD ''SINK_STATUS'' (0x205) reads ''IN_SYNC'' after commit. Link trains at HBR×2 without errors.
  * **Alpha=0 stripe-paint bug.** Closed, fixed in v10. Vendor clusters blend incoming pixels against ''VP2_DSP_BG=black'' with the source alpha; alpha-0 means every pixel was being multiplied to black before ever hitting the DSP_IF. A real content bug, unrelated to VOP2 vs DP.
  * **''dclk_vop2'' parent selection.** Closed, last concrete u-boot side-win of the night. Rate was ~136 MHz instead of the kernel's 147.69 MHz; fixed by pre-selecting ''V0PLL'' before ''clk_set_rate()'' so the u-boot clock driver takes the retune path rather than the nearest matching divider on the default ancestor.
  * **Vendor u-boot as eDP-logo reference.** Closed, negative. Built vendor ''coolpi-loader'' from source (branch ''linux-6.1-stan'', on new CT ''ranke''). Factory ''genbook_spi.img'' boots the kernel cleanly but shows no logo on eDP or HDMI either. The Rockchip wiki explicitly states ''rockchip_show_logo()'' is Android-only, not implemented for Linux. The vendor-knows-how assumption is dead.
bug lives. Not concrete without more instrumentation: we know the fault is upstream of DP TX but cannot yet pinpoint whether it is the VOP2 pixel-output chain, the content format handed to the eDP controller, or something in between.
  * The original TL;DR framing here (dclk_vp2 mux parent, HDPTX PHY owned by BL31) has been overtaken by BIST+IN_SYNC results: the PHY is producing valid output, so those hypotheses no longer match the data. Keep them for the archive but do not bet on them tomorrow.
The conclusion is that eDP-logo-in-u-boot is not cleanly solvable without
a bigger investment. See Project Bin for tonight's full notes,
the vendor-u-boot detour on CT171 ranke, and tomorrow's open list
(idblock extract, coolpi_rk3588_gbook_nor_upgrade.img test, or
accepting that pixels-in-u-boot is not a today problem).
For: upstreamable u-boot patch series. No register soup. No cargo-culting.
Bet #1, the dclk_vp2 pixel clock source, is the most likely culprit, and it's structurally invisible to register diffing. The symptom (every VOP2 and eDP register matches, link training passes, panel backlight is on, but no pixels) is classically consistent with “wrong pixel clock source.”
For eDP on RK3588, the pixel clock for VP2 (dclk_vp2) must come from the HDPTX PHY's recovered/divided clock output, not from a CRU PLL. The CRU has a mux for this, and if u-boot's clk_set_rate(dclk_vp2, 147.84 MHz) walks the wrong mux tree, it configures a PLL-derived clock at the right frequency but the VOP2 and eDP TX are running on unrelated oscillators. They're never synchronized. No pixels. The registers all “look right” because frequency ≠ source.
The kernel's phy-rockchip-samsung-hdptx driver does a two-step: it programs the PHY PLL to produce the desired pixel rate, then reconfigures the CRU dclk_vp2 mux to select clk_hdptx1_pixel_io as the parent before enabling the VOP. u-boot almost certainly skips the mux reassignment.
Why u-boot misses it: u-boot's CCF support for RK3588 is partial. Many mux nodes are present but “fixed” or default to the PLL path. The clk_set_rate call may succeed (hitting a PLL), return the right rate, and never touch the mux.
The VO1_GRF (0xFD5AC000) contains mux bits that route VP2's parallel output bus to the eDP1 TX data lines rather than HDMI/other sinks. If these aren't written, the VOP2 scans out into a dead bus while eDP is receiving nothing. Your register diffing confirmed you haven't touched VO1_GRF at all in u-boot. The kernel writes it during rockchip_vop2_bind() setup, and it's write-once-per-boot.
This one is slightly lower confidence than the clock because link training passing suggests the PHY did get initialized somehow — but DP link training succeeds purely over AUX channel which is independent of the video data path. You can have a fully trained DP link and zero video pixels if the parallel bus from VOP2 is unrouted.
The /dev/mem zeros at 0xFED70000 are suspicious and need resolving before you can rule this out. The HDPTX combo PHY has a large internal state machine; “link training passes” proves the AUX channel works but AUX goes through a separate path from the high-speed TMDS/DP serial lanes. The zeros could mean: (a) the registers are banked/indirect and /dev/mem isn't hitting the real state, or (b) the kernel has handed the PHY to a power domain and it's invisible to user-space MMIO, or (c) u-boot's PHY init is genuinely incomplete.
The reason this is #3 rather than #1: if the PHY high-speed lanes were truly dead, you'd expect link training to fail or DPCD to be unreadable. Since both work, the main PHY init probably ran. But there may be a pixel clock enable step or lane swap configuration that's separate from link training init.
What to do: On the running kernel (display working), read the clock tree for dclk_vp2:
  cat /sys/kernel/debug/clk/dclk_vp2/clk_parent
  # or
  cat /sys/kernel/debug/clk/clk_summary | grep -A3 dclk_vp2
Also capture the full CRU register block for the VP2 clock mux word while display is live:
  devmem 0xFD7C0180 32   # adjust offset per TRM: CRU_CLKSEL_CON for dclk_vp2
  # read surrounding 16 registers to catch the mux bank
Then: Boot with u-boot's eDP init, break into the u-boot shell, and use clk dump (looking for dclk_vp2) or read the same CRU register via md.l 0xFD7C0180 0x10. u-boot parses the count as hex, so 0x10 is the 16 registers mentioned above.
Confirms Bet #1 if: The kernel shows parent: clk_hdptx1_pixel_io (or similar HDPTX-sourced name), and u-boot shows parent: vpll or cpll or any CRU-internal PLL. That's the bug. Write a 4-line CRU mux fixup in your eDP probe function, re-flash, done.
Rejects Bet #1 if: Both show the same parent name.
What to do: While kernel display is running, dump the VO1_GRF region:
  # 32 words (0x00..0x7c), the same window as md.l ... 0x20 in u-boot
  for offset in $(seq 0 4 124); do
      printf "VO1_GRF+0x%03x = 0x%08x\n" "$offset" "$(devmem $((0xFD5AC000 + offset)) 32)"
  done
Save this. Boot into u-boot, dump the same region via md.l 0xFD5AC000 0x20.
Then: Diff the two. Any bit that's 0 in u-boot and non-zero in the kernel is a candidate missing write.
Cross-reference against TRM Part 2 “VO1_GRF” register map to identify which bits are vp2_dsp_if_mux vs. other noise.
Confirms Bet #2 if: There's a mux control register that the kernel sets to route VP2→eDP1 and u-boot leaves at reset value (typically 0 = routed to HDMI or first default sink).
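The diff step can be mechanised once both dumps are in hand. The following is a minimal sketch, assuming the two dumps have already been parsed into arrays of 32-bit words covering the same window; the function name and the way the dumps get loaded are illustrative, not something from the existing tooling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Compare two dumps of the same register window. For each word, report
 * the bits that are set in the kernel dump but clear in the u-boot dump
 * (the "candidate missing write" bits). Returns the number of words that
 * contain at least one such bit.
 */
static size_t diff_missing_bits(const uint32_t *kernel, const uint32_t *uboot,
                                size_t nwords, uint32_t base)
{
        size_t hits = 0;

        assert(kernel != NULL && uboot != NULL);

        for (size_t i = 0; i < nwords; i++) {
                uint32_t missing = kernel[i] & ~uboot[i];

                if (missing) {
                        printf("0x%08x: kernel=0x%08x uboot=0x%08x missing=0x%08x\n",
                               (unsigned)(base + i * 4), (unsigned)kernel[i],
                               (unsigned)uboot[i], (unsigned)missing);
                        hits++;
                }
        }
        return hits;
}
```

Calling it with the 32-word VO1_GRF window and base 0xFD5AC000 prints one line per suspicious register, which is a shorter list to check against the TRM than a raw side-by-side diff.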
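Your existing vop2trace.ko hooks regmap_write. Extend it to also hook regmap_update_bits_base (regmap_update_bits is an inline wrapper around it, so the _base symbol is the one a kprobe can attach to). This is what you're missing for the HDPTX PHY driver.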
  /*
   * In vop2trace.ko, add as the kretprobe entry_handler for
   * regmap_update_bits_base(). At function entry on arm64, x0..x3 still
   * hold the first four arguments.
   */
  static int handler_update_bits(struct kretprobe_instance *ri,
                                 struct pt_regs *regs)
  {
          /* arg0 = regmap*, arg1 = reg offset, arg2 = mask, arg3 = val */
          unsigned int reg  = (unsigned int)regs->regs[1];
          unsigned int mask = (unsigned int)regs->regs[2];
          unsigned int val  = (unsigned int)regs->regs[3];

          pr_info("regmap_update_bits_base: reg=0x%x mask=0x%x val=0x%x\n",
                  reg, mask, val);
          return 0;
  }
Also: use ftrace to capture the full call sequence for phy-rockchip-samsung-hdptx:
  echo 'phy_rockchip_samsung_hdptx*' > /sys/kernel/debug/tracing/set_ftrace_filter
  echo function > /sys/kernel/debug/tracing/current_tracer
  echo 1 > /sys/kernel/debug/tracing/tracing_on
  # trigger a display modeset (blank/unblank or dpms cycle)
  echo 0 > /sys/kernel/debug/tracing/tracing_on
  cat /sys/kernel/debug/tracing/trace > /tmp/hdptx_trace.txt
This gives you the exact function call sequence the kernel uses during PHY init. Map that against what u-boot calls.
Before flashing a full fix, test your dclk_vp2 mux theory cheaply: add a u-boot command that writes the CRU mux select bits via mw.l before the VOP enable sequence runs. No recompile needed for the first test — just type it in the u-boot console. If pixels appear, you have your answer in 2 minutes.
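One detail matters when composing the word for mw.l: Rockchip CRU select registers use a "hiword mask", where bits [31:16] are write enables for bits [15:0]. The following is a minimal sketch of building such a word and reading the field back. The helper names are made up for illustration, and the shift, width, and selector value for the dclk_vp2 mux are assumptions that must come from the TRM.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a hiword-mask write value: the field's mask goes into the upper
 * 16 bits (write enable), the new selector into the lower 16 bits. Bits
 * whose enable is 0 are left untouched by the hardware, so no
 * read-modify-write is needed.
 *
 * shift/width describe the mux field; real values come from the TRM.
 */
static uint32_t cru_hiword_update(unsigned int shift, unsigned int width,
                                  uint32_t sel)
{
        uint32_t mask;

        assert(width >= 1 && shift + width <= 16);
        mask = (1u << width) - 1u;
        assert(sel <= mask);

        return (mask << (shift + 16)) | (sel << shift);
}

/* Extract the same field from a value read back with md.l or devmem. */
static uint32_t cru_field_get(uint32_t reg, unsigned int shift,
                              unsigned int width)
{
        assert(width >= 1 && shift + width <= 16);
        return (reg >> shift) & ((1u << width) - 1u);
}
```

For a hypothetical 2-bit mux at bits [7:6], selecting input 3 gives 0x00C000C0, which is the value to pass to mw.l at the select register's address.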
After u-boot's eDP init, read DPCD register 0x00205 (DP_SINK_STATUS). Bit 0 = port 0 in sync, bit 1 = port 1 in sync. If the panel's DP receiver actually saw valid video symbols on the lanes, these bits will be set. If they're clear, the panel received nothing usable on the high-speed lanes despite link training passing.
Read it via AUX in u-boot:
  analogix_dp_read_byte_from_dpcd(dp, DP_SINK_STATUS, &sink_status);
  printf("DP_SINK_STATUS=0x%02x\n", sink_status);
This is a zero-cost probe that needs no test equipment attached to the hardware and definitively answers “did the panel see video data.”
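Decoding the byte is trivial, but spelling it out avoids misreading the hex value on the console. The following is a small sketch, assuming the raw byte has already been read over AUX; the struct and function names are illustrative and not part of the u-boot driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* DPCD SINK_STATUS bit layout: one "in sync" flag per receive port. */
#define SINK_STATUS_PORT0_IN_SYNC (1u << 0)
#define SINK_STATUS_PORT1_IN_SYNC (1u << 1)

struct sink_sync {
        bool port0;     /* receive port 0 sees a valid stream */
        bool port1;     /* receive port 1 sees a valid stream */
};

/* Split the raw SINK_STATUS byte into its two in-sync flags. */
static void decode_sink_status(uint8_t raw, struct sink_sync *out)
{
        assert(out != NULL);
        out->port0 = (raw & SINK_STATUS_PORT0_IN_SYNC) != 0;
        out->port1 = (raw & SINK_STATUS_PORT1_IN_SYNC) != 0;
}
```

A raw value of 0x01 means port 0 is in sync; 0x00 after a commit means the panel saw no usable video stream even though training passed.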
Load a kernel module that writes a known-wrong value to the CRU dclk_vp2 parent mux while the display is running. If the display dies immediately, you've confirmed the mux is load-bearing live. More importantly: write the value that matches what u-boot currently writes. If that breaks the kernel display, that's your bug replicated in software with zero ambiguity.
  /* kill-dclk.ko: write the reset/PLL-default mux value */
  iowrite32(WRONG_MUX_VAL, cru_base + CLK_SEL_CON_FOR_DCLK_VP2);
Before any of the above, try the cheap software version:
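  echo "cpll" > /sys/kernel/debug/clk/dclk_vp2/clk_parent   # or whatever PLL name
If the display dies, reparent it back. This tells you whether the parent is actually meaningful without writing a single line of C. Note that clk_parent is read-only on a stock kernel: writing it requires a kernel built with CLOCK_ALLOW_WRITE_DEBUGFS defined in drivers/clk/clk.c, and even then the write handler may expect a parent index rather than a name, so check clk.c in your tree first.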
Check whether the kernel makes any ROCKCHIP_SIP_* SMC calls during eDP bring-up that u-boot doesn't. On the running kernel:
  # Check if BL31 mediates any display init
  dmesg | grep -i "sip\|bl31\|atf\|smc\|psci" | grep -i "vop\|edp\|hdptx\|disp"
Also check /sys/kernel/debug/rockchip_sip/ if it exists. u-boot's BL31 interface is minimal; if the kernel depends on a BL31 call to configure something (clock, power domain, memory bandwidth reservation), u-boot might silently succeed without it.
Honest assessment: Full u-boot simulation of this specific path is expensive. QEMU's RK3588 board model doesn't exist in mainline and building one is a multi-week project.
What's actually practical:
  * rk3588_edp_enable() sequence into a standalone C program that uses a mmap'd anonymous buffer instead of MMIO. Run it, then compare the buffer contents (as if they were registers) against the expected kernel values. This catches logic errors (wrong offset, wrong shift, wrong mask) without hardware. Cost: ~4 hours. Benefit: catches 80% of “wrote the wrong register” bugs without a flash cycle.
  * /dev/mem on a freshly-booted system before the kernel has had a chance to configure anything. If that makes u-boot's framebuffer appear, you've proven the sequence is sufficient. This is a creative but legitimate way to test “is the kernel sequence necessary and sufficient” without writing driver code.

Flash cycles are 30-60s, which is fast enough that the user-space mock is borderline worth it. My recommendation: skip the mock and use the ftrace replay approach. It produces a reproducible script, which is also a useful artifact for the upstream patch description.
In rough order of likelihood that u-boot is missing it:
1. clk_set_parent(dclk_vp2, hdptx_pixel_clk) after PHY PLL lock. u-boot's CCF clk_set_rate call likely takes the default PLL path and never touches the mux.
2. regmap_write(vo1_grf, VO1_GRF_DP_DSP_IF_MUX, …) during rockchip_vop2_bind. Kernel has GRF as a syscon regmap; u-boot has to do this explicitly.
3. phy_power_on() vs. phy_configure() distinction; u-boot may call configure but not the final pixel-clock-output-enable step.
4. pm_runtime_get_sync() with real delays baked into the power domain driver. u-boot's power domain enable is faster and may not respect the settling time in the HDPTX PHY datasheet.
5. pinctrl_select_state(dev, “active”) for eDP data pins. u-boot pinctrl is minimal; if the kernel sets DP-specific pin functions that differ from reset defaults, u-boot gets reset-default pin functions which may be wrong.

These are debugging tools only (don't go upstream):
These produce upstream-safe artifacts:
  * rk3588_edp_enable() or the clock driver, with a comment citing TRM section and explaining why eDP requires HDPTX-sourced pixel clock
  * if (endpoint == VOP2_EP_EDP1)
  * usleep_range() calls matching the panel datasheet minimum settling times

For the upstream patch description: the ftrace replay is gold. You can say “captured kernel MMIO sequence via ftrace, identified missing CRU mux write at offset X, verified by replicating in isolation”. That's a credible, reviewable rationale.
1. ''cat /sys/kernel/debug/clk/clk_summary | grep dclk_vp2''. If parent is HDPTX-sourced, you found Bet #1. Immediately test the mux fix via mw.l in u-boot console before recompiling anything.
2. regmap_update_bits. Trigger a display modeset cycle and capture the full HDPTX PHY + VOP2 write sequence. Cross-reference with what u-boot does.

I'd start with step 1. The clock parent check costs 30 seconds of typing and if it's wrong, everything else is moot. The “every register matches but no pixels” symptom combined with a PHY-derived clock requirement is so classically clock-domain mismatch that I'd bet a round of drinks on it.
Written 2026-04-16. Strategy assumes kernel 6.x+ on custom rk3588-marfrit tree.
You've done the obvious work. Registers match. Link trains. Backlight glows. Something the kernel does between regmap_update_bits() calls — or before your trace window even opens — is missing. Here's where I'd put my money.
My top three, ranked by “worth burning tomorrow on”:
Probability: high. You see register-byte-identity between u-boot and kernel, and link training succeeds — that's the AUX channel (side-band, ~1 Mbps, doesn't need the main link clock). The main link symbols are what carry pixels. If HDPTX's TX PLL isn't locked, or dclk_vp2 is still being fed from a CRU PLL instead of the PHY's link_clk output, VP2 scans out into a void — AXI reads happen, pixels get formatted, but the serializer at the PHY is either fed garbage or nothing. Panel sees idle symbols forever, never transitions to video stream. Backlight stays on because HPD + power sequencing are independent.
The fact that HDPTX registers at 0xFED70000 read back as mostly zeros via /dev/mem is a giant red flag. Either (i) the region is syscon/regmap-indirected (kernel writes go through a different aperture), (ii) it's clock-gated when you read (PHY APB off in idle), or (iii) there's a secure-world filter and BL31 owns it. Any of those three means u-boot's direct-register approach silently no-ops.
Probability: high, and partially overlaps with (a). Even if the HDPTX PHY is up, if dclk_vp2 is running from a CRU PLL at 147.84 MHz while VP2 hands its pixels off expecting the PHY's link_clk/N domain, you get an async FIFO underrun at the DSP_IF boundary. Symptoms: VP2 sees valid timing, shovels pixels, AXI reads RAM, but downstream the eDP MAC never sees a valid SDP stream. No pixels.
This is a classic u-boot CCF gap: the driver calls clk_set_rate(dclk_vp2, X) and trusts the framework, but the mux-parent reassignment that the kernel does via clk_set_parent() in rockchip_phy_ops.init() isn't modeled.
Probability: medium. VO1_GRF at 0xFD5AC000 has ~16 bits that determine whether HDPTX1 drives DP0 or DP1, which VP maps to which DSP_IF, and the lane swap. If u-boot's BL31 handoff left these in a maskrom-default state (or ATF wrote a different routing for HDMI testing), the eDP symbols are being emitted from the PHY onto the wrong lane pair. Training passes because AUX is its own pair. Main link goes to unconnected pads. Panel never locks video.
Our vop2trace.ko only hooks regmap_write. Upgrade it now:
  * regmap_update_bits_base, regmap_bulk_write, regmap_multi_reg_write, regmap_noinc_write, and rockchip_grf_write.
  * clk_set_rate, clk_set_parent, clk_prepare_enable, clk_core_set_rate_nolock: we need the entire clock tree mutation timeline, not just register touches.
  * arm_smccc_smc to catch any SMC calls the PHY driver or CRU driver makes to BL31. Log: caller PC (_RET_IP_), x0..x7.
Measure: Boot kernel with display disabled at DTS level, then load analogix_dp + phy-rockchip-samsung-hdptx at runtime with our trace armed. We capture every write that brings the hardware from cold-dead to driving-SDDM.
Confirms: If we see arm_smccc_smc(FUNCID=RK_SIP_ACCESS_REG, …) writing to 0xFED70000 range — bingo, PHY is secure-world. If we see clk_set_parent(dclk_vp2, hdptx1_link_clk) — that's our clock bug (b). If we see massive 200+ RMW sequences on HDPTX during PHY init — we're under-initializing the PHY and regmap_write only logged a fraction of real traffic.
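When reading the captured x0 values, it helps to split each function ID into its SMC Calling Convention fields so that SiP (vendor) calls stand out from PSCI and other standard services. The following is a minimal sketch using the standard SMCCC bit layout; the struct and function names are illustrative, and mapping a SiP function number to a specific Rockchip call still needs the vendor headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SMCCC "owning entity" number for silicon-partner (SiP) services. */
#define SMCCC_OWNER_SIP 2u

struct smccc_fid {
        bool fast;              /* bit 31: fast call (vs yielding call) */
        bool smc64;             /* bit 30: SMC64 calling convention */
        unsigned int owner;     /* bits 29:24: owning entity */
        unsigned int func;      /* bits 15:0: function number */
};

/* Split a 32-bit SMC function ID (the x0 argument) into its fields. */
static void smccc_decode(uint32_t fid, struct smccc_fid *out)
{
        assert(out != NULL);
        out->fast  = ((fid >> 31) & 1u) != 0;
        out->smc64 = ((fid >> 30) & 1u) != 0;
        out->owner = (fid >> 24) & 0x3fu;
        out->func  = fid & 0xffffu;
}
```

For example, the 0x82000012 value quoted elsewhere on this page decodes as a fast SMC32 call owned by the SiP range with function number 0x12, so any trace line with owner 2 during PHY bring-up is worth a closer look.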
In Linux: ''cat /sys/kernel/debug/clk/dclk_vp2/clk_parent'' and walk the tree upward, or run ''cat /sys/kernel/debug/clk/clk_summary | grep -E "dclk_vp2|hdptx|link"''. Compare to u-boot's clock driver default parent.
Confirms: If kernel shows hdptx1_phy_pll_link_clk as an ancestor and u-boot has gpll or v0pll, that's hypothesis (b) confirmed without any further work.
Patch the kernel (live kprobe-based write-suppression) to skip specific register writes during display init, one subsystem at a time:
  * clk_set_parent for dclk_vp2 → still up?

Measure: Each suppressed class that breaks the kernel tells us “u-boot must do this too.” Each that doesn't break tells us “safe to ignore.” Converts “what must we port?” from guesswork to data. Do this with HPD already up so we're only testing video-path writes, not power sequencing.
From u-boot, after you think you've committed, poll DPCD register 0x200 (SINK_COUNT) and 0x204 (LANE_ALIGN_STATUS_UPDATED) via AUX. Then, more telling, read DPCD 0x201 (DEVICE_SERVICE_IRQ_VECTOR) and check for CP_IRQ or AUTOMATED_TEST_REQUEST. Also poll 0x202/0x203 (LANE0_1_STATUS/LANE2_3_STATUS) for symbol-lock per lane *during* video transmission, not just post-training.
Confirms: If lanes show symbol-lock lost after training completes and MSA should be flowing, the PHY is emitting invalid symbols: hypothesis (a) or (c). If lanes stay locked, the pixel pipeline upstream of the PHY is the problem, not the PHY itself. This is a binary search across the chain without a logic analyzer.
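The per-lane check can be sketched as follows, assuming the two lane-status bytes (DPCD 0x202 and 0x203 in the DisplayPort register map, four status bits per lane) have been read over AUX into a two-byte array. The helper names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-lane status bits inside each 4-bit nibble. */
#define LANE_CR_DONE         (1u << 0)
#define LANE_CHANNEL_EQ_DONE (1u << 1)
#define LANE_SYMBOL_LOCKED   (1u << 2)

/*
 * status[0] covers lanes 0 and 1, status[1] covers lanes 2 and 3;
 * the even lane sits in the low nibble, the odd lane in the high nibble.
 */
static uint8_t lane_status(const uint8_t status[2], unsigned int lane)
{
        assert(lane < 4);
        return (uint8_t)((status[lane >> 1] >> ((lane & 1u) * 4u)) & 0x0fu);
}

/* True only if every active lane still reports symbol lock. */
static bool all_lanes_symbol_locked(const uint8_t status[2],
                                    unsigned int lane_count)
{
        assert(lane_count >= 1 && lane_count <= 4);
        for (unsigned int lane = 0; lane < lane_count; lane++) {
                if (!(lane_status(status, lane) & LANE_SYMBOL_LOCKED))
                        return false;
        }
        return true;
}
```

Polling this in a loop while the VOP is scanning out turns the lock-lost versus lock-held question into a single boolean per sample.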
From the kernel, once up, run devmem2 0xef700000 w to read what u-boot left in the framebuffer. Is it actually the vidconsole glyph pattern, or is it zeros / garbage? If glyphs are there, the AXI reader path and VP2 are not the problem; VP2 is consuming the buffer fine. If the buffer is blank, VP2 isn't even reaching AXI: a different failure mode, probably the PD_VOP power domain.
  * phy_init_done, link_train_complete, commit_done). Free low-bandwidth debug channel. Zero instrumentation cost. Side-steps the “is u-boot even getting there” question.
  * ftrace the arm_smccc_* calls during phy-rockchip-samsung-hdptx probe. If BL31 owns PHY registers, a u-boot PHY driver that bangs MMIO directly is writing into /dev/null and the kernel's init via SMC is what actually programs it.
  * echo N > /sys/module/phy_rockchip_samsung_hdptx/parameters/… if any module params exist, or live-patch the kernel's HDPTX driver to skip its post-link-training PHY PLL re-lock sequence. If skipping it breaks the kernel's display, u-boot is missing that exact sequence.
  * clk_summary diff: Dump it once when kernel has display up, dump it in u-boot (we can print CRU register state from u-boot), diff. Every row that differs is a candidate gap. More productive than comparing MMIO registers one-by-one.
  * TEST_PATTERN): if the panel supports it, ask the panel to display a solid color via DPCD. If that works, panel + link are fine and the entire fault is upstream of the eDP MAC. Very surgical.
  * probe(), one after each major init step. Latency skew between kernel-first-working and u-boot-seems-done tells us whether we're racing a slow PLL lock.
Yes, and cheap: user-space mock-MMIO harness. Take the u-boot driver files verbatim. Provide shims for readl/writel/regmap_*/clk_* that log to stdout and maintain an in-memory shadow of register state. Drive the probe from a main(). Cost: ~4 hours of shim plumbing. Payoff: you can run the driver under valgrind, gdb, rr, bisect logic bugs at C-level speed, and diff the shadow against our Linux-side capture. This catches every “I forgot to set bit 3 in OVL_CTRL” class of bug before you ever flash.
Not worth it: full QEMU with fake PHY. The RK3588 QEMU model doesn't exist in any useful form for VOP2, and writing one is a side-quest bigger than the original project.
Recommended simulation scope: just MMIO + clk tree + regmap. Skip PHY modeling. The harness only needs to validate *“did the driver attempt the right register sequence”*, not *“did the pixels come out”*. That's the logical half of the bug, and the cheap half to catch.
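The core of such a harness is small. The following is a minimal sketch of the shadow-register shim, assuming the driver's readl/writel can be redirected to these functions with macros or link-time wrapping; all names here are hypothetical, and the shadow models plain storage only (no hiword masks, no side effects, no PHY).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MOCK_WORDS 1024u        /* 4 KiB register window */

struct mock_mmio {
        uint32_t base;                  /* bus address of the window */
        uint32_t shadow[MOCK_WORDS];    /* last value written per register */
        unsigned int writes;            /* number of writes seen */
};

static void mock_init(struct mock_mmio *m, uint32_t base)
{
        memset(m, 0, sizeof(*m));
        m->base = base;
}

/* Map a bus address to its shadow slot; trap unaligned or stray accesses. */
static uint32_t *mock_slot(struct mock_mmio *m, uint32_t addr)
{
        uint32_t off = addr - m->base;

        assert(addr >= m->base && (off % 4u) == 0 && (off / 4u) < MOCK_WORDS);
        return &m->shadow[off / 4u];
}

/* Stand-in for writel(): log the access and update the shadow copy. */
static void mock_writel(struct mock_mmio *m, uint32_t val, uint32_t addr)
{
        *mock_slot(m, addr) = val;
        m->writes++;
        printf("W 0x%08x <- 0x%08x\n", (unsigned)addr, (unsigned)val);
}

/* Stand-in for readl(): return whatever was last written (0 if never). */
static uint32_t mock_readl(struct mock_mmio *m, uint32_t addr)
{
        return *mock_slot(m, addr);
}
```

The printed write log can be diffed against the Linux-side trace, and the final shadow contents against a kernel register dump; the assert in mock_slot catches wrong-offset bugs the moment they happen.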
  * clk_set_parent(): CCF magic the kernel does that u-boot's clock driver may not model.
  * regulator-boot-on + startup-delay-us properties. u-boot typically just enables GPIOs; kernel honors the delays.
  * /sys/kernel/debug/pm_genpd/pm_genpd_summary on the running kernel.
  * SET_POWER) with full D0 transition + 20ms delay. u-boot often shortcuts this.

Two separate hats:
  * Documentation/rockchip-vop2-debug.md at best.
  * rockchip,secure-regs DT binding + SMC helper, not a hardcoded smc #0x82000012 with no binding. If we find a missing clock parent change, the fix goes in the Rockchip CRU driver's clock-mux tables, not as a hack in the VOP2 driver. Keep the architectural layer discipline; maintainers will reject “kitchen sink in vop2.c.”
Mock harness is particularly valuable as upstream artifact: package it as a unit test under test/dm/vop2_mmio.c. First u-boot VOP2 unit test. Maintainer will love it.
1. clk_summary diff (kernel live vs u-boot state). Cheapest info/effort ratio on the table. This alone may crack hypothesis (b).
2. vop2trace.ko with regmap RMW + clk_set_parent + arm_smccc_smc hooks. Capture clean kernel-brings-display-up trace. This is your ground truth.
3. regmap_write calls with address in 0xFED70000..0xFED80000. If display survives, PHY is owned by BL31 and we've been writing to void. If display breaks, we need every bit.

I'd start with clk_summary. Not because it's the most likely answer, but because fifteen minutes of work either eliminates or confirms an entire hypothesis branch, and you go into the hard work with less unknown. The HDPTX-secure-world question is bigger but requires the extended kprobe module first, which is the afternoon's work.
If clk_summary shows HDPTX is in the tree and dclk_vp2 parent is right, drop straight into hypothesis (a) via the extended trace module. The PHY-via-SMC question is, I think, your actual bug. But prove it, don't assume it.