fourier_attribution — Phase 5 review (2026-05-03)
Single-question matrix campaign. Operator: mfritsche. Reviewer: handover to fresh model instance per 8(+1)-phase loop Phase 5. Local artifacts: /home/mfritsche/src/fourier_attribution/ on noether.
Predecessors (substrate, NOT data): ohm_gl_fix (closed 2026-05-02), kwin_overlay_subsurface (closed 2026-05-03 without patch), x11-session-research (closed 2026-05-03 negative on both axes). This campaign acts on the replicate-baseline-first lesson — every binding cell number from in-session reps.
README (campaign overview, uncurated)
fourier_attribution
Single-question campaign on PineTab2 RK3568: for each of qt6-fourier, kwin-fourier, and chromium-fourier, is there a measurable, replicable benefit on bbb 1080p first-60s playback when that package is toggled off, vs the all-on baseline?
Spun off 2026-05-03 from the fourier umbrella, with strict scope discipline. The earlier fourier work (now dormant at ~/src/fourier/) digressed into fix-this-fix-that across browsers, mpv, libva, libplacebo, KWin, qt6, and kernel; this campaign separates wheat from chaff by attribution measurement under controlled package toggling.
In-scope
- Hardware: ohm (PineTab2 — RK3568, Mali-G52 MP2, hantro G1/G2 VPU)
- Workload: `bbb_1080p30_h264.mp4` first 70 seconds via `brave_drops_test.html`
- Three Arch packages: `qt6-base-fourier`, `kwin-fourier`, `chromium-fourier`
(chromium-fourier in this campaign = the binary at /tmp/chromium-ohm-gl-fix-step2/chrome — ohm_gl_fix Step 1 + Step 2 patches)
- Single Wayland session — Plasma 6.6.4 with KWin compositor
Out-of-scope (would have qualified before, scope-policed now)
- Any other fourier-flavoured patches (firefox-fourier, kernel-vb2-dma-resv, libva-v4l2-request port)
- Any patch authoring or upstream activity
- X11 sessions / non-compositing WMs (closed in `x11-session-research`)
- Δ_present-46ms side-finding (predecessor's hook; not the question here)
- mpv, gst-launch, or any non-browser client
- Other clips, other resolutions, other codecs
Predecessors (substrate, NOT data)
This is the fourth campaign in a chain on the same hardware target:
- [`../ohm_gl_fix/`](../ohm_gl_fix/) — closed 2026-05-02. Diagnosed Step 1 (libva-v4l2-request port) and Step 2 (Chromium WaylandConnection overlay-route) issues; produced the `/tmp/chromium-ohm-gl-fix-step2/chrome` binary used here.
- [`../kwin_overlay_subsurface/`](../kwin_overlay_subsurface/) — closed 2026-05-03 *without patch*. Premise (cage = 0 floor at N=1) didn't replicate at N=3. Lesson now codified in `feedback_replicate_baseline_first.md`.
- [`../x11-session-research/`](../x11-session-research/) — closed 2026-05-03 *with negative result on both axes*. X11 + non-compositing WM is *worse* than Plasma Wayland on this hardware/workload; X server doesn't program Plane 39 with NV12 regardless of client.
- This campaign (`fourier_attribution`) — opened 2026-05-03.
Per feedback_replicate_baseline_first.md: predecessor *measurement numbers* (drop counts, CPU%, freq) are reference history only. Every binding cell in this campaign anchors to in-session-acquired data.
Process
8(+1)-phase loop per ~/.claude/projects/-home-mfritsche-src/memory/feedback_dev_process.md. Phase 0 substrate captured 2026-05-03 across phase0_evidence/:
- `state_2026-05-03.md` — package state matches predecessor carry-over claim; revert paths clean (same-upstream-version stock packages cached)
- `test_rig_audit_2026-05-03.md` — predecessor test rig is intact; instrumentation reports per-second drops trajectory (sufficient for `drops_5s` / `drops_60s`); no Step 2 feature-flag opt-in needed (compiled in)
- `devfreq_probe_2026-05-03.md` — panfrost devfreq at `/sys/class/devfreq/fde60000.gpu/`; trans_stat + cur_freq parseable; governor `simple_ondemand` (dynamic — workload-sensitive signal)
Phase 1 binding cells locked at:
- `drops_5s` — drops in first 5s (warmup phase, file load + cache fill; reported but excluded from pass/fail)
- `drops_60s` — drops over the 60s steady-state window (5s–65s)
- `effective_fps` over 60s
- `browser_cpu_pct` — top sampling 5s–65s
- `kwin_cpu_pct` — same window
- `panfrost_mean_freq_mhz` — trans_stat-diff weighted mean over 5s–65s
- `panfrost_peak_freq_pct` — % of 60s spent at 800 MHz (Mali max)
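As a hedged illustration of the two panfrost cells, the sketch below recomputes a residency-weighted mean frequency and the share of time at 800 MHz from the per-frequency residency figures recorded in `a_rep1_summary.txt`. The function names are illustrative, not taken from `post_process_attribution.sh`. Note that the recorded values reproduce when the denominator is the full captured window (the residencies sum to `panfrost_total_window_ms=74681`), which is longer than the 5s–65s window.

```python
# Residency per frequency step (MHz -> ms), copied from a_rep1_summary.txt.
RESIDENCY_MS = {200: 1356, 300: 485, 400: 25080, 600: 19625, 700: 1897, 800: 26238}


def weighted_mean_freq_mhz(residency_ms):
    """Time-weighted mean frequency over the captured window."""
    total_ms = sum(residency_ms.values())
    return sum(freq * ms for freq, ms in residency_ms.items()) / total_ms


def peak_freq_pct(residency_ms, peak_mhz=800):
    """Percentage of the captured window spent at the peak frequency."""
    total_ms = sum(residency_ms.values())
    return 100.0 * residency_ms.get(peak_mhz, 0) / total_ms


mean_mhz = weighted_mean_freq_mhz(RESIDENCY_MS)
peak_pct = peak_freq_pct(RESIDENCY_MS)
print(f"{mean_mhz:.1f} MHz mean, {peak_pct:.1f}% at 800 MHz")
```

Both results match the `panfrost_mean_freq_mhz=596.4` and `panfrost_peak_freq_pct=35.1` values stored for that rep.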
Pass/fail threshold for “P delivers measurable benefit”: P-off cell increases drops_60s beyond all-on N=3 IQR, or any of {browser_cpu_pct, kwin_cpu_pct, panfrost_mean_freq_mhz} increases beyond its all-on N=3 IQR.
Note: this campaign uses 5s warmup boundary per operator instruction. The predecessor's phase3_prime_runs/post_process.sh uses 10s. The boundary divergence makes our drop counts not numerically comparable to predecessor numbers; that's intentional under campaign-contained data discipline.
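Why the boundary choice breaks numeric comparability can be seen in a small sketch. It assumes the instrumentation exposes a cumulative dropped-frame count sampled once per second. The trajectory below is hypothetical, not campaign data, and `split_drops` is an illustrative helper rather than the logic of either post-processing script.

```python
def split_drops(cumulative, boundary_s, end_s):
    """Split drops into (warmup, post-warmup) at boundary_s.

    cumulative[t] is assumed to be the dropped-frame count seen up to second t.
    """
    warmup = cumulative[boundary_s] - cumulative[0]
    post_warmup = cumulative[end_s] - cumulative[boundary_s]
    return warmup, post_warmup


# Hypothetical trajectory: one drop per second during the first 8 s,
# then one drop every 10 s.
trajectory = []
count = 0
for t in range(71):
    if 1 <= t <= 8:
        count += 1
    elif t > 8 and t % 10 == 0:
        count += 1
    trajectory.append(count)

with_5s_boundary = split_drops(trajectory, boundary_s=5, end_s=65)
with_10s_boundary = split_drops(trajectory, boundary_s=10, end_s=65)
print(with_5s_boundary, with_10s_boundary)
```

The same playback yields different warmup and post-warmup counts under the two boundaries, which is why drop counts are only compared within this campaign.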
Matrix
| Cell | qt6-fourier | kwin-fourier | chromium-fourier | Browser | Notes |
|---|---|---|---|---|---|
| A | on | on | on | chromium-fourier binary | baseline; no revert |
| B | on | on | off | brave-bin 1.89 | no package revert; binary swap only |
| C | off | on | on | chromium-fourier binary | revert qt6-base, logout+login |
| D | on | off | on | chromium-fourier binary | revert kwin, logout+login |
N=3 reps per cell. 12 reps total. 70s window per rep, 60s steady-state at 5s–65s.
Repository
Local only for now (private working tree). Push to gitea later if the campaign reaches a useful conclusion worth publishing.
phase0_findings.md (uncurated)
Phase 0 — fourier_attribution
This file is the umbrella reference for Phase 0 work. Detail evidence files in phase0_evidence/:
- `state_2026-05-03.md` — package + kernel + governor + session state on ohm
- `test_rig_audit_2026-05-03.md` — predecessor test rig (`brave_drops_test.html`, `run_browser_nodebug.sh`, `post_process.sh`) audit, Cell B Brave invocation finalised, Step 2 feature-flag question resolved
- `devfreq_probe_2026-05-03.md` — panfrost devfreq instrument verification
Research question (LOCKED 2026-05-03)
*“For each of qt6-fourier, kwin-fourier, and chromium-fourier on PineTab2 RK3568, is there a measurable, replicable delta on bbb 1080p first-60s playback when that package is toggled off (replaced by stock equivalent), versus the all-on baseline?”*
Mechanism the question targets
Each fourier package alters a different layer of the playback stack:
- `chromium-fourier` ships Step 1 (libva-v4l2-request hantro multi-planar / chromium-149 era port) and Step 2 (Chromium WaylandConnection patch — overlay-route engages on KWin). Predecessor evidence (`ohm_gl_fix/phase3_remeasure_2026-05-02/task23_per_frame_route.md`) shows Step 2 engages with the chromium-fourier binary; whether engagement translates to measurable benefit is what this campaign tests.
- `qt6-base-fourier` patches `qt6-base` for the `GL_ALPHA` stall. KWin links libQt6Gui; if KWin is the GL-composite step (per `wayland_baseline_2026-05-03/drmprobe_findings.md`, Plane 39 rotates ABGR8888 framebuffers under all reps), the qt6-fourier patch could affect KWin's per-frame work.
- `kwin-fourier` patches KWin for the `watchDmaBuf` fence-wait issue. Direct effect on KWin's compositor scheduling.
The campaign does not commit to any of these mechanisms. It only measures whether toggling each off shifts the binding cells.
Predecessor close-out summary (context, not data)
- `ohm_gl_fix` closed 2026-05-02 with the Step 2 WaylandConnection patch landed but Phase 1r `drops_post_warmup == 0` met *only* under cage. Spun off the residual cost into `kwin_overlay_subsurface`.
- `kwin_overlay_subsurface` closed 2026-05-03 *without patch*. Premise was “cage = 0 floor at N=1 from Phase 0” — at N=3 in the closing session, the floor was missing (post-warmup median 26).
- `x11-session-research` closed 2026-05-03 *with negative result on both axes*. X11 + xfwm4-no-comp produces *more* drops than Plasma Wayland with KWin compositing. X server doesn't program Plane 39 with NV12 regardless of client (mpv-xv, mpv-gpu — neither route engages).
The lesson from kwin_overlay_subsurface: don't anchor a campaign on N=1 historical numbers. This campaign acts on that lesson — every binding cell number comes from in-session reps acquired in this campaign's session.
Open questions before Phase 1 lock — resolved in Phase 0
- Chromium-fourier off → Brave or stock Chromium 149? Resolved: Brave (operator sign-off 2026-05-03). The two-major-version delta (Brave 1.89 on a Chromium 147 base vs chromium-fourier on a 149 base) is documented as a known confound.
- Reversibility on ohm — cell C/D revert path. Resolved: pacman cache has stock kwin-6.6.4-1 and qt6-base-6.11.0-2 at the *same upstream version* as the fourier-patched ones. Logout + auto-login per cell (operator instruction).
- 5s vs 10s warmup boundary. Resolved: 5s (operator instruction). Diverges from predecessor's 10s in `phase3_prime_runs/post_process.sh`.
- Step 2 feature-flag opt-in name. Resolved: there is no flag. Step 2 patch is compiled-in and engages by default.
- Window-size pin. Resolved: do not pin (mirror predecessor invocation; the video element is fixed at 800×450 in the .html anyway).
- Where does the orchestrator live? Resolved: new `~/fourier-attribution/` dir on ohm, separate from predecessor's `phase3_prime_runs/`. Carries `run_browser_attribution.sh` (devfreq capture added) and `post_process_attribution.sh` (5s warmup boundary; mean-freq + peak-freq% derived from trans_stat diff).
What Phase 0 has produced
- Locked research question + mechanism + experimental matrix (4 cells × N=3)
- Predecessor test rig audit + Cell B Brave invocation
- Panfrost devfreq instrument verification
- Phase 0 in-session baseline anchor of cell A — *to be acquired in task #54*
Once #54 lands, Phase 1 binding cells lock with cell-A IQR widths in evidence, and the pass/fail thresholds are concrete.
phase0_evidence/baseline_a_2026-05-03.md (uncurated)
Phase 0 — cell A in-session baseline N=3 — 2026-05-03
Cell A = all three fourier packages on (qt6-base-fourier 1:6.11.0-3, kwin-fourier 1:6.6.4-3, chromium-fourier 149.0.7812.0 binary), Plasma Wayland, KWin compositor.
Ambient conditions during reps (per `start.txt`)
- Kernel: `6.19.10-danctnix1-1-pinetab2`
- Active session: tty2 Plasma Wayland (kwin_wayland PID 53655)
- Operator's daily Brave (PID 58105 etc.) running in the background at ~13% CPU while otherwise idle. Documented as a stable ambient confound across all three reps.
- Reps run consecutively, each ~95s wall, with ~10s settle gap between
- Rep timestamps: a_rep1 22:51 CEST, a_rep2 22:55 CEST, a_rep3 22:58 CEST
Per-rep summary (raw evidence in `baseline_a_2026-05-03/*_summary.txt`)
| metric | a_rep1 | a_rep2 | a_rep3 | min | max | range |
|---|---|---|---|---|---|---|
| drops_5s | 10 | 11 | 7 | 7 | 11 | 4 |
| drops_60s | 20 | 15 | 15 | 15 | 20 | 5 |
| drops_post_5s | 10 | 4 | 8 | 4 | 10 | 6 |
| effective_fps | 23.99 | 23.99 | 24.00 | 23.99 | 24.00 | 0.01 |
| frames_5s | 121 | 121 | 121 | 121 | 121 | 0 |
| frames_60s | 1441 | 1442 | 1441 | 1441 | 1442 | 1 |
| kwin_cpu_median | 12.0 | 12.0 | 12.0 | 12.0 | 12.0 | 0.0 |
| kwin_cpu_mean | 11.97 | 12.02 | 11.98 | 11.97 | 12.02 | 0.05 |
| kwin_cpu_iqr (per-rep) | 1.8 | 2.0 | 1.0 | 1.0 | 2.0 | 1.0 |
| browser_cpu_median | 56.6 | 56.25 | 56.0 | 56.0 | 56.6 | 0.6 |
| browser_cpu_mean | 61.03 | 61.69 | 61.17 | 61.03 | 61.69 | 0.66 |
| browser_cpu_iqr (per-rep) | 9.6 | 9.2 | 5.9 | 5.9 | 9.6 | 3.7 |
| panfrost_mean_freq_mhz | 600.1 | 591.2 | 594.6 | 591.2 | 600.1 | 8.9 |
| panfrost_peak_freq_pct | 35.3 | 34.0 | 35.0 | 34.0 | 35.3 | 1.3 |
| therm_pre_milliC | 45000 | 45555 | 45555 | 45000 | 45555 | 555 |
| therm_drift_c | 10.6 | 7.8 | 7.8 | 7.8 | 10.6 | 2.8 |
Cell A IQR widths (locked for Phase 1 thresholds)
The N=3 range is used as a conservative IQR proxy (with N=3, p25 ≈ min, p75 ≈ max). Tighter than 95% CI but appropriate for the small N.
- `drops_60s` range: 5
- `effective_fps` range: 0.01 (effectively zero — very stable)
- `kwin_cpu_median` range: 0.0 (exact tie across reps — top -d 1 rounds to integer percent at this CPU level)
- `browser_cpu_median` range: 0.6
- `panfrost_mean_freq_mhz` range: 8.9 (≈ 1.5% of mean)
- `panfrost_peak_freq_pct` range: 1.3 (percentage points)
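The widths listed above can be recomputed from the three baseline reps. This is a minimal sketch using the per-rep values from the summary table; `summarize` is an illustrative helper, and differences are rounded to two decimals only to hide floating-point noise.

```python
from statistics import median

# Per-rep values for cell A (a_rep1, a_rep2, a_rep3) from the per-rep summary.
BASELINE_A = {
    "drops_60s": [20, 15, 15],
    "effective_fps": [23.99, 23.99, 24.00],
    "kwin_cpu_median": [12.0, 12.0, 12.0],
    "browser_cpu_median": [56.6, 56.25, 56.0],
    "panfrost_mean_freq_mhz": [600.1, 591.2, 594.6],
    "panfrost_peak_freq_pct": [35.3, 34.0, 35.0],
}


def summarize(reps):
    """Median and range (max - min) of one metric across the reps."""
    return {"median": median(reps), "range": round(max(reps) - min(reps), 2)}


summary = {metric: summarize(reps) for metric, reps in BASELINE_A.items()}
for metric, stats in summary.items():
    print(f"{metric}: median={stats['median']} range={stats['range']}")
```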
Observations
- Effective fps is essentially locked at 24.0. The clip is 24 fps despite the `_30` in the filename. All three reps decode the full 60-second window's frames within 0.01 fps of each other. The test is *not* fps-bound; binding-cell discrimination must come from drops, CPU%, or panfrost freq.
- CPU metrics are exceptionally stable. kwin_cpu_median is 12.0% in all three reps to 0.1% precision. browser_cpu_median spans 0.6 percentage points. This means tiny shifts when cells B/C/D run will be detectable.
- panfrost_mean_freq_mhz is 591–600 MHz — well below the 800 MHz peak. Mali isn't pegged. The simple_ondemand governor is dialling between 400 / 600 / 800 MHz to track demand. Time at 800 MHz peak is consistently ~35% of the window.
- drops_60s ranges 15–20. Predecessor's `kwin_timing_nodebug_rep1` reported drops_total=27 (no campaign-data import — cited only as broad ballpark verification). Our numbers cluster slightly lower; not surprising given ambient-state differences (different uptime, different daily-Brave activity, different cache state). The campaign-contained data discipline holds: this is OUR baseline; comparison is against THESE reps' IQR.
- drops_post_5s spans 4–10. This is the largest relative variance (factor of 2.5×). Drops *during steady state* are the most rep-to-rep variable signal. Whether a cell-toggle moves this beyond the IQR will be the most stringent threshold test.
- Thermal drift 7.8–10.6°C in 90s indicates the SoC is under genuine sustained load. No throttling triggered (max temp around 56°C; throttle threshold ~85°C).
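A quick consistency check on the three drop counters may help when reading them. In each baseline rep, `drops_60s` equals `drops_5s + drops_post_5s`, which suggests `drops_60s` is cumulative from playback start and that `drops_post_5s` is the warmup-free count. That reading is an inference from the numbers, not something the summary files state.

```python
# Drop counters per baseline rep, copied from the per-rep summary table.
BASELINE_DROPS = {
    "a_rep1": {"drops_5s": 10, "drops_60s": 20, "drops_post_5s": 10},
    "a_rep2": {"drops_5s": 11, "drops_60s": 15, "drops_post_5s": 4},
    "a_rep3": {"drops_5s": 7, "drops_60s": 15, "drops_post_5s": 8},
}


def counters_add_up(rep):
    """True when the warmup and post-warmup drops sum to drops_60s."""
    return rep["drops_60s"] == rep["drops_5s"] + rep["drops_post_5s"]


results = {rep_id: counters_add_up(rep) for rep_id, rep in BASELINE_DROPS.items()}
print(results)
```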
Cell A IQR-based pass/fail thresholds (Phase 1 lock candidate)
A package P delivers measurable benefit if the P-off cell at N=3 exceeds ANY ONE of the following deltas relative to the cell A median:
| metric | cell A median (rounded) | threshold delta (≈ N=3 range) | P-off value beyond threshold when |
|---|---|---|---|
| drops_60s | 15 | +5 | ≥ 21 |
| effective_fps | 24.0 | -0.05 | ≤ 23.95 |
| kwin_cpu_median | 12.0 | +0.5 | ≥ 12.5 (ranges hit 0.0; 0.5 is conservative floor) |
| browser_cpu_median | 56.0 | +1 | ≥ 57.0 |
| panfrost_mean_freq_mhz | 595 | +10 | ≥ 605 |
The thresholds are deliberately conservative on metrics where the cell A range was zero or close to it (kwin_cpu_median at 0.0, effective_fps at 0.01, browser_cpu_median at 0.6): these are inflated to 0.05–1 unit to avoid spurious significance from rounding artefacts. A package needs to hit ANY ONE of the threshold deltas to count as having measurable benefit; a package that hits NONE is “no measurable benefit on this matrix”, i.e. explicit chaff.
Phase 0 task #54 verdict
Three reps acquired in-session, very tight on CPU/freq metrics, moderate variance on drops (consistent with Phase 3-prime predecessor experience that drops are the noisiest signal). Cell A baseline locked. Phase 1 binding-cell thresholds drafted above; lock when reading them back to operator.
Phase 0 task #54 = COMPLETE.
phase4_findings.md — cross-cell analysis (uncurated)
fourier_attribution — Cross-cell analysis 2026-05-03
Matrix executed in full: 4 cells × N=3 reps = 12 reps acquired in-session on ohm (PineTab2 RK3568, kernel 6.19.10-danctnix1-1-pinetab2, mesa 26.0.5, governor=performance, baloo off, daily Brave killed for clean ambient, autologin via campaign-temporary 99-autologin-fourier-attribution.conf since reverted).
Per-rep summary.txt and start.txt mirrored locally to phase4_evidence/{a,b,c,d}_rep[1-3]_*.
Per-rep table (raw)
| metric | A1 | A2 | A3 | B1 | B2 | B3 | C1 | C2 | C3 | D1 | D2 | D3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| drops_5s | 10 | 5 | 5 | 15 | 8 | 11 | 11 | 7 | 5 | 9 | 10 | 16 |
| drops_60s | 17 | 12 | 8 | 16 | 11 | 18 | 14 | 10 | 5 | 21 | 24 | 25 |
| drops_post_5s | 7 | 7 | 3 | 1 | 3 | 7 | 3 | 3 | 0 | 12 | 14 | 9 |
| effective_fps | 24.01 | 24.01 | 24.00 | 23.43 | 23.18 | 22.91 | 24.0 | 24.01 | 23.98 | 23.99 | 23.99 | 23.99 |
| kwin_cpu_median | 11.0 | 12.0 | 11.0 | 12.9 | 13.0 | 13.0 | 12.0 | 11.0 | 11.0 | 32.9 | 32.9 | 32.9 |
| browser_cpu_median | 54.25 | 54.4 | 54.9 | 137.3 | 140.65 | 137.15 | 60.2 | 54.15 | 52.2 | 64.25 | 63.3 | 64.05 |
| panfrost_mean_freq_mhz | 596.4 | 607.2 | 606.8 | 596.2 | 605.7 | 605.4 | 596.7 | 590.7 | 595.6 | 783.2 | 777.2 | 774.5 |
| panfrost_peak_freq_pct | 35.1 | 36.2 | 35.9 | 35.9 | 36.5 | 36.9 | 35.4 | 34.0 | 33.5 | 95.2 | 93.7 | 93.3 |
| therm_drift_c | 11.1 | 11.7 | 7.7 | 12.2 | 12.8 | 8.6 | 12.7 | 5.0 | 11.7 | 17.2 | 7.9 | 7.2 |
Cell medians
| metric | A (baseline) | B (chromium-fourier off, Brave) | C (qt6-fourier off) | D (kwin-fourier off) |
|---|---|---|---|---|
| drops_60s | 12 | 16 | 10 | 24 |
| effective_fps | 24.01 | 23.18 | 24.0 | 23.99 |
| kwin_cpu_median | 11.0 | 13.0 | 11.0 | 32.9 |
| browser_cpu_median | 54.4 | 137.3 | 54.15 | 64.05 |
| panfrost_mean_freq_mhz | 606.8 | 605.4 | 595.6 | 777.2 |
| panfrost_peak_freq_pct | 35.9 | 36.5 | 34.0 | 93.7 |
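The cell medians can be recomputed directly from the per-rep values. The sketch below does that with Python's `statistics.median`; the numbers are copied from the per-rep `summary.txt` files, and the dictionary layout is only an illustrative choice.

```python
from statistics import median

# metric -> cell -> [rep1, rep2, rep3], copied from the per-rep summary files.
PER_REP = {
    "drops_60s": {"A": [17, 12, 8], "B": [16, 11, 18], "C": [14, 10, 5], "D": [21, 24, 25]},
    "effective_fps": {"A": [24.01, 24.01, 24.0], "B": [23.43, 23.18, 22.91],
                      "C": [24.0, 24.01, 23.98], "D": [23.99, 23.99, 23.99]},
    "kwin_cpu_median": {"A": [11.0, 12.0, 11.0], "B": [12.9, 13.0, 13.0],
                        "C": [12.0, 11.0, 11.0], "D": [32.9, 32.9, 32.9]},
    "browser_cpu_median": {"A": [54.25, 54.4, 54.9], "B": [137.3, 140.65, 137.15],
                           "C": [60.2, 54.15, 52.2], "D": [64.25, 63.3, 64.05]},
    "panfrost_mean_freq_mhz": {"A": [596.4, 607.2, 606.8], "B": [596.2, 605.7, 605.4],
                               "C": [596.7, 590.7, 595.6], "D": [783.2, 777.2, 774.5]},
    "panfrost_peak_freq_pct": {"A": [35.1, 36.2, 35.9], "B": [35.9, 36.5, 36.9],
                               "C": [35.4, 34.0, 33.5], "D": [95.2, 93.7, 93.3]},
}

# Median of the three reps for every metric and cell.
cell_medians = {
    metric: {cell: median(reps) for cell, reps in cells.items()}
    for metric, cells in PER_REP.items()
}

for metric, cells in cell_medians.items():
    print(metric, cells)
```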
Per-cell verdict (against the IQR-based thresholds locked in `baseline_a_2026-05-03.md`)
Threshold rule: package P delivers measurable benefit if the P-off cell median is worse than the cell A median by more than any one of:
- drops_60s +5 | effective_fps −0.05 | kwin_cpu_median +0.5 | browser_cpu_median +1 | panfrost_mean_freq_mhz +10
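The rule can be sketched as a small function. This is a hedged illustration, assuming the one-directional reading used in the baseline threshold table (more drops, more CPU, and higher GPU frequency are worse; lower fps is worse). The medians are taken over the per-rep values, and `flagged_metrics` is an illustrative name.

```python
# Cell medians over the three reps of each cell.
CELL_MEDIANS = {
    "A": {"drops_60s": 12, "effective_fps": 24.01, "kwin_cpu_median": 11.0,
          "browser_cpu_median": 54.4, "panfrost_mean_freq_mhz": 606.8},
    "B": {"drops_60s": 16, "effective_fps": 23.18, "kwin_cpu_median": 13.0,
          "browser_cpu_median": 137.3, "panfrost_mean_freq_mhz": 605.4},
    "C": {"drops_60s": 10, "effective_fps": 24.0, "kwin_cpu_median": 11.0,
          "browser_cpu_median": 54.15, "panfrost_mean_freq_mhz": 595.6},
    "D": {"drops_60s": 24, "effective_fps": 23.99, "kwin_cpu_median": 32.9,
          "browser_cpu_median": 64.05, "panfrost_mean_freq_mhz": 777.2},
}

# metric -> (threshold delta, sign); sign +1 means higher is worse, -1 means lower is worse.
THRESHOLDS = {
    "drops_60s": (5, +1),
    "effective_fps": (0.05, -1),
    "kwin_cpu_median": (0.5, +1),
    "browser_cpu_median": (1.0, +1),
    "panfrost_mean_freq_mhz": (10.0, +1),
}


def flagged_metrics(off_cell, baseline):
    """Metrics on which the P-off cell is worse than baseline by more than the delta."""
    flagged = []
    for metric, (delta, sign) in THRESHOLDS.items():
        worsening = sign * (off_cell[metric] - baseline[metric])
        if worsening > delta:
            flagged.append(metric)
    return flagged


verdicts = {cell: flagged_metrics(CELL_MEDIANS[cell], CELL_MEDIANS["A"]) for cell in "BCD"}
for cell, flagged in verdicts.items():
    print(cell, "benefit" if flagged else "no measurable benefit", flagged)
```

Under the ANY-ONE rule a non-empty list is enough for a "measurable benefit" verdict; the list length corresponds to the "N of 5 metrics" counts in the per-cell verdicts.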
Cell B — chromium-fourier OFF (→ Brave 1.89)
| metric | Δ vs A | beyond threshold? |
|---|---|---|
| drops_60s | +4 | NO (within ±5) |
| effective_fps | −0.83 | YES — Brave is significantly slower |
| kwin_cpu_median | +2.0 | YES |
| browser_cpu_median | +82.9 | YES (massive, about 2.5×: median 137.3 vs 54.4) |
| panfrost_mean_freq_mhz | −1.4 | NO |
Verdict: chromium-fourier delivers measurable benefit. 3 of 5 metrics fail when off.
Caveat: Cell B confounds *chromium-fourier patches* with *Brave-vs-Chromium-version* differences (Brave 1.89.145 = Chromium 147 base; chromium-fourier = Chromium 149 base). The +83pp browser CPU is plausibly part version-bump and part patch-engagement (chromium-fourier Step 1 + Step 2). Not separable in this matrix; would need stock-Chromium-149 build as additional cell.
Cell C — qt6-fourier OFF
| metric | Δ vs A | beyond threshold? |
|---|---|---|
| drops_60s | −2 | NO (and *better* by 2 — within IQR) |
| effective_fps | −0.01 | NO |
| kwin_cpu_median | 0 | NO |
| browser_cpu_median | −0.25 | NO |
| panfrost_mean_freq_mhz | −11.2 | marginal wrong direction (off is *lower* GPU) |
Verdict: qt6-fourier delivers NO measurable benefit on this workload. Zero of 5 metrics fail when off. The single marginal panfrost delta is in the wrong direction (less GPU work without qt6-fourier), suggesting qt6-fourier may even have a tiny GPU cost on this workload — though the magnitude is at the IQR edge.
Caveat: qt6-fourier patches qt6-base for the GL_ALPHA stall. That stall may not be triggered by 1080p H.264 NV12 video playback in chromium-fourier; the patch could still help other workloads (different video formats, mixed application loads, etc.) not in scope here.
Cell D — kwin-fourier OFF
| metric | Δ vs A | beyond threshold? |
|---|---|---|
| drops_60s | +12 | YES (>2× threshold) |
| effective_fps | −0.02 | NO |
| kwin_cpu_median | +21.9 | YES (massive — 3×) |
| browser_cpu_median | +9.65 | YES |
| panfrost_mean_freq_mhz | +170.4 | YES (massive — Mali jumps from ~600 to ~775 MHz) |
| panfrost_peak_freq_pct | +57.8 | (93.7 vs a cell A median of 35.9 per the a_rep summary files; would be a massive YES if it were a primary threshold) |
Verdict: kwin-fourier delivers MASSIVE measurable benefit. 4 of 5 primary metrics fail when off, with the kwin-CPU and panfrost-freq deltas being huge.
This is consistent with the predecessor's diagnosis that kwin-fourier patches a watchDmaBuf fence-wait issue. A plausible reading, not verified in this campaign, is that without the patch KWin spins waiting on dmabuf fences that should already be signaled and does extra GL composite work, which keeps Mali near its maximum frequency almost all the time. Whatever the mechanism, the packaged fix is doing real work on this workload.
Wheat-vs-chaff verdict
WHEAT (measurable benefit on bbb 1080p H.264 60s playback, ohm/PineTab2):
- kwin-fourier: load-bearing. Removing it triples kwin CPU (median 11.0% → 32.9%), drives the Mali GPU to about 94% peak-freq residency (median 93.7%), and doubles drops_60s (12 → 24). Likely the single most impactful fourier package on this hardware/workload.
- chromium-fourier — measurable benefit, magnitude inflated by Brave-vs-Chromium-149 version confound. Real signal exists (3 of 5 metrics fail when swapped to Brave); cleanest separation would need a stock-Chromium-149 control cell, out of scope here.
CHAFF (no measurable benefit on this specific workload, ohm/PineTab2):
- qt6-fourier — zero of 5 metrics moved beyond cell A IQR when removed. Cell C medians are within rep-to-rep noise of cell A. The GL_ALPHA stall the patch addresses doesn't trigger in this scenario. *Not a statement about other workloads*; just chaff for this specific binding-cell set.
Open questions raised
- Brave-vs-Chromium-149 confound in cell B. Could be resolved with a stock Chromium 149 build as a fifth cell. Out of campaign scope.
- qt6-fourier on other workloads. This campaign rules out benefit for bbb 1080p H.264 60s in chromium-fourier; doesn't say anything about Firefox, mpv, mixed Plasma desktop activity, or different video formats. If the user wants qt6-fourier maintained, it'd be on the merits of a different workload set.
- Cell C had wider rep-to-rep variance than expected (drops_60s 5/10/14, browser_cpu_median 52.2/54.15/60.2). May be ambient drift in the new session post-package-swap; not enough to change the verdict but flagged for honesty.
- All-off cell. Not measured. Combined effect of kwin+qt6+chromium being all reverted not in matrix; predecessor evidence implies it'd be ≈cell D (kwin dominant) but unknown.
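The rep-to-rep spread flagged for cell C can be put next to the other cells with a short sketch. It uses the max-minus-min range over the three reps as a rough spread measure, which is an illustrative choice rather than a statistic the campaign locked.

```python
# Per-rep values for the two metrics mentioned in the cell C variance note.
REPS = {
    "drops_60s": {"A": [17, 12, 8], "B": [16, 11, 18], "C": [14, 10, 5], "D": [21, 24, 25]},
    "browser_cpu_median": {"A": [54.25, 54.4, 54.9], "B": [137.3, 140.65, 137.15],
                           "C": [60.2, 54.15, 52.2], "D": [64.25, 63.3, 64.05]},
}


def spread(values):
    """Range across reps, rounded to hide floating-point noise."""
    return round(max(values) - min(values), 2)


spreads = {
    metric: {cell: spread(reps) for cell, reps in cells.items()}
    for metric, cells in REPS.items()
}
for metric, cells in spreads.items():
    print(metric, cells)
```

On these figures the cell C spread in `browser_cpu_median` (8.0) stands out against cell A (0.65), while the `drops_60s` spread is the same in cells A and C (9 each).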
Phase 8 memory hooks worth carrying forward
- *Single-toggle attribution matrix on small N=3 with clean ambient is enough to rank fourier-flavoured packages by impact on a specific workload* — this campaign in 4 hours gave answers the original fourier campaign hadn't separated in months of “fix this fix that”.
- *kwin-fourier `watchDmaBuf` fix is the load-bearing one for video on rockchip* (RK3568). Removing it is the failure mode. Should not be removed casually.
- *qt6-fourier GL_ALPHA fix may not bind on workloads that don't trigger it.* Don't ship it as critical without identifying the workloads where it actually helps.
- *Brave-vs-Chromium version confound in matrix design* — when “package off” requires using a different upstream binary, document the version skew explicitly and note it as a confound; don't pretend the swap is clean.
phase4_evidence — per-rep summary.txt (raw, uncurated)
a_rep1_summary.txt
REP_ID=a_rep1 KIND=chromium-fourier-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=10 drops_60s=17 drops_post_5s=7 frames_5s=124 frames_60s=1448 effective_fps=24.01
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=11.0 kwin_cpu_mean=11.65 kwin_cpu_min=9.9 kwin_cpu_max=19.0 kwin_cpu_p25=11.0 kwin_cpu_p75=12.0 kwin_cpu_iqr=1.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=54.25 browser_cpu_mean=58.59 browser_cpu_min=49.1 browser_cpu_max=115.7 browser_cpu_p25=52.2 browser_cpu_p75=59.1 browser_cpu_iqr=6.9
--- panfrost devfreq (window) ---
panfrost_total_window_ms=74681 panfrost_mean_freq_mhz=596.4 panfrost_peak_freq_pct=35.1 panfrost_residency_200MHz_ms=1356 panfrost_residency_300MHz_ms=485 panfrost_residency_400MHz_ms=25080 panfrost_residency_600MHz_ms=19625 panfrost_residency_700MHz_ms=1897 panfrost_residency_800MHz_ms=26238
--- thermal ---
therm_pre_milliC=46111 therm_post_milliC=57222 therm_drift_c=11.1
a_rep2_summary.txt
REP_ID=a_rep2 KIND=chromium-fourier-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=5 drops_60s=12 drops_post_5s=7 frames_5s=124 frames_60s=1447 effective_fps=24.01
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=12.0 kwin_cpu_mean=12.16 kwin_cpu_min=10.0 kwin_cpu_max=17.0 kwin_cpu_p25=11.0 kwin_cpu_p75=13.0 kwin_cpu_iqr=2.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=54.4 browser_cpu_mean=59.76 browser_cpu_min=49.4 browser_cpu_max=115.7 browser_cpu_p25=53.0 browser_cpu_p75=60.0 browser_cpu_iqr=7.0
--- panfrost devfreq (window) ---
panfrost_total_window_ms=73875 panfrost_mean_freq_mhz=607.2 panfrost_peak_freq_pct=36.2 panfrost_residency_200MHz_ms=1294 panfrost_residency_300MHz_ms=207 panfrost_residency_400MHz_ms=22356 panfrost_residency_600MHz_ms=20947 panfrost_residency_700MHz_ms=2314 panfrost_residency_800MHz_ms=26757
--- thermal ---
therm_pre_milliC=47777 therm_post_milliC=59444 therm_drift_c=11.7
a_rep3_summary.txt
REP_ID=a_rep3 KIND=chromium-fourier-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=5 drops_60s=8 drops_post_5s=3 frames_5s=125 frames_60s=1448 effective_fps=24.0
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=11.0 kwin_cpu_mean=11.51 kwin_cpu_min=10.0 kwin_cpu_max=17.0 kwin_cpu_p25=11.0 kwin_cpu_p75=12.0 kwin_cpu_iqr=1.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=54.9 browser_cpu_mean=58.36 browser_cpu_min=49.3 browser_cpu_max=124.8 browser_cpu_p25=52.4 browser_cpu_p75=57.3 browser_cpu_iqr=4.9
--- panfrost devfreq (window) ---
panfrost_total_window_ms=74432 panfrost_mean_freq_mhz=606.8 panfrost_peak_freq_pct=35.9 panfrost_residency_200MHz_ms=1302 panfrost_residency_300MHz_ms=309 panfrost_residency_400MHz_ms=21911 panfrost_residency_600MHz_ms=22616 panfrost_residency_700MHz_ms=1545 panfrost_residency_800MHz_ms=26749
--- thermal ---
therm_pre_milliC=50625 therm_post_milliC=58333 therm_drift_c=7.7
b_rep1_summary.txt
REP_ID=b_rep1 KIND=brave-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=15 drops_60s=16 drops_post_5s=1 frames_5s=123 frames_60s=1414 effective_fps=23.43
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=12.9 kwin_cpu_mean=12.5 kwin_cpu_min=8.9 kwin_cpu_max=16.0 kwin_cpu_p25=11.9 kwin_cpu_p75=13.0 kwin_cpu_iqr=1.1
--- brave aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=137.3 browser_cpu_mean=151.89 browser_cpu_min=92.3 browser_cpu_max=240.9 browser_cpu_p25=125.2 browser_cpu_p75=183.1 browser_cpu_iqr=57.9
--- panfrost devfreq (window) ---
panfrost_total_window_ms=73944 panfrost_mean_freq_mhz=596.2 panfrost_peak_freq_pct=35.9 panfrost_residency_200MHz_ms=2321 panfrost_residency_300MHz_ms=1196 panfrost_residency_400MHz_ms=22807 panfrost_residency_600MHz_ms=18447 panfrost_residency_700MHz_ms=2649 panfrost_residency_800MHz_ms=26524
--- thermal ---
therm_pre_milliC=48333 therm_post_milliC=60555 therm_drift_c=12.2
b_rep2_summary.txt
REP_ID=b_rep2 KIND=brave-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=8 drops_60s=11 drops_post_5s=3 frames_5s=115 frames_60s=1392 effective_fps=23.18
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=13.0 kwin_cpu_mean=12.99 kwin_cpu_min=9.0 kwin_cpu_max=16.0 kwin_cpu_p25=12.0 kwin_cpu_p75=14.0 kwin_cpu_iqr=2.0
--- brave aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=140.65 browser_cpu_mean=151.2 browser_cpu_min=93.0 browser_cpu_max=258.8 browser_cpu_p25=124.6 browser_cpu_p75=176.4 browser_cpu_iqr=51.8
--- panfrost devfreq (window) ---
panfrost_total_window_ms=75308 panfrost_mean_freq_mhz=605.7 panfrost_peak_freq_pct=36.5 panfrost_residency_200MHz_ms=2427 panfrost_residency_300MHz_ms=1224 panfrost_residency_400MHz_ms=20508 panfrost_residency_600MHz_ms=19969 panfrost_residency_700MHz_ms=3684 panfrost_residency_800MHz_ms=27496
--- thermal ---
therm_pre_milliC=48888 therm_post_milliC=61666 therm_drift_c=12.8
b_rep3_summary.txt
REP_ID=b_rep3 KIND=brave-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=11 drops_60s=18 drops_post_5s=7 frames_5s=104 frames_60s=1367 effective_fps=22.91
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=13.0 kwin_cpu_mean=12.96 kwin_cpu_min=9.0 kwin_cpu_max=19.9 kwin_cpu_p25=12.0 kwin_cpu_p75=14.0 kwin_cpu_iqr=2.0
--- brave aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=137.15 browser_cpu_mean=155.17 browser_cpu_min=98.8 browser_cpu_max=286.0 browser_cpu_p25=127.5 browser_cpu_p75=177.7 browser_cpu_iqr=50.2
--- panfrost devfreq (window) ---
panfrost_total_window_ms=75129 panfrost_mean_freq_mhz=605.4 panfrost_peak_freq_pct=36.9 panfrost_residency_200MHz_ms=3629 panfrost_residency_300MHz_ms=1270 panfrost_residency_400MHz_ms=18742 panfrost_residency_600MHz_ms=19345 panfrost_residency_700MHz_ms=4434 panfrost_residency_800MHz_ms=27709
--- thermal ---
therm_pre_milliC=52500 therm_post_milliC=61111 therm_drift_c=8.6
c_rep1_summary.txt
REP_ID=c_rep1 KIND=chromium-fourier-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=11 drops_60s=14 drops_post_5s=3 frames_5s=125 frames_60s=1449 effective_fps=24.0
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=12.0 kwin_cpu_mean=11.76 kwin_cpu_min=10.0 kwin_cpu_max=16.0 kwin_cpu_p25=11.0 kwin_cpu_p75=12.0 kwin_cpu_iqr=1.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=60.2 browser_cpu_mean=62.12 browser_cpu_min=49.8 browser_cpu_max=115.1 browser_cpu_p25=57.6 browser_cpu_p75=63.3 browser_cpu_iqr=5.7
--- panfrost devfreq (window) ---
panfrost_total_window_ms=76199 panfrost_mean_freq_mhz=596.7 panfrost_peak_freq_pct=35.4 panfrost_residency_200MHz_ms=2551 panfrost_residency_300MHz_ms=1297 panfrost_residency_400MHz_ms=22910 panfrost_residency_600MHz_ms=19041 panfrost_residency_700MHz_ms=3415 panfrost_residency_800MHz_ms=26985
--- thermal ---
therm_pre_milliC=50625 therm_post_milliC=63333 therm_drift_c=12.7
c_rep2_summary.txt
REP_ID=c_rep2 KIND=chromium-fourier-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=7 drops_60s=10 drops_post_5s=3 frames_5s=125 frames_60s=1449 effective_fps=24.01
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=11.0 kwin_cpu_mean=11.24 kwin_cpu_min=9.0 kwin_cpu_max=16.0 kwin_cpu_p25=10.9 kwin_cpu_p75=12.0 kwin_cpu_iqr=1.1
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=54.15 browser_cpu_mean=58.84 browser_cpu_min=49.3 browser_cpu_max=184.9 browser_cpu_p25=52.3 browser_cpu_p75=57.7 browser_cpu_iqr=5.4
--- panfrost devfreq (window) ---
panfrost_total_window_ms=74799 panfrost_mean_freq_mhz=590.7 panfrost_peak_freq_pct=34.0 panfrost_residency_200MHz_ms=1860 panfrost_residency_300MHz_ms=416 panfrost_residency_400MHz_ms=25766 panfrost_residency_600MHz_ms=18917 panfrost_residency_700MHz_ms=2379 panfrost_residency_800MHz_ms=25461
--- thermal ---
therm_pre_milliC=58333 therm_post_milliC=63333 therm_drift_c=5.0
c_rep3_summary.txt
REP_ID=c_rep3 KIND=chromium-fourier-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=5 drops_60s=5 drops_post_5s=0 frames_5s=125 frames_60s=1447 effective_fps=23.98
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=11.0 kwin_cpu_mean=10.96 kwin_cpu_min=9.0 kwin_cpu_max=15.0 kwin_cpu_p25=11.0 kwin_cpu_p75=11.0 kwin_cpu_iqr=0.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=52.2 browser_cpu_mean=54.9 browser_cpu_min=46.7 browser_cpu_max=135.9 browser_cpu_p25=51.1 browser_cpu_p75=55.2 browser_cpu_iqr=4.1
--- panfrost devfreq (window) ---
panfrost_total_window_ms=73269 panfrost_mean_freq_mhz=595.6 panfrost_peak_freq_pct=33.5 panfrost_residency_200MHz_ms=1681 panfrost_residency_300MHz_ms=311 panfrost_residency_400MHz_ms=23541 panfrost_residency_600MHz_ms=20735 panfrost_residency_700MHz_ms=2468 panfrost_residency_800MHz_ms=24533
--- thermal ---
therm_pre_milliC=50000 therm_post_milliC=61666 therm_drift_c=11.7
d_rep1_summary.txt
REP_ID=d_rep1 KIND=chromium-fourier-kwin end_seen=True trajectory_samples=71
--- drops ---
drops_5s=9 drops_60s=21 drops_post_5s=12 frames_5s=124 frames_60s=1448 effective_fps=23.99
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60 kwin_cpu_median=32.9 kwin_cpu_mean=33.44 kwin_cpu_min=30.8 kwin_cpu_max=39.9 kwin_cpu_p25=32.8 kwin_cpu_p75=33.9 kwin_cpu_iqr=1.1
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60 browser_cpu_median=64.25 browser_cpu_mean=68.74 browser_cpu_min=57.0 browser_cpu_max=107.9 browser_cpu_p25=61.2 browser_cpu_p75=71.6 browser_cpu_iqr=10.4
--- panfrost devfreq (window) ---
panfrost_total_window_ms=75111 panfrost_mean_freq_mhz=783.2 panfrost_peak_freq_pct=95.2 panfrost_residency_200MHz_ms=1369 panfrost_residency_300MHz_ms=103 panfrost_residency_400MHz_ms=208 panfrost_residency_600MHz_ms=1134 panfrost_residency_700MHz_ms=780 panfrost_residency_800MHz_ms=71517
--- thermal ---
therm_pre_milliC=45555 therm_post_milliC=62777 therm_drift_c=17.2
d_rep2_summary.txt
REP_ID=d_rep2 KIND=chromium-fourier-kwin end_seen=True trajectory_samples=71 --- drops --- drops_5s=10 drops_60s=24 drops_post_5s=14 frames_5s=124 frames_60s=1448 effective_fps=23.99 --- kwin_wayland CPU%% (5s-65s) --- kwin_cpu_n=60 kwin_cpu_median=32.9 kwin_cpu_mean=33.14 kwin_cpu_min=29.7 kwin_cpu_max=38.8 kwin_cpu_p25=31.9 kwin_cpu_p75=33.9 kwin_cpu_iqr=2.0 --- chrome aggregate CPU%% (5s-65s) --- browser_cpu_n=60 browser_cpu_median=63.3 browser_cpu_mean=68.91 browser_cpu_min=56.2 browser_cpu_max=127.7 browser_cpu_p25=61.2 browser_cpu_p75=67.6 browser_cpu_iqr=6.4 --- panfrost devfreq (window) --- panfrost_total_window_ms=79234 panfrost_mean_freq_mhz=777.2 panfrost_peak_freq_pct=93.7 panfrost_residency_200MHz_ms=161 panfrost_residency_300MHz_ms=105 panfrost_residency_400MHz_ms=3715 panfrost_residency_600MHz_ms=727 panfrost_residency_700MHz_ms=261 panfrost_residency_800MHz_ms=74265 --- thermal --- therm_pre_milliC=53750 therm_post_milliC=61666 therm_drift_c=7.9
d_rep3_summary.txt
REP_ID=d_rep3 KIND=chromium-fourier-kwin end_seen=True trajectory_samples=71 --- drops --- drops_5s=16 drops_60s=25 drops_post_5s=9 frames_5s=124 frames_60s=1447 effective_fps=23.99 --- kwin_wayland CPU%% (5s-65s) --- kwin_cpu_n=60 kwin_cpu_median=32.9 kwin_cpu_mean=33.03 kwin_cpu_min=27.9 kwin_cpu_max=40.2 kwin_cpu_p25=31.9 kwin_cpu_p75=33.9 kwin_cpu_iqr=2.0 --- chrome aggregate CPU%% (5s-65s) --- browser_cpu_n=60 browser_cpu_median=64.05 browser_cpu_mean=70.36 browser_cpu_min=58.0 browser_cpu_max=148.0 browser_cpu_p25=62.1 browser_cpu_p75=70.9 browser_cpu_iqr=8.8 --- panfrost devfreq (window) --- panfrost_total_window_ms=78666 panfrost_mean_freq_mhz=774.5 panfrost_peak_freq_pct=93.3 panfrost_residency_200MHz_ms=326 panfrost_residency_300MHz_ms=463 panfrost_residency_400MHz_ms=3576 panfrost_residency_600MHz_ms=621 panfrost_residency_700MHz_ms=262 panfrost_residency_800MHz_ms=73418 --- thermal --- therm_pre_milliC=56111 therm_post_milliC=63333 therm_drift_c=7.2
Representative start.txt — package state per cell at rep time
a_rep1_start.txt
2026-05-03T21:09:44Z REP_ID=a_rep1 KIND=chromium-fourier-kwin BROWSER_BIN=/tmp/chromium-ohm-gl-fix-step2/chrome Chromium 149.0.7812.0 qt6-base-fourier 1:6.11.0-3 qt6-base-fourier 1:6.11.0-3 kwin-fourier 1:6.6.4-3 kwin-fourier 1:6.6.4-3 6.19.10-danctnix1-1-pinetab2 workload_pid=67340 workload_pgid=67340 script_pgid=67329 AUTOPLAY at 2026-05-03T21:09:49Z (took 5 s) kwin_wayland_pid=53655
b_rep1_start.txt
2026-05-03T21:14:21Z REP_ID=b_rep1 KIND=brave-kwin BROWSER_BIN=/usr/bin/brave Brave Browser 147.1.89.145 qt6-base-fourier 1:6.11.0-3 qt6-base-fourier 1:6.11.0-3 kwin-fourier 1:6.6.4-3 kwin-fourier 1:6.6.4-3 6.19.10-danctnix1-1-pinetab2 workload_pid=69443 workload_pgid=69443 script_pgid=69432 AUTOPLAY at 2026-05-03T21:14:26Z (took 5 s) kwin_wayland_pid=53655
c_rep1_start.txt
2026-05-03T22:01:37Z REP_ID=c_rep1 KIND=chromium-fourier-kwin BROWSER_BIN=/tmp/chromium-ohm-gl-fix-step2/chrome Chromium 149.0.7812.0 qt6-base 6.11.0-2 kwin-fourier 1:6.6.4-3 kwin-fourier 1:6.6.4-3 6.19.10-danctnix1-1-pinetab2 workload_pid=82947 workload_pgid=82947 script_pgid=82936 AUTOPLAY at 2026-05-03T22:01:43Z (took 6 s) kwin_wayland_pid=82420
d_rep1_start.txt
2026-05-03T21:50:32Z REP_ID=d_rep1 KIND=chromium-fourier-kwin BROWSER_BIN=/tmp/chromium-ohm-gl-fix-step2/chrome Chromium 149.0.7812.0 qt6-base-fourier 1:6.11.0-3 qt6-base-fourier 1:6.11.0-3 kwin 6.6.4-1 6.19.10-danctnix1-1-pinetab2 workload_pid=79834 workload_pgid=79834 script_pgid=79823 AUTOPLAY at 2026-05-03T21:50:37Z (took 5 s) kwin_wayland_pid=77581
Reviewer brief
Per Phase 5 of the dev process: Claude is forbidden from curating this artifact. The reviewer is asked to read the raw documents above and surface anything the active session may have rationalised away — methodology gaps, threshold-setting bias, confounds the campaign acknowledged but didn't quantify (e.g. Brave-vs-Chromium-149 version delta in cell B), and any path from this matrix to a useful next campaign.
Phase 5 reviewer response (Sonnet architect, 2026-05-04)
Engaged via Plan subagent, model: sonnet override, open-consultation mode. Active-session prompt forbade curating the artifact; reviewer was explicitly asked to argue with the verdicts.
Phase 5 Review — fourier_attribution 2026-05-03
Reviewer: Sonnet architect (subagent, fresh-model invocation 2026-05-04, open-consultation mode).
Engaged via Plan subagent with model: sonnet override per the kwin_overlay_subsurface precedent. Review prompt explicitly forbade the active session from curating the artifact going to the reviewer; reviewer was given paths to local repo + DokuWiki page + asked to argue with the verdicts.
§1 Methodology concerns
The cell ordering and baseline recycling are the biggest structural problem.
Cell A (all-fourier-on baseline) was run in the phase0 evidence set (reps starting ~20:57Z), then those reps were discarded in favour of a *second* N=3 cell-A run in the phase4 evidence set (reps starting ~21:09Z, i.e. ~12 minutes later, same kwin PID 53655). The phase4 cell A has measurably different numbers from the phase0 cell A: drops_60s fell from {20,15,15} to {17,12,8}, and browser_cpu_median fell from {56.6,56.25,56.0} to {54.25,54.4,54.9}. Both moves are in the “looks better” direction. Thresholds were locked against the *phase0* cell A medians and ranges (drops_60s median=15, browser_cpu_median=56.0), but the phase4 cell A values used as the comparison baseline in phase4_findings.md are the *lower* set (median drops_60s=12, browser_cpu_median=54.4). The analysis then computes deltas against “A median = 12 drops / 54.4 browser CPU” while the pass/fail thresholds were set against a baseline of “15 drops / 56.0 browser CPU.” This is not called out anywhere in phase4_findings.md, and the arithmetic is self-inconsistent as a result.
This matters concretely for cell B: the “drops_60s = +4 vs A” is computed as B_median(16) − A_phase4_median(12) = +4. But the threshold was set as “phase0 median + 5 = 20.” If the threshold anchor had been re-locked at the phase4 cell A, B's drops_60s delta would still be +4 (which is fine, still below 5). For the fps verdict the delta -0.83 is large either way. So this inconsistency doesn't change the cell-B verdict in this case — but it should have been called out, and it opens the door to the question of which cell A is “the” baseline.
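The anchor question can be made concrete with a few lines of arithmetic. The sketch below uses only the drops_60s figures quoted in this review; the helper name `delta_report` is hypothetical, and the trigger rule (`delta > 5`, matching the "≥21 to trigger" wording against a phase0 median of 15) is an assumption about how the campaign applies its threshold.

```python
from statistics import median

# drops_60s per rep, as quoted in this review.
phase0_a = [20, 15, 15]
phase4_a = [17, 12, 8]
b_median = 16        # cell B median drops_60s quoted in this review
trigger_delta = 5    # assumed rule: a cell triggers when delta > 5

def delta_report(anchor_reps, cell_median, trigger):
    """Delta of a cell median against one choice of cell-A anchor."""
    anchor = median(anchor_reps)
    delta = cell_median - anchor
    return {"anchor": anchor, "delta": delta, "triggers": delta > trigger}

vs_phase0 = delta_report(phase0_a, b_median, trigger_delta)
vs_phase4 = delta_report(phase4_a, b_median, trigger_delta)
print("vs phase0 A:", vs_phase0)
print("vs phase4 A:", vs_phase4)
```

Reporting both anchors side by side in the findings table would make the choice of baseline visible instead of implicit.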
The metric set has a blind spot: no per-frame GPU submit latency or fence wait time. The active session noted that wp_presentation_feedback would yield Δ_present; more directly useful would be a simple count of how many top -d 1 kwin samples are above, say, 20% CPU (i.e. a “high-load fraction” rather than just median). The median of 32.9% in cell D is so high it doesn't need better instrumentation — but for borderline cases like cell B's kwin_cpu_median of 13.0% vs threshold of 12.5%, per-frame feedback would distinguish “systematically higher” from “a few high spikes pulling up the median.” The campaign correctly excludes wp_presentation_feedback as out-of-scope, but it should have flagged that the kwin_cpu_median in cell B is at the threshold boundary and would benefit from more resolution.
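A minimal sketch of the "high-load fraction" idea follows, assuming the per-second kwin CPU samples are available as a list of floats. The function name, the 20% cutoff, and both sample series are illustrative and not taken from the campaign's extractor or data.

```python
from statistics import median

def high_load_fraction(samples, cutoff_pct=20.0):
    """Fraction of per-second CPU samples strictly above cutoff_pct."""
    if not samples:
        raise ValueError("no samples")
    return sum(1 for s in samples if s > cutoff_pct) / len(samples)

# Two made-up 60-sample series with the same median but different shapes.
steady = [13.0] * 60
spiky = [13.0] * 31 + [11.0] * 19 + [40.0] * 10

for name, series in (("steady", steady), ("spiky", spiky)):
    print(name, median(series), round(high_load_fraction(series), 3))
```

Both series report the same median, so only the second number separates a uniformly loaded compositor from one with intermittent bursts.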
No cell E (all-fourier-off). The campaign acknowledges this. The combination matters because if kwin-fourier and chromium-fourier both being off produces a qualitatively different failure mode than kwin-fourier alone being off, the single-toggle matrix won't surface it. For practical maintenance decisions (can we drop all three packages?) this is the directly relevant cell and it's missing.
Execution order: A→B→D→C. The full sequence was phase0-A (20:57Z), phase4-A (21:09Z), B (21:14Z), D (21:50Z), C (22:01Z). Cell D started ~36 minutes after cell B, and d_rep1_start.txt records kwin PID 77581, so D ran on a fresh session created after the kwin revert, not on the session that A and B shared (kwin PID 53655). Cell C ran on yet another new session (kwin PID 82420). The campaign notes this in phase4_findings “open questions” as potential ambient drift in cell C, but it doesn't ask the symmetric question: whether running A and B back to back on one long-lived session introduced a drift pattern into those two cells that wouldn't apply to the fresh-session cells D and C.
§2 Threshold concerns
The kwin_cpu_median threshold bump from 0 to 0.5 is defensible as a decision but not as stated. The phase0 cell A had kwin_cpu_median = 12.0 in all three reps (range 0.0). The justification for inflating the threshold to ±0.5 is that top -d 1 rounds to integer percent. That is correct as far as it goes, but the implication is wrong: if top rounds to integers, then any non-zero difference between cells will be at least 1.0, not 0.5. Setting the threshold at 12.5 is operationally equivalent to setting it at 12.99 — the only values top will return are 12.0 or 13.0, never 12.5. So the 0.5 threshold is conservative in the right direction (avoids false positives) but its stated rationale (“floor because top rounds”) is imprecise. It effectively means “any cell with median ≥ 13.0 triggers the threshold.” Cell B's median is 13.0, which is exactly at the trigger. If the threshold had been stated correctly as “rounds-to-1%” it would have been ±1.0, and cell B's kwin_cpu delta of +2.0 would still trigger but you'd be clearer about what you're measuring.
The cell-A median used in phase4_findings.md is from the phase4 A reps, but the IQR thresholds were locked from the phase0 A reps. This means the anchor point (what “A median” means) is different in the threshold table vs the delta computation table. Specifically: drops_60s threshold was “≥21 to trigger” (phase0 median 15 + delta 5 = 20; but the wording says ≥21). Cell D's drops_60s median is 24, which clears it either way. Cell B's drops_60s delta of +4 is computed as B(16) − A_phase4(12), giving +4, which is below the ±5 threshold. But B(16) − A_phase0(15) = +1, even more clearly below threshold. So this inconsistency doesn't change any verdict, but the pass/fail table in phase4_findings.md cites “+4 = within ±5” while the drops floor it's compared against is the phase4 median (12), not the threshold anchor (15). The documentation is subtly self-contradictory and would mislead anyone trying to verify the math.
panfrost_mean_freq threshold of ±10 MHz is derived from the phase0 range of 8.9 MHz, rounded up to 10. Reasonable. The phase4 cell A range is 607.2 − 596.4 = 10.8 MHz, slightly wider than the phase0 range. If the phase4 range had been used as the threshold, it would have rounded to ±11 MHz, and cell C's −11.2 delta would sit essentially at the threshold (0.2 MHz past it) rather than reading as “marginal wrong direction”. This has no verdict impact since cell C is chaff by all other metrics, but it illustrates the phase0-vs-phase4 baseline confusion again.
§3 Confounds the campaign missed
The “daily Brave ambient” was present for the phase0 cell A reps but gone by the phase4 matrix. The phase0 state snapshot notes “operator's daily Brave (PID 58105 etc.) running in background, ~13% CPU idle. Documented as stable ambient confound across all three [phase0 A] reps.” In phase4_findings.md, the README says “daily Brave killed for clean ambient”, so by the time the phase4 matrix was run, the ambient Brave was gone. This means the phase4 A reps ran in a cleaner ambient than the phase0 A reps, which is probably why browser_cpu_median dropped from ~56 to ~54 (the background Brave is no longer consuming CPU that could have been attributed to the workload browser's process group, depending on how aggregation works). The confound is explicitly tracked in phase0 as “stable” and was apparently resolved before phase4, but the two A-rep runs are never reconciled or explained. Why was cell A run twice?
Session age asymmetry within cell D. This is the one the active session noticed. But it didn't note the directional implication: cell D reps started at 21:50Z, roughly 41 minutes after the first phase4 cell A rep (21:09Z). The kwin_wayland PID (77581) for cell D is a fresh post-revert session. If a fresh KWin session has a warm-up cost (dbus listeners registering, initial frame pipeline setup, shader compilation) that settles within a few minutes, and all three D reps showed near-identical kwin CPU of ~32.9%, then this is a steady state, not a fresh-session artifact. The signal is too large and too stable across reps to be a warm-up artifact. But the campaign missed the inverse question: are cells A and B *benefiting from an older, warmed-up compositor state* (kwin PID 53655 had been running since before the phase0 runs, which started at ~20:57Z, i.e. kwin was at least ~35 minutes old before any data was collected)? An older KWin session will have shader caches warm, GL state cached, etc. Cell D's fresh KWin might genuinely be doing more work because it hasn't compiled/cached what kwin-fourier-patched KWin had already compiled by the time cell A ran. This is a concrete alternative explanation for part of the cell D vs A delta that is not the kwin-fourier patch itself.
However, the shader-cache explanation for cell D's 32.9% vs 11% kwin CPU is implausible: shader compilation is a one-time cost per boot/profile, not a 60-second continuous burn. The sustained 95% GPU peak-freq residency in cell D is not consistent with shader compilation overhead; it's consistent with the watchDmaBuf spin diagnosis. So this confound is unlikely to explain cell D's result, but it should have been named and dismissed explicitly.
cell D ran before cell C, meaning when the kwin revert happened, qt6-base was still the fourier version. Cell C ran with stock qt6-base but fourier kwin. The sequence was: all-on → chromium-off (B) → kwin-off (D) → qt6-off (C). After cell C finished, both kwin and qt6 were presumably reverted (or cell C reverted qt6, and… what happened to kwin for cell C?). The c_rep1_start.txt shows kwin-fourier 1:6.6.4-3 is present, which means between cell D (kwin-off) and cell C (qt6-off), kwin-fourier was reinstalled. The revert log only shows kwin revert, no kwin reinstall log. This means there was a reinstall step between D and C that isn't in evidence. That's not a confound per se — the start.txt confirms packages were verified per-rep — but the session order (D before C) means cell C ran on a KWin session that was created *after* the kwin-fourier reinstall + logout+login. The phase0 notes describe autologin via 99-autologin-fourier-attribution.conf — and the kwin PID changes confirm each logout+login created a fresh session. So cell C is the *second* fresh session (after D), and cell A/B are on the *original long-uptime session*. This makes A/B vs C/D a confounded comparison on session age independent of the package toggle.
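The session grouping can be checked mechanically from the start.txt files. The sketch below is one way to do it, assuming each start.txt collapses to a single line as in the excerpts above; only the timestamp, REP_ID, and kwin_wayland_pid fields are copied from those excerpts, and the function names are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Timestamp, REP_ID and kwin_wayland_pid copied from the start.txt excerpts;
# the other fields on each line are omitted here for brevity.
START_LINES = [
    "2026-05-03T21:09:44Z REP_ID=a_rep1 kwin_wayland_pid=53655",
    "2026-05-03T21:14:21Z REP_ID=b_rep1 kwin_wayland_pid=53655",
    "2026-05-03T22:01:37Z REP_ID=c_rep1 kwin_wayland_pid=82420",
    "2026-05-03T21:50:32Z REP_ID=d_rep1 kwin_wayland_pid=77581",
]

LINE_RE = re.compile(r"^(\S+Z) REP_ID=(\w+).*?kwin_wayland_pid=(\d+)")

def parse_start(lines):
    """Return ({rep_id: start time}, {kwin pid: [rep_ids]})."""
    started, sessions = {}, defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines without the expected fields
        ts, rep_id, pid = m.groups()
        started[rep_id] = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")
        sessions[int(pid)].append(rep_id)
    return started, dict(sessions)

def minutes_between(started, earlier, later):
    """Elapsed minutes between the start times of two reps."""
    return (started[later] - started[earlier]).total_seconds() / 60.0

started, sessions = parse_start(START_LINES)
for pid, reps in sessions.items():
    print(pid, reps)
print("a_rep1 -> d_rep1 (min):", round(minutes_between(started, "a_rep1", "d_rep1"), 1))
print("b_rep1 -> d_rep1 (min):", round(minutes_between(started, "b_rep1", "d_rep1"), 1))
```

Emitting this grouping alongside the findings table would make the session-age confound visible per cell rather than leaving it to be reconstructed from PIDs.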
§4 Verdict robustness per package
kwin-fourier: WHEAT, robust.
The signal is 3× kwin CPU (11→33%), 170 MHz mean GPU freq jump (~28% of scale), 95% peak-freq residency (vs 35%), and 2× drops. All three cell D reps are near-identical — kwin_cpu_median is exactly 32.9% in all three, panfrost_mean_freq is 783/777/775. This is the tightest cell in the whole matrix and the largest delta. No plausible confound explains this away: the shader-cache alternative is dismissed above (it doesn't produce a 60-second continuous GPU burn at 95% peak residency). The session-age asymmetry (fresh D vs long-uptime A) would, if anything, help cell D by having a cleaner cache state, but it doesn't explain the sustained GPU saturation.
Confidence: very high. If the verdict here were wrong, you'd need to argue that kwin PID 77581's fresh session caused the GPU to continuously max-freq for the entire 60-second window for a reason unrelated to the kwin packages, which is not credible.
chromium-fourier: WHEAT-but-fragile verdict.
The claimed evidence is: fps −0.83 (Brave 22.91–23.43 vs chromium-fourier 24.0–24.01), browser_cpu_median +82.75pp (137 vs 54.4), kwin_cpu_median +2.0 (13.0 vs 11.0). All three are confounded by the Brave-vs-Chromium-149 version gap.
The fps delta is the clearest problem. Cell B fps values are {23.43, 23.18, 22.91} — a declining trend across the three reps (B1→B2→B3, each about 0.25 fps lower). Cell A's fps is locked at 24.00–24.01. This declining trend across Brave reps suggests something is drifting within the Brave session (progressive video decoder stall? buffer pressure?), not a stable property of the browser. The effective_fps metric in the extractor is computed as (frames_60s - frames_5s) / (s60[0] - s5[0]), and cell B's frames_60s are {1414, 1392, 1367} — genuinely delivering fewer frames, not just a measurement artifact. Whether this is a Chromium-147 decoder limitation, a Brave-specific regression vs Chromium-149, or a chromium-fourier patch effect cannot be separated.
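For reference, the quoted formula can be written out as below. This is a sketch, not the campaign's extractor: it assumes `s5` and `s60` are `(timestamp_seconds, frame_count)` pairs, and the sample timestamps (5.0 and 60.0) and `frames_5s=125` used in the call are illustrative placeholders, so the result will not exactly match the reported cell B value. The per-rep fps list is the one quoted above.

```python
def effective_fps(s5, s60):
    """(frames_60s - frames_5s) / (t60 - t5) for two (time, frames) samples."""
    t5, frames_5s = s5
    t60, frames_60s = s60
    elapsed = t60 - t5
    if elapsed <= 0:
        raise ValueError("s60 must be later than s5")
    return (frames_60s - frames_5s) / elapsed

# Illustrative only: real sample timestamps are not given in this review.
example = effective_fps((5.0, 125), (60.0, 1414))
print(round(example, 2))

# Rep-to-rep change in the reported cell B fps values.
b_fps = [23.43, 23.18, 22.91]
steps = [round(later - earlier, 2) for earlier, later in zip(b_fps, b_fps[1:])]
print(steps)
```

A monotonic step sequence across only three reps is suggestive rather than conclusive; more reps would be needed to separate drift from noise.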
The browser_cpu_median of 137pp vs 54.4pp is a 2.5× gap. This is real. Brave with Chromium-147 base is consuming 2.5× more CPU for the same workload. But Chromium-147 vs 149 is a two-major-version delta, which on its own can plausibly explain a CPU difference of this size in a decoder-heavy workload (codec path changes, VA-API usage patterns, zero-copy buffer handling). The chromium-fourier patches (Step 1 = libva-v4l2-request port, Step 2 = WaylandConnection overlay-route) are precisely the kind of changes that would reduce browser CPU by enabling hardware decode paths, but you cannot tell from this matrix whether those paths are also present in Brave-147, absent in Brave-147, or partially present with different efficiency.
The kwin_cpu_median of +2.0 (11→13%) is, as noted above, right at the rounding threshold. It's suggestive that Brave presents frames less efficiently to KWin, but at N=3 with integer-rounded values, it's barely more than a 1-sample wide signal.
My independent verdict: call it WHEAT-suspected-but-unconfirmed. The direction is clear, the magnitude is large, but the control comparison is the wrong browser at the wrong version. The campaign's own caveat is correct — you cannot call this “chromium-fourier delivers benefit” cleanly; you can only call it “chromium-fourier + chromium-149 base is substantially better than Brave-1.89/Chromium-147 on this workload.” The confound is load-bearing. I would not ship the chromium-fourier conclusion to anyone making a package maintenance decision without the Chromium-149 vanilla control.
qt6-fourier: CHAFF on this workload, verdict sound.
Zero of five metrics moved beyond threshold when qt6-base was reverted to stock. The panfrost mean freq delta is −11.2 MHz (slightly *lower* GPU usage without qt6-fourier), which is the wrong direction for “the patch helps.” Cell C reps have wider variance than cell A (drops_60s 5/10/14, browser_cpu 52.2/54.15/60.2), which the campaign correctly flags. However, c_rep1's browser_cpu_median is 60.2 — which is 5.8 above the cell A baseline of 54.4 — and the threshold is “+1.” If all three C reps had been like c_rep1, cell C would have been a false-positive wheat verdict on browser_cpu. The fact that c_rep2 (54.15) and c_rep3 (52.2) are both at or below baseline suppresses this. The variance in cell C is real and should be noted as a reliability concern for the qt6-fourier verdict, not just flagged as “wider than expected.” Had the campaign run N=5 for cell C, or had c_rep1's values been closer to the mean, the verdict might have been uncertain rather than confidently chaff.
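The fragility at N=3 is easy to see by applying the same threshold per rep instead of per cell median. The sketch below uses the browser_cpu_median values quoted above; the variable names are hypothetical and the `delta > 1.0` rule is an assumed reading of the "+1" threshold.

```python
from statistics import median

baseline = 54.4   # cell A browser_cpu_median used in phase4_findings.md
trigger = 1.0     # assumed rule: trigger when delta > +1

cell_c = {"c_rep1": 60.2, "c_rep2": 54.15, "c_rep3": 52.2}

# Verdict as the campaign computes it: median of the cell vs baseline.
median_delta = median(cell_c.values()) - baseline
median_triggers = median_delta > trigger

# Same rule applied to each rep on its own.
triggered_reps = [rep for rep, v in cell_c.items() if v - baseline > trigger]

print("median delta:", round(median_delta, 2), "triggers:", median_triggers)
print("reps that would trigger alone:", triggered_reps)
```

One of three reps crossing the threshold on its own is a reasonable cue to add reps before calling the cell.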
The workload-specificity caveat is well-stated. “CHAFF on bbb 1080p H.264 Chromium-149” is correct. “CHAFF generally” is not supported.
§5 Cheapest next campaign
Run a Chromium-149 vanilla control cell (cell E) to de-confound the chromium-fourier verdict.
This is the single highest-value next step. The action is: obtain or build a stock Chromium-149 binary (without the Step 1 libva-v4l2-request port and without the Step 2 WaylandConnection overlay-route patches) and run it as cell E with qt6-base-fourier and kwin-fourier both on. Note that this reuses the label “cell E” for the vanilla-browser control; it is not the all-fourier-off cell E whose absence §1 flags. Compare against cell A (chromium-fourier on) and cell B (Brave-147).
Cost: the main effort is building or obtaining Chromium-149 vanilla for aarch64. The predecessor ohm_gl_fix campaign built chromium-fourier from source, so a same-version vanilla build is feasible — it's the same build without applying the fourier patches. If the build artifacts from the ohm_gl_fix campaign are still around, a stripped binary might be constructible faster. Alternatively, checking whether the Arch aarch64 Chromium package (not fourier-patched) is at version 149 would give a zero-effort control — if it's already 149 in the repo, pacman -S chromium may be sufficient.
If cell E shows vanilla-149 performs close to cell A (chromium-fourier), the verdict becomes “the benefit was version-level, not patch-level.” If cell E is close to cell B (Brave-147), the verdict strengthens to “patches matter, not version.” If cell E is somewhere in between, you have a partial attribution.
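The three outcomes can be fixed in advance as a decision rule, so the interpretation is not chosen after seeing the data. The sketch below is one hypothetical way to do that: the function name and the 25% "near" band are assumptions introduced here, not campaign thresholds, and the A and B values are the browser_cpu_median figures quoted in this review.

```python
def interpret_cell_e(e_value, a_value, b_value, near=0.25):
    """Place cell E between cell A (0.0) and cell B (1.0) and label it.

    near is an assumed tolerance band, as a fraction of the A-to-B span.
    """
    span = b_value - a_value
    if span == 0:
        raise ValueError("cells A and B are identical on this metric")
    position = (e_value - a_value) / span
    if position <= near:
        return "version-level"   # vanilla-149 behaves like chromium-fourier
    if position >= 1.0 - near:
        return "patch-level"     # vanilla-149 behaves like Brave-147
    return "partial"

A_CPU, B_CPU = 54.4, 137.0       # browser_cpu_median, cells A and B
for hypothetical_e in (56.0, 95.0, 130.0):
    print(hypothetical_e, interpret_cell_e(hypothetical_e, A_CPU, B_CPU))
```

The same rule could be applied to effective_fps, with the band width locked before the cell is run.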
This needs only one additional cell at N=3, targeting exclusively the browser_cpu_median and fps metrics (kwin_cpu and GPU freq were not the primary indicators for cell B). The campaign infrastructure (orchestrator script, test rig) is already in place; the only new work is producing the binary.
Nothing else in the open-question list is cheaper or higher-signal for the stated campaign question.
