fourier_attribution — Phase 5 review (2026-05-03)

Single-question matrix campaign. Operator: mfritsche. Reviewer: handover to fresh model instance per 8(+1)-phase loop Phase 5. Local artifacts: /home/mfritsche/src/fourier_attribution/ on noether.

Predecessors (substrate, NOT data): ohm_gl_fix (closed 2026-05-02), kwin_overlay_subsurface (closed 2026-05-03 without patch), x11-session-research (closed 2026-05-03 negative on both axes). This campaign acts on the replicate-baseline-first lesson — every binding cell number from in-session reps.

README (campaign overview, uncurated)

fourier_attribution

Single-question campaign on PineTab2 RK3568: for each of qt6-fourier, kwin-fourier, and chromium-fourier, does that package deliver a measurable, replicable benefit on bbb 1080p first-60s playback, as judged by toggling it off and comparing against the all-on baseline?

Spun off 2026-05-03 from the fourier umbrella, with strict scope discipline. The earlier fourier work (now dormant at ~/src/fourier/) digressed into fix-this-fix-that across browsers, mpv, libva, libplacebo, KWin, qt6, and kernel; this campaign separates wheat from chaff by attribution measurement under controlled package toggling.

In-scope

Hardware: ohm (PineTab2 — RK3568, Mali-G52 MP2, hantro G1/G2 VPU)
Workload: `bbb_1080p30_h264.mp4` first 70 seconds via `brave_drops_test.html`
Three Arch packages: `qt6-base-fourier`, `kwin-fourier`, `chromium-fourier`

(chromium-fourier in this campaign = the binary at /tmp/chromium-ohm-gl-fix-step2/chrome — ohm_gl_fix Step 1 + Step 2 patches)

Single Wayland session — Plasma 6.6.4 with KWin compositor

Out-of-scope (would have qualified before, scope-policed now)

Any other fourier-flavoured patches (firefox-fourier, kernel-vb2-dma-resv, libva-v4l2-request port)
Any patch authoring or upstream activity
X11 sessions / non-compositing WMs (closed in `x11-session-research`)
Δ_present-46ms side-finding (predecessor's hook; not the question here)
mpv, gst-launch, or any non-browser client
Other clips, other resolutions, other codecs

Predecessors (substrate, NOT data)

This is the fourth campaign in a chain on the same hardware target:

[`../ohm_gl_fix/`](../ohm_gl_fix/) — closed 2026-05-02. Diagnosed Step 1 (libva-v4l2-request port) and Step 2 (Chromium WaylandConnection overlay-route) issues; produced the `/tmp/chromium-ohm-gl-fix-step2/chrome` binary used here.
[`../kwin_overlay_subsurface/`](../kwin_overlay_subsurface/) — closed 2026-05-03 *without patch*. Premise (cage = 0 floor at N=1) didn't replicate at N=3. Lesson now codified in `feedback_replicate_baseline_first.md`.
[`../x11-session-research/`](../x11-session-research/) — closed 2026-05-03 *with negative result on both axes*. X11 + non-compositing WM is *worse* than Plasma Wayland on this hardware/workload; X server doesn't program Plane 39 with NV12 regardless of client.
This campaign (`fourier_attribution`) — opened 2026-05-03.

Per feedback_replicate_baseline_first.md: predecessor *measurement numbers* (drop counts, CPU%, freq) are reference history only. Every binding cell in this campaign anchors to in-session-acquired data.

Process

8(+1)-phase loop per ~/.claude/projects/-home-mfritsche-src/memory/feedback_dev_process.md. Phase 0 substrate captured 2026-05-03 across phase0_evidence/:

`state_2026-05-03.md` — package state matches predecessor carry-over claim; revert paths clean (same-upstream-version stock packages cached)
`test_rig_audit_2026-05-03.md` — predecessor test rig is intact; instrumentation reports per-second drops trajectory (sufficient for `drops_5s` / `drops_60s`); no Step 2 feature-flag opt-in needed (compiled in)
`devfreq_probe_2026-05-03.md` — panfrost devfreq at `/sys/class/devfreq/fde60000.gpu/`; trans_stat + cur_freq parseable; governor `simple_ondemand` (dynamic — workload-sensitive signal)

Phase 1 binding cells locked at:

`drops_5s` — drops in first 5s (warmup phase, file load + cache fill; reported but excluded from pass/fail)
`drops_60s` — drops over the 60s steady-state window (5s–65s)
`effective_fps` over 60s
`browser_cpu_pct` — top sampling 5s–65s
`kwin_cpu_pct` — same window
`panfrost_mean_freq_mhz` — trans_stat-diff weighted mean over 5s–65s
`panfrost_peak_freq_pct` — % of 60s spent at 800 MHz (Mali max)
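
The two panfrost cells can be sketched as plain arithmetic over per-frequency residencies. This is a minimal sketch, not the campaign's `post_process_attribution.sh`: the function names are hypothetical, and it assumes the trans_stat snapshots have already been parsed into `{MHz: ms}` dictionaries. In the per-rep summaries the residencies sum to `panfrost_total_window_ms`, so the sketch uses that total as the denominator.

```python
def residency_diff(before_ms, after_ms):
    """Per-frequency time spent between two trans_stat snapshots (ms)."""
    return {mhz: after_ms[mhz] - before_ms.get(mhz, 0) for mhz in after_ms}

def weighted_mean_mhz(residency_ms):
    """Residency-weighted mean frequency in MHz."""
    total = sum(residency_ms.values())
    return sum(mhz * ms for mhz, ms in residency_ms.items()) / total

def peak_pct(residency_ms, peak_mhz=800):
    """Share of the window spent at the peak frequency, in percent."""
    total = sum(residency_ms.values())
    return 100.0 * residency_ms.get(peak_mhz, 0) / total

# Residencies copied from a_rep1_summary.txt (panfrost_residency_*MHz_ms).
a_rep1 = {200: 1356, 300: 485, 400: 25080, 600: 19625, 700: 1897, 800: 26238}

print(round(weighted_mean_mhz(a_rep1), 1))  # 596.4, matches panfrost_mean_freq_mhz
print(round(peak_pct(a_rep1), 1))           # 35.1, matches panfrost_peak_freq_pct
```

Feeding it the a_rep1 residencies reproduces the two values reported in that rep's summary.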

Pass/fail threshold for “P delivers measurable benefit”: P-off cell increases drops_60s beyond all-on N=3 IQR, or any of {browser_cpu_pct, kwin_cpu_pct, panfrost_mean_freq_mhz} increases beyond its all-on N=3 IQR.

Note: this campaign uses 5s warmup boundary per operator instruction. The predecessor's phase3_prime_runs/post_process.sh uses 10s. The boundary divergence makes our drop counts not numerically comparable to predecessor numbers; that's intentional under campaign-contained data discipline.

Matrix

Cell	qt6-fourier	kwin-fourier	chromium-fourier	Browser	Notes
A	on	on	on	chromium-fourier binary	baseline; no revert
B	on	on	off	brave-bin 1.89	no package revert; binary swap only
C	off	on	on	chromium-fourier binary	revert qt6-base, logout+login
D	on	off	on	chromium-fourier binary	revert kwin, logout+login

N=3 reps per cell. 12 reps total. 70s window per rep, 60s steady-state at 5s–65s.
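
The matrix above can be written down as a small run plan. This is a sketch only: the package names and the `a_rep1`-style IDs follow the table and the `REP_ID` values in the per-rep summaries, while the data layout and the `rep_plan` helper are hypothetical and not part of the campaign's orchestrator.

```python
PACKAGES = ("qt6-fourier", "kwin-fourier", "chromium-fourier")

# Which single package is toggled off in each cell (None = all on).
CELL_OFF = {
    "A": None,
    "B": "chromium-fourier",
    "C": "qt6-fourier",
    "D": "kwin-fourier",
}
REPS_PER_CELL = 3

def rep_plan():
    """Return [(rep_id, {package: 'on' | 'off'}), ...] in run order."""
    plan = []
    for cell, off_pkg in CELL_OFF.items():
        state = {pkg: ("off" if pkg == off_pkg else "on") for pkg in PACKAGES}
        for rep in range(1, REPS_PER_CELL + 1):
            plan.append((f"{cell.lower()}_rep{rep}", state))
    return plan

for rep_id, state in rep_plan():
    print(rep_id, state)
```

Each non-baseline cell differs from cell A in exactly one package, which is what makes the per-package attribution readable from the cell medians.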

Repository

Local only for now (private working tree). Push to gitea later if the campaign reaches a useful conclusion worth publishing.

phase0_findings.md (uncurated)

Phase 0 — fourier_attribution

This file is the umbrella reference for Phase 0 work. Detail evidence files in phase0_evidence/:

`state_2026-05-03.md` — package + kernel + governor + session state on ohm
`test_rig_audit_2026-05-03.md` — predecessor test rig (`brave_drops_test.html`, `run_browser_nodebug.sh`, `post_process.sh`) audit, Cell B Brave invocation finalised, Step 2 feature-flag question resolved
`devfreq_probe_2026-05-03.md` — panfrost devfreq instrument verification

Research question (LOCKED 2026-05-03)

*“For each of qt6-fourier, kwin-fourier, and chromium-fourier on PineTab2 RK3568, is there a measurable, replicable delta on bbb 1080p first-60s playback when that package is toggled off (replaced by stock equivalent), versus the all-on baseline?”*

Mechanism the question targets

Each fourier package alters a different layer of the playback stack:

`chromium-fourier` ships Step 1 (libva-v4l2-request hantro multi-planar / chromium-149 era port) and Step 2 (Chromium WaylandConnection patch — overlay-route engages on KWin). Predecessor evidence (`ohm_gl_fix/phase3_remeasure_2026-05-02/task23_per_frame_route.md`) shows Step 2 engages with the chromium-fourier binary; whether engagement translates to measurable benefit is what this campaign tests.
`qt6-base-fourier` patches `qt6-base` for the `GL_ALPHA` stall. KWin links libQt6Gui; if KWin is the GL-composite step (per `wayland_baseline_2026-05-03/drmprobe_findings.md`, Plane 39 rotates ABGR8888 framebuffers under all reps), the qt6-fourier patch could affect KWin's per-frame work.
`kwin-fourier` patches KWin for the `watchDmaBuf` fence-wait issue. Direct effect on KWin's compositor scheduling.

The campaign does not commit to any of these mechanisms. It only measures whether toggling each off shifts the binding cells.

Predecessor close-out summary (context, not data)

`ohm_gl_fix` closed 2026-05-02 with the Step 2 WaylandConnection patch landed but Phase 1r `drops_post_warmup == 0` met *only* under cage. Spun off the residual cost into `kwin_overlay_subsurface`.
`kwin_overlay_subsurface` closed 2026-05-03 *without patch*. Premise was “cage = 0 floor at N=1 from Phase 0” — at N=3 in the closing session, the floor was missing (post-warmup median 26).
`x11-session-research` closed 2026-05-03 *with negative result on both axes*. X11 + xfwm4-no-comp produces *more* drops than Plasma Wayland with KWin compositing. X server doesn't program Plane 39 with NV12 regardless of client (mpv-xv, mpv-gpu — neither route engages).

The lesson from kwin_overlay_subsurface: don't anchor a campaign on N=1 historical numbers. This campaign acts on that lesson — every binding cell number comes from in-session reps acquired in this campaign's session.

Open questions before Phase 1 lock — resolved in Phase 0

Chromium-fourier off → Brave or stock Chromium 149? Resolved: Brave (operator sign-off 2026-05-03). Two-major-version delta (Brave 147 vs chromium-fourier 149) documented as known confound.
Reversibility on ohm — cell C/D revert path. Resolved: pacman cache has stock kwin-6.6.4-1 and qt6-base-6.11.0-2 at the *same upstream version* as the fourier-patched ones. Logout + auto-login per cell (operator instruction).
5s vs 10s warmup boundary. Resolved: 5s (operator instruction). Diverges from predecessor's 10s in `phase3_prime_runs/post_process.sh`.
Step 2 feature-flag opt-in name. Resolved: there is no flag. Step 2 patch is compiled-in and engages by default.
Window-size pin. Resolved: do not pin (mirror predecessor invocation; the video element is fixed at 800×450 in the .html anyway).
Where does the orchestrator live? Resolved: new `~/fourier-attribution/` dir on ohm, separate from predecessor's `phase3_prime_runs/`. Carries `run_browser_attribution.sh` (devfreq capture added) and `post_process_attribution.sh` (5s warmup boundary; mean-freq + peak-freq% derived from trans_stat diff).

What Phase 0 has produced

Locked research question + mechanism + experimental matrix (4 cells × N=3)
Predecessor test rig audit + Cell B Brave invocation
Panfrost devfreq instrument verification
Phase 0 in-session baseline anchor of cell A — *to be acquired in task #54*

Once #54 lands, Phase 1 binding cells lock with cell-A IQR widths in evidence, and the pass/fail thresholds are concrete.

phase0_evidence/baseline_a_2026-05-03.md (uncurated)

Phase 0 — cell A in-session baseline N=3 — 2026-05-03

Cell A = all three fourier packages on (qt6-base-fourier 1:6.11.0-3, kwin-fourier 1:6.6.4-3, chromium-fourier 149.0.7812.0 binary), Plasma Wayland, KWin compositor.

Ambient conditions during reps (per `start.txt`)

Kernel: `6.19.10-danctnix1-1-pinetab2`
Active session: tty2 Plasma Wayland (kwin_wayland PID 53655)
Operator's daily Brave (PID 58105 etc.) running in the background, using ~13% CPU while idle. Documented as a stable ambient confound across all three reps.
Reps run consecutively, each ~95s wall, with ~10s settle gap between
Rep timestamps: a_rep1 22:51 CEST, a_rep2 22:55 CEST, a_rep3 22:58 CEST

Per-rep summary (raw evidence in `baseline_a_2026-05-03/*_summary.txt`)

metric	a_rep1	a_rep2	a_rep3	min	max	range
drops_5s	10	11	7	7	11	4
drops_60s	20	15	15	15	20	5
drops_post_5s	10	4	8	4	10	6
effective_fps	23.99	23.99	24.00	23.99	24.00	0.01
frames_5s	121	121	121	121	121	0
frames_60s	1441	1442	1441	1441	1442	1
kwin_cpu_median	12.0	12.0	12.0	12.0	12.0	0.0
kwin_cpu_mean	11.97	12.02	11.98	11.97	12.02	0.05
kwin_cpu_iqr (per-rep)	1.8	2.0	1.0	1.0	2.0	1.0
browser_cpu_median	56.6	56.25	56.0	56.0	56.6	0.6
browser_cpu_mean	61.03	61.69	61.17	61.03	61.69	0.66
browser_cpu_iqr (per-rep)	9.6	9.2	5.9	5.9	9.6	3.7
panfrost_mean_freq_mhz	600.1	591.2	594.6	591.2	600.1	8.9
panfrost_peak_freq_pct	35.3	34.0	35.0	34.0	35.3	1.3
therm_pre_milliC	45000	45555	45555	45000	45555	555
therm_drift_c	10.6	7.8	7.8	7.8	10.6	2.8

Cell A IQR widths (locked for Phase 1 thresholds)

The N=3 range is used as a conservative IQR proxy (with N=3, p25 ≈ min, p75 ≈ max). Tighter than 95% CI but appropriate for the small N.

`drops_60s` range: 5
`effective_fps` range: 0.01 (effectively zero — very stable)
`kwin_cpu_median` range: 0.0 (exact tie across reps — top -d 1 rounds to integer percent at this CPU level)
`browser_cpu_median` range: 0.6
`panfrost_mean_freq_mhz` range: 8.9 (≈ 1.5% of mean)
`panfrost_peak_freq_pct` range: 1.3 (percentage points)
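
The range-as-IQR-proxy computation is small enough to show in full. A minimal sketch, with the per-rep values copied from the Phase 0 table above; the `spread` helper is a hypothetical name.

```python
from statistics import median

# a_rep1, a_rep2, a_rep3 values from the Phase 0 per-rep summary table.
PHASE0 = {
    "drops_60s": [20, 15, 15],
    "effective_fps": [23.99, 23.99, 24.00],
    "kwin_cpu_median": [12.0, 12.0, 12.0],
    "browser_cpu_median": [56.6, 56.25, 56.0],
    "panfrost_mean_freq_mhz": [600.1, 591.2, 594.6],
    "panfrost_peak_freq_pct": [35.3, 34.0, 35.0],
}

def spread(values):
    """Return (median, max - min); at N=3 the range stands in for the IQR."""
    return median(values), round(max(values) - min(values), 2)

for metric, values in PHASE0.items():
    med, rng = spread(values)
    print(f"{metric}: median={med} range={rng}")
```

With only three reps the middle value is the median and the other two are the extremes, so the range is the widest spread the data can support.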

Observations

Effective fps is essentially locked at 24.0. The clip is 24 fps despite the `_30` in the filename. All three reps decode the full 60-second window's frames within 0.01 fps of each other. The test is *not* fps-bound; binding-cell discrimination must come from drops, CPU%, or panfrost freq.

CPU metrics are exceptionally stable. kwin_cpu_median is 12.0% in all three reps to 0.1% precision. browser_cpu_median spans 0.6 percentage points. This means tiny shifts when cells B/C/D run will be detectable.

panfrost_mean_freq_mhz is 591–600 MHz — well below the 800 MHz peak. Mali isn't pegged. The simple_ondemand governor is dialling between 400 / 600 / 800 MHz to track demand. Time at 800 MHz peak is consistently ~35% of the window.

drops_60s ranges 15–20. Predecessor's `kwin_timing_nodebug_rep1` reported drops_total=27 (no campaign-data import — cited only as broad ballpark verification). Our numbers cluster slightly lower; not surprising given ambient-state differences (different uptime, different daily-Brave activity, different cache state). The campaign-contained data discipline holds: this is OUR baseline; comparison is against THESE reps' IQR.

drops_post_5s spans 4–10. This is the largest relative variance (factor of 2.5×). Drops *during steady state* are the most rep-to-rep variable signal. Whether a cell-toggle moves this beyond the IQR will be the most stringent threshold test.
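
When reading the three drop counters together, it helps to know how they relate. In the Phase 0 table, `drops_60s` equals `drops_5s + drops_post_5s` in every rep, so `drops_post_5s` is the post-warmup share of `drops_60s`. A short check, with the values copied from that table (the tuple layout is an assumption of this sketch):

```python
# (drops_5s, drops_60s, drops_post_5s) per rep, from the Phase 0 table.
DROPS = {
    "a_rep1": (10, 20, 10),
    "a_rep2": (11, 15, 4),
    "a_rep3": (7, 15, 8),
}

def counters_consistent(drops_5s, drops_60s, drops_post_5s):
    """True when the total is the sum of the warmup and post-warmup parts."""
    return drops_60s == drops_5s + drops_post_5s

for rep_id, counters in DROPS.items():
    print(rep_id, counters_consistent(*counters))

post = [c[2] for c in DROPS.values()]
print("drops_post_5s max/min ratio:", max(post) / min(post))
```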

Thermal drift 7.8–10.6°C in 90s indicates the SoC is under genuine sustained load. No throttling triggered (max temp around 56°C; throttle threshold ~85°C).
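
The headroom claim follows from the pre-rep temperature plus the drift. A minimal sketch, assuming `therm_drift_c` is the post-minus-pre difference in °C (which is how the Phase 4 summaries' pre/post values relate) and taking the ~85°C throttle threshold from the paragraph above:

```python
THROTTLE_C = 85.0  # approximate throttle threshold quoted in the text

# (therm_pre_milliC, therm_drift_c) per rep, from the Phase 0 table.
REPS = {
    "a_rep1": (45000, 10.6),
    "a_rep2": (45555, 7.8),
    "a_rep3": (45555, 7.8),
}

def end_temp_c(pre_milli_c, drift_c):
    """End-of-rep temperature in deg C, assuming drift = post - pre."""
    return pre_milli_c / 1000 + drift_c

hottest = max(end_temp_c(*v) for v in REPS.values())
print(f"hottest end-of-rep temperature: {hottest:.1f} C")
print(f"headroom to throttle threshold: {THROTTLE_C - hottest:.1f} C")
```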

Cell A IQR-based pass/fail thresholds (Phase 1 lock candidate)

A package P delivers measurable benefit if the P-off cell at N=3 exceeds ANY ONE of the following deltas relative to the cell A median:

metric	cell A median	threshold delta (≈ N=3 range)	“measurable benefit if P-off exceeds”
drops_60s	15	+5	≥ 21
effective_fps	24.0	-0.05	≤ 23.95
kwin_cpu_median	12.0	+0.5	≥ 12.5 (ranges hit 0.0; 0.5 is conservative floor)
browser_cpu_median	56.0	+1	≥ 57.0
panfrost_mean_freq_mhz	595	+10	≥ 605

The thresholds are deliberately conservative on metrics where the cell A range was zero or near zero (kwin_cpu_median at 0.0, effective_fps at 0.01): these are inflated to 0.5 and 0.05 respectively to avoid spurious significance from rounding artefacts. The other deltas round the observed range up (browser_cpu_median 0.6 to 1, panfrost_mean_freq_mhz 8.9 to 10). A package needs to hit ANY ONE of the threshold deltas to count as having measurable benefit; a package that hits NONE is “no measurable benefit on this matrix”, i.e. explicit chaff.
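
The any-one rule can be expressed directly against the cut-offs in the table. This is a sketch under stated assumptions: the cut-off values are copied from the last table column, while the `CUTOFFS` layout and the helper names are hypothetical.

```python
import operator

# Cut-offs from the "measurable benefit if P-off exceeds" column.
CUTOFFS = {
    "drops_60s": (operator.ge, 21),
    "effective_fps": (operator.le, 23.95),
    "kwin_cpu_median": (operator.ge, 12.5),
    "browser_cpu_median": (operator.ge, 57.0),
    "panfrost_mean_freq_mhz": (operator.ge, 605),
}

def tripped(off_cell):
    """Metrics whose P-off median reaches its cut-off."""
    return [m for m, (cmp, cut) in CUTOFFS.items()
            if m in off_cell and cmp(off_cell[m], cut)]

def verdict(off_cell):
    """Any single tripped metric is enough; none tripped means chaff."""
    return "measurable benefit" if tripped(off_cell) else "no measurable benefit on this matrix"

# The cell A medians from the table trip nothing, as expected.
cell_a = {"drops_60s": 15, "effective_fps": 24.0, "kwin_cpu_median": 12.0,
          "browser_cpu_median": 56.0, "panfrost_mean_freq_mhz": 595}
print(verdict(cell_a))

# Hypothetical off-cell for illustration only: one metric at its cut-off.
print(verdict({**cell_a, "drops_60s": 21}))
```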

Phase 0 task #54 verdict

Three reps acquired in-session, very tight on CPU/freq metrics, moderate variance on drops (consistent with Phase 3-prime predecessor experience that drops are the noisiest signal). Cell A baseline locked. Phase 1 binding-cell thresholds drafted above; lock when reading them back to operator.

Phase 0 task #54 = COMPLETE.

phase4_findings.md — cross-cell analysis (uncurated)

fourier_attribution — Cross-cell analysis 2026-05-03

Matrix executed in full: 4 cells × N=3 reps = 12 reps acquired in-session on ohm (PineTab2 RK3568, kernel 6.19.10-danctnix1-1-pinetab2, mesa 26.0.5, governor=performance, baloo off, daily Brave killed for clean ambient, autologin via campaign-temporary 99-autologin-fourier-attribution.conf since reverted).

Per-rep summary.txt and start.txt mirrored locally to phase4_evidence/{a,b,c,d}_rep[1-3]_*.
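
The per-rep summary files use a flat `KEY=value` layout with `--- section ---` separator lines, so they can be loaded with a few lines of code. A minimal sketch, assuming that layout holds for every file; `parse_summary` and `coerce` are hypothetical helpers, not part of the campaign's tooling.

```python
def coerce(raw):
    """Turn 'True'/'False', integers and floats into Python values."""
    if raw in ("True", "False"):
        return raw == "True"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw

def parse_summary(text):
    """Parse KEY=value lines, skipping blanks and '--- section ---' lines."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("---") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        out[key] = coerce(raw)
    return out

# Excerpt of a_rep1_summary.txt.
SAMPLE = """REP_ID=a_rep1
KIND=chromium-fourier-kwin
end_seen=True
--- drops ---
drops_5s=10
drops_60s=17
effective_fps=24.01
"""

print(parse_summary(SAMPLE))
```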

Per-rep table (raw)

	A1	A2	A3	B1	B2	B3	C1	C2	C3	D1	D2	D3
drops_5s	10	5	5	15	8	11	11	7	5	9	10	16
drops_60s	17	12	8	16	11	18	14	10	5	21	24	25
drops_post_5s	7	7	3	1	3	7	3	3	0	12	14	9
effective_fps	24.01	24.01	24.00	23.43	23.18	22.91	24.0	24.01	23.98	23.99	23.99	23.99
kwin_cpu_median	11.0	12.0	11.0	12.9	13.0	13.0	12.0	11.0	11.0	32.9	32.9	32.9
browser_cpu_median	54.25	54.4	54.9	137.3	140.65	137.15	60.2	54.15	52.2	64.25	63.3	64.05
panfrost_mean_freq_mhz	596.4	607.2	606.8	596.2	605.7	605.4	596.7	590.7	595.6	783.2	777.2	774.5
panfrost_peak_freq_pct	35.1	36.2	35.9	35.9	36.5	36.9	35.4	34.0	33.5	95.2	93.7	93.3
therm_drift_c	11.1	11.7	7.7	12.2	12.8	8.6	12.7	5.0	11.7	17.2	7.9	7.2

Cell medians

metric	A (baseline)	B (chromium-fourier off, Brave)	C (qt6-fourier off)	D (kwin-fourier off)
drops_60s	12	16	10	24
effective_fps	24.01	23.18	24.0	23.99
kwin_cpu_median	11.0	13.0	11.0	32.9
browser_cpu_median	54.4	137.3	54.15	64.05
panfrost_mean_freq_mhz	606.8	605.4	595.6	777.2
panfrost_peak_freq_pct	35.9	36.5	34.0	93.7

Per-cell verdict (against the IQR-based thresholds locked in `baseline_a_2026-05-03.md`)

Threshold rule: package P delivers measurable benefit if its P-off cell median moves past the cell A median by more than one of the following deltas, in the direction shown:

drops_60s +5 | effective_fps −0.05 | kwin_cpu_median +0.5 | browser_cpu_median +1 | panfrost_mean_freq_mhz +10
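
The threshold rule can be replayed against the per-rep values to cross-check the cell medians and the per-cell counts that follow. This is a sketch, not the campaign's analysis script: the per-rep numbers are copied from the per-rep summary.txt files, and the `THRESHOLDS` layout (positive means an increase counts, negative means a decrease counts) and the helper names are assumptions.

```python
from statistics import median

# rep1, rep2, rep3 per cell, from the per-rep summary.txt files.
REPS = {
    "A": {"drops_60s": [17, 12, 8], "effective_fps": [24.01, 24.01, 24.00],
          "kwin_cpu_median": [11.0, 12.0, 11.0],
          "browser_cpu_median": [54.25, 54.4, 54.9],
          "panfrost_mean_freq_mhz": [596.4, 607.2, 606.8]},
    "B": {"drops_60s": [16, 11, 18], "effective_fps": [23.43, 23.18, 22.91],
          "kwin_cpu_median": [12.9, 13.0, 13.0],
          "browser_cpu_median": [137.3, 140.65, 137.15],
          "panfrost_mean_freq_mhz": [596.2, 605.7, 605.4]},
    "C": {"drops_60s": [14, 10, 5], "effective_fps": [24.0, 24.01, 23.98],
          "kwin_cpu_median": [12.0, 11.0, 11.0],
          "browser_cpu_median": [60.2, 54.15, 52.2],
          "panfrost_mean_freq_mhz": [596.7, 590.7, 595.6]},
    "D": {"drops_60s": [21, 24, 25], "effective_fps": [23.99, 23.99, 23.99],
          "kwin_cpu_median": [32.9, 32.9, 32.9],
          "browser_cpu_median": [64.25, 63.3, 64.05],
          "panfrost_mean_freq_mhz": [783.2, 777.2, 774.5]},
}

# Signed deltas: positive = an increase counts, negative = a decrease counts.
THRESHOLDS = {"drops_60s": 5, "effective_fps": -0.05, "kwin_cpu_median": 0.5,
              "browser_cpu_median": 1, "panfrost_mean_freq_mhz": 10}

def medians(cell):
    return {metric: median(values) for metric, values in REPS[cell].items()}

def failing(cell):
    """Metrics where the off-cell median passes the cell A median by more than the delta."""
    base, off = medians("A"), medians(cell)
    hits = []
    for metric, thr in THRESHOLDS.items():
        delta = off[metric] - base[metric]
        if (thr > 0 and delta > thr) or (thr < 0 and delta < thr):
            hits.append(metric)
    return hits

for cell in "BCD":
    print(cell, failing(cell))
```

Run as written, this flags three metrics for cell B, none for cell C, and four for cell D.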

Cell B — chromium-fourier OFF (→ Brave 1.89)

metric	Δ vs A	beyond threshold?
drops_60s	+4	NO (within ±5)
effective_fps	−0.83	YES — Brave is significantly slower
kwin_cpu_median	+2.0	YES
browser_cpu_median	+82.9	YES (massive, 2.5×)
panfrost_mean_freq_mhz	−1.4	NO

Verdict: chromium-fourier delivers measurable benefit. 3 of 5 metrics fail when off.

Caveat: Cell B confounds *chromium-fourier patches* with *Brave-vs-Chromium-version* differences (Brave 1.89.145 = Chromium 147 base; chromium-fourier = Chromium 149 base). The +83pp browser CPU is plausibly part version-bump and part patch-engagement (chromium-fourier Step 1 + Step 2). Not separable in this matrix; would need stock-Chromium-149 build as additional cell.

Cell C — qt6-fourier OFF

metric	Δ vs A	beyond threshold?
drops_60s	−2	NO (and better by 2 — within IQR)
effective_fps	−0.01	NO
kwin_cpu_median	0	NO
browser_cpu_median	−0.25	NO
panfrost_mean_freq_mhz	−11.2	marginal wrong direction (off is lower GPU)

Verdict: qt6-fourier delivers NO measurable benefit on this workload. Zero of 5 metrics fail when off. The single marginal panfrost delta is in the wrong direction (less GPU work without qt6-fourier), suggesting qt6-fourier may even have a tiny GPU cost on this workload — though the magnitude is at the IQR edge.

Caveat: qt6-fourier patches qt6-base for the GL_ALPHA stall. That stall may not be triggered by 1080p H.264 NV12 video playback in chromium-fourier; the patch could still help other workloads (different video formats, mixed application loads, etc.) not in scope here.

Cell D — kwin-fourier OFF

metric	Δ vs A	beyond threshold?
drops_60s	+12	YES (>2× threshold)
effective_fps	−0.02	NO
kwin_cpu_median	+21.9	YES (massive — 3×)
browser_cpu_median	+9.65	YES
panfrost_mean_freq_mhz	+170.4	YES (massive — Mali jumps from ~600 to ~775 MHz)
panfrost_peak_freq_pct	+57.8	(would be massive YES if it were a primary threshold)

Verdict: kwin-fourier delivers MASSIVE measurable benefit. 4 of 5 primary metrics fail when off, with the kwin-CPU and panfrost-freq deltas being huge.

This is consistent with the predecessor's diagnosis that kwin-fourier patches a watchDmaBuf fence-wait issue. Without the patch, KWin spins waiting on dmabuf fences that should already be signaled, doing extra GL composite work, driving Mali to near-max all the time. The packaged fix is doing real work on this workload.

Wheat-vs-chaff verdict

WHEAT (measurable benefit on bbb 1080p H.264 60s playback, ohm/PineTab2):

kwin-fourier: load-bearing. Removing it triples kwin CPU, drives the Mali GPU to ~94% peak-freq residency, and doubles drops_60s. Likely the single most impactful fourier package on this hardware/workload.
chromium-fourier — measurable benefit, magnitude inflated by Brave-vs-Chromium-149 version confound. Real signal exists (3 of 5 metrics fail when swapped to Brave); cleanest separation would need a stock-Chromium-149 control cell, out of scope here.

CHAFF (no measurable benefit on this specific workload, ohm/PineTab2):

qt6-fourier — zero of 5 metrics moved beyond cell A IQR when removed. Cell C medians are within rep-to-rep noise of cell A. The GL_ALPHA stall the patch addresses doesn't trigger in this scenario. *Not a statement about other workloads*; just chaff for this specific binding-cell set.

Open questions raised

Brave-vs-Chromium-149 confound in cell B. Could be resolved with a stock Chromium 149 build as a fifth cell. Out of campaign scope.
qt6-fourier on other workloads. This campaign rules out benefit for bbb 1080p H.264 60s in chromium-fourier; doesn't say anything about Firefox, mpv, mixed Plasma desktop activity, or different video formats. If the user wants qt6-fourier maintained, it'd be on the merits of a different workload set.
Cell C had wider rep-to-rep variance than expected (drops_60s 5/10/14, browser_cpu_median 52.2/54.15/60.2). May be ambient drift in the new session post-package-swap; not enough to change the verdict but flagged for honesty.
All-off cell. Not measured. Combined effect of kwin+qt6+chromium being all reverted not in matrix; predecessor evidence implies it'd be ≈cell D (kwin dominant) but unknown.

Phase 8 memory hooks worth carrying forward

*Single-toggle attribution matrix on small N=3 with clean ambient is enough to rank fourier-flavoured packages by impact on a specific workload* — this campaign in 4 hours gave answers the original fourier campaign hadn't separated in months of “fix this fix that”.
*kwin-fourier `watchDmaBuf` fix is the load-bearing one for video on rockchip* (RK3568). Removing it is the failure mode. Should not be removed casually.
*qt6-fourier GL_ALPHA fix may not bind on workloads that don't trigger it.* Don't ship it as critical without identifying the workloads where it actually helps.
*Brave-vs-Chromium version confound in matrix design* — when “package off” requires using a different upstream binary, document the version skew explicitly and note it as a confound; don't pretend the swap is clean.

phase4_evidence — per-rep summary.txt (raw, uncurated)

a_rep1_summary.txt

REP_ID=a_rep1
KIND=chromium-fourier-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=10
drops_60s=17
drops_post_5s=7
frames_5s=124
frames_60s=1448
effective_fps=24.01
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=11.0
kwin_cpu_mean=11.65
kwin_cpu_min=9.9
kwin_cpu_max=19.0
kwin_cpu_p25=11.0
kwin_cpu_p75=12.0
kwin_cpu_iqr=1.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=54.25
browser_cpu_mean=58.59
browser_cpu_min=49.1
browser_cpu_max=115.7
browser_cpu_p25=52.2
browser_cpu_p75=59.1
browser_cpu_iqr=6.9
--- panfrost devfreq (window) ---
panfrost_total_window_ms=74681
panfrost_mean_freq_mhz=596.4
panfrost_peak_freq_pct=35.1
panfrost_residency_200MHz_ms=1356
panfrost_residency_300MHz_ms=485
panfrost_residency_400MHz_ms=25080
panfrost_residency_600MHz_ms=19625
panfrost_residency_700MHz_ms=1897
panfrost_residency_800MHz_ms=26238
--- thermal ---
therm_pre_milliC=46111
therm_post_milliC=57222
therm_drift_c=11.1

a_rep2_summary.txt

REP_ID=a_rep2
KIND=chromium-fourier-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=5
drops_60s=12
drops_post_5s=7
frames_5s=124
frames_60s=1447
effective_fps=24.01
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=12.0
kwin_cpu_mean=12.16
kwin_cpu_min=10.0
kwin_cpu_max=17.0
kwin_cpu_p25=11.0
kwin_cpu_p75=13.0
kwin_cpu_iqr=2.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=54.4
browser_cpu_mean=59.76
browser_cpu_min=49.4
browser_cpu_max=115.7
browser_cpu_p25=53.0
browser_cpu_p75=60.0
browser_cpu_iqr=7.0
--- panfrost devfreq (window) ---
panfrost_total_window_ms=73875
panfrost_mean_freq_mhz=607.2
panfrost_peak_freq_pct=36.2
panfrost_residency_200MHz_ms=1294
panfrost_residency_300MHz_ms=207
panfrost_residency_400MHz_ms=22356
panfrost_residency_600MHz_ms=20947
panfrost_residency_700MHz_ms=2314
panfrost_residency_800MHz_ms=26757
--- thermal ---
therm_pre_milliC=47777
therm_post_milliC=59444
therm_drift_c=11.7

a_rep3_summary.txt

REP_ID=a_rep3
KIND=chromium-fourier-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=5
drops_60s=8
drops_post_5s=3
frames_5s=125
frames_60s=1448
effective_fps=24.0
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=11.0
kwin_cpu_mean=11.51
kwin_cpu_min=10.0
kwin_cpu_max=17.0
kwin_cpu_p25=11.0
kwin_cpu_p75=12.0
kwin_cpu_iqr=1.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=54.9
browser_cpu_mean=58.36
browser_cpu_min=49.3
browser_cpu_max=124.8
browser_cpu_p25=52.4
browser_cpu_p75=57.3
browser_cpu_iqr=4.9
--- panfrost devfreq (window) ---
panfrost_total_window_ms=74432
panfrost_mean_freq_mhz=606.8
panfrost_peak_freq_pct=35.9
panfrost_residency_200MHz_ms=1302
panfrost_residency_300MHz_ms=309
panfrost_residency_400MHz_ms=21911
panfrost_residency_600MHz_ms=22616
panfrost_residency_700MHz_ms=1545
panfrost_residency_800MHz_ms=26749
--- thermal ---
therm_pre_milliC=50625
therm_post_milliC=58333
therm_drift_c=7.7

b_rep1_summary.txt

REP_ID=b_rep1
KIND=brave-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=15
drops_60s=16
drops_post_5s=1
frames_5s=123
frames_60s=1414
effective_fps=23.43
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=12.9
kwin_cpu_mean=12.5
kwin_cpu_min=8.9
kwin_cpu_max=16.0
kwin_cpu_p25=11.9
kwin_cpu_p75=13.0
kwin_cpu_iqr=1.1
--- brave aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=137.3
browser_cpu_mean=151.89
browser_cpu_min=92.3
browser_cpu_max=240.9
browser_cpu_p25=125.2
browser_cpu_p75=183.1
browser_cpu_iqr=57.9
--- panfrost devfreq (window) ---
panfrost_total_window_ms=73944
panfrost_mean_freq_mhz=596.2
panfrost_peak_freq_pct=35.9
panfrost_residency_200MHz_ms=2321
panfrost_residency_300MHz_ms=1196
panfrost_residency_400MHz_ms=22807
panfrost_residency_600MHz_ms=18447
panfrost_residency_700MHz_ms=2649
panfrost_residency_800MHz_ms=26524
--- thermal ---
therm_pre_milliC=48333
therm_post_milliC=60555
therm_drift_c=12.2

b_rep2_summary.txt

REP_ID=b_rep2
KIND=brave-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=8
drops_60s=11
drops_post_5s=3
frames_5s=115
frames_60s=1392
effective_fps=23.18
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=13.0
kwin_cpu_mean=12.99
kwin_cpu_min=9.0
kwin_cpu_max=16.0
kwin_cpu_p25=12.0
kwin_cpu_p75=14.0
kwin_cpu_iqr=2.0
--- brave aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=140.65
browser_cpu_mean=151.2
browser_cpu_min=93.0
browser_cpu_max=258.8
browser_cpu_p25=124.6
browser_cpu_p75=176.4
browser_cpu_iqr=51.8
--- panfrost devfreq (window) ---
panfrost_total_window_ms=75308
panfrost_mean_freq_mhz=605.7
panfrost_peak_freq_pct=36.5
panfrost_residency_200MHz_ms=2427
panfrost_residency_300MHz_ms=1224
panfrost_residency_400MHz_ms=20508
panfrost_residency_600MHz_ms=19969
panfrost_residency_700MHz_ms=3684
panfrost_residency_800MHz_ms=27496
--- thermal ---
therm_pre_milliC=48888
therm_post_milliC=61666
therm_drift_c=12.8

b_rep3_summary.txt

REP_ID=b_rep3
KIND=brave-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=11
drops_60s=18
drops_post_5s=7
frames_5s=104
frames_60s=1367
effective_fps=22.91
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=13.0
kwin_cpu_mean=12.96
kwin_cpu_min=9.0
kwin_cpu_max=19.9
kwin_cpu_p25=12.0
kwin_cpu_p75=14.0
kwin_cpu_iqr=2.0
--- brave aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=137.15
browser_cpu_mean=155.17
browser_cpu_min=98.8
browser_cpu_max=286.0
browser_cpu_p25=127.5
browser_cpu_p75=177.7
browser_cpu_iqr=50.2
--- panfrost devfreq (window) ---
panfrost_total_window_ms=75129
panfrost_mean_freq_mhz=605.4
panfrost_peak_freq_pct=36.9
panfrost_residency_200MHz_ms=3629
panfrost_residency_300MHz_ms=1270
panfrost_residency_400MHz_ms=18742
panfrost_residency_600MHz_ms=19345
panfrost_residency_700MHz_ms=4434
panfrost_residency_800MHz_ms=27709
--- thermal ---
therm_pre_milliC=52500
therm_post_milliC=61111
therm_drift_c=8.6

c_rep1_summary.txt

REP_ID=c_rep1
KIND=chromium-fourier-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=11
drops_60s=14
drops_post_5s=3
frames_5s=125
frames_60s=1449
effective_fps=24.0
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=12.0
kwin_cpu_mean=11.76
kwin_cpu_min=10.0
kwin_cpu_max=16.0
kwin_cpu_p25=11.0
kwin_cpu_p75=12.0
kwin_cpu_iqr=1.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=60.2
browser_cpu_mean=62.12
browser_cpu_min=49.8
browser_cpu_max=115.1
browser_cpu_p25=57.6
browser_cpu_p75=63.3
browser_cpu_iqr=5.7
--- panfrost devfreq (window) ---
panfrost_total_window_ms=76199
panfrost_mean_freq_mhz=596.7
panfrost_peak_freq_pct=35.4
panfrost_residency_200MHz_ms=2551
panfrost_residency_300MHz_ms=1297
panfrost_residency_400MHz_ms=22910
panfrost_residency_600MHz_ms=19041
panfrost_residency_700MHz_ms=3415
panfrost_residency_800MHz_ms=26985
--- thermal ---
therm_pre_milliC=50625
therm_post_milliC=63333
therm_drift_c=12.7

c_rep2_summary.txt

REP_ID=c_rep2
KIND=chromium-fourier-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=7
drops_60s=10
drops_post_5s=3
frames_5s=125
frames_60s=1449
effective_fps=24.01
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=11.0
kwin_cpu_mean=11.24
kwin_cpu_min=9.0
kwin_cpu_max=16.0
kwin_cpu_p25=10.9
kwin_cpu_p75=12.0
kwin_cpu_iqr=1.1
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=54.15
browser_cpu_mean=58.84
browser_cpu_min=49.3
browser_cpu_max=184.9
browser_cpu_p25=52.3
browser_cpu_p75=57.7
browser_cpu_iqr=5.4
--- panfrost devfreq (window) ---
panfrost_total_window_ms=74799
panfrost_mean_freq_mhz=590.7
panfrost_peak_freq_pct=34.0
panfrost_residency_200MHz_ms=1860
panfrost_residency_300MHz_ms=416
panfrost_residency_400MHz_ms=25766
panfrost_residency_600MHz_ms=18917
panfrost_residency_700MHz_ms=2379
panfrost_residency_800MHz_ms=25461
--- thermal ---
therm_pre_milliC=58333
therm_post_milliC=63333
therm_drift_c=5.0

c_rep3_summary.txt

REP_ID=c_rep3
KIND=chromium-fourier-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=5
drops_60s=5
drops_post_5s=0
frames_5s=125
frames_60s=1447
effective_fps=23.98
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=11.0
kwin_cpu_mean=10.96
kwin_cpu_min=9.0
kwin_cpu_max=15.0
kwin_cpu_p25=11.0
kwin_cpu_p75=11.0
kwin_cpu_iqr=0.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=52.2
browser_cpu_mean=54.9
browser_cpu_min=46.7
browser_cpu_max=135.9
browser_cpu_p25=51.1
browser_cpu_p75=55.2
browser_cpu_iqr=4.1
--- panfrost devfreq (window) ---
panfrost_total_window_ms=73269
panfrost_mean_freq_mhz=595.6
panfrost_peak_freq_pct=33.5
panfrost_residency_200MHz_ms=1681
panfrost_residency_300MHz_ms=311
panfrost_residency_400MHz_ms=23541
panfrost_residency_600MHz_ms=20735
panfrost_residency_700MHz_ms=2468
panfrost_residency_800MHz_ms=24533
--- thermal ---
therm_pre_milliC=50000
therm_post_milliC=61666
therm_drift_c=11.7

d_rep1_summary.txt

REP_ID=d_rep1
KIND=chromium-fourier-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=9
drops_60s=21
drops_post_5s=12
frames_5s=124
frames_60s=1448
effective_fps=23.99
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=32.9
kwin_cpu_mean=33.44
kwin_cpu_min=30.8
kwin_cpu_max=39.9
kwin_cpu_p25=32.8
kwin_cpu_p75=33.9
kwin_cpu_iqr=1.1
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=64.25
browser_cpu_mean=68.74
browser_cpu_min=57.0
browser_cpu_max=107.9
browser_cpu_p25=61.2
browser_cpu_p75=71.6
browser_cpu_iqr=10.4
--- panfrost devfreq (window) ---
panfrost_total_window_ms=75111
panfrost_mean_freq_mhz=783.2
panfrost_peak_freq_pct=95.2
panfrost_residency_200MHz_ms=1369
panfrost_residency_300MHz_ms=103
panfrost_residency_400MHz_ms=208
panfrost_residency_600MHz_ms=1134
panfrost_residency_700MHz_ms=780
panfrost_residency_800MHz_ms=71517
--- thermal ---
therm_pre_milliC=45555
therm_post_milliC=62777
therm_drift_c=17.2

d_rep2_summary.txt

REP_ID=d_rep2
KIND=chromium-fourier-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=10
drops_60s=24
drops_post_5s=14
frames_5s=124
frames_60s=1448
effective_fps=23.99
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=32.9
kwin_cpu_mean=33.14
kwin_cpu_min=29.7
kwin_cpu_max=38.8
kwin_cpu_p25=31.9
kwin_cpu_p75=33.9
kwin_cpu_iqr=2.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=63.3
browser_cpu_mean=68.91
browser_cpu_min=56.2
browser_cpu_max=127.7
browser_cpu_p25=61.2
browser_cpu_p75=67.6
browser_cpu_iqr=6.4
--- panfrost devfreq (window) ---
panfrost_total_window_ms=79234
panfrost_mean_freq_mhz=777.2
panfrost_peak_freq_pct=93.7
panfrost_residency_200MHz_ms=161
panfrost_residency_300MHz_ms=105
panfrost_residency_400MHz_ms=3715
panfrost_residency_600MHz_ms=727
panfrost_residency_700MHz_ms=261
panfrost_residency_800MHz_ms=74265
--- thermal ---
therm_pre_milliC=53750
therm_post_milliC=61666
therm_drift_c=7.9

d_rep3_summary.txt

REP_ID=d_rep3
KIND=chromium-fourier-kwin
end_seen=True
trajectory_samples=71
--- drops ---
drops_5s=16
drops_60s=25
drops_post_5s=9
frames_5s=124
frames_60s=1447
effective_fps=23.99
--- kwin_wayland CPU%% (5s-65s) ---
kwin_cpu_n=60
kwin_cpu_median=32.9
kwin_cpu_mean=33.03
kwin_cpu_min=27.9
kwin_cpu_max=40.2
kwin_cpu_p25=31.9
kwin_cpu_p75=33.9
kwin_cpu_iqr=2.0
--- chrome aggregate CPU%% (5s-65s) ---
browser_cpu_n=60
browser_cpu_median=64.05
browser_cpu_mean=70.36
browser_cpu_min=58.0
browser_cpu_max=148.0
browser_cpu_p25=62.1
browser_cpu_p75=70.9
browser_cpu_iqr=8.8
--- panfrost devfreq (window) ---
panfrost_total_window_ms=78666
panfrost_mean_freq_mhz=774.5
panfrost_peak_freq_pct=93.3
panfrost_residency_200MHz_ms=326
panfrost_residency_300MHz_ms=463
panfrost_residency_400MHz_ms=3576
panfrost_residency_600MHz_ms=621
panfrost_residency_700MHz_ms=262
panfrost_residency_800MHz_ms=73418
--- thermal ---
therm_pre_milliC=56111
therm_post_milliC=63333
therm_drift_c=7.2

Representative start.txt — package state per cell at rep time

a_rep1_start.txt

2026-05-03T21:09:44Z
REP_ID=a_rep1 KIND=chromium-fourier-kwin
BROWSER_BIN=/tmp/chromium-ohm-gl-fix-step2/chrome
Chromium 149.0.7812.0 
qt6-base-fourier 1:6.11.0-3
qt6-base-fourier 1:6.11.0-3
kwin-fourier 1:6.6.4-3
kwin-fourier 1:6.6.4-3
6.19.10-danctnix1-1-pinetab2
workload_pid=67340
workload_pgid=67340
script_pgid=67329
AUTOPLAY at 2026-05-03T21:09:49Z (took 5 s)
kwin_wayland_pid=53655

b_rep1_start.txt

2026-05-03T21:14:21Z
REP_ID=b_rep1 KIND=brave-kwin
BROWSER_BIN=/usr/bin/brave
Brave Browser 147.1.89.145 
qt6-base-fourier 1:6.11.0-3
qt6-base-fourier 1:6.11.0-3
kwin-fourier 1:6.6.4-3
kwin-fourier 1:6.6.4-3
6.19.10-danctnix1-1-pinetab2
workload_pid=69443
workload_pgid=69443
script_pgid=69432
AUTOPLAY at 2026-05-03T21:14:26Z (took 5 s)
kwin_wayland_pid=53655

c_rep1_start.txt

2026-05-03T22:01:37Z
REP_ID=c_rep1 KIND=chromium-fourier-kwin
BROWSER_BIN=/tmp/chromium-ohm-gl-fix-step2/chrome
Chromium 149.0.7812.0 
qt6-base 6.11.0-2
kwin-fourier 1:6.6.4-3
kwin-fourier 1:6.6.4-3
6.19.10-danctnix1-1-pinetab2
workload_pid=82947
workload_pgid=82947
script_pgid=82936
AUTOPLAY at 2026-05-03T22:01:43Z (took 6 s)
kwin_wayland_pid=82420

d_rep1_start.txt

2026-05-03T21:50:32Z
REP_ID=d_rep1 KIND=chromium-fourier-kwin
BROWSER_BIN=/tmp/chromium-ohm-gl-fix-step2/chrome
Chromium 149.0.7812.0 
qt6-base-fourier 1:6.11.0-3
qt6-base-fourier 1:6.11.0-3
kwin 6.6.4-1
6.19.10-danctnix1-1-pinetab2
workload_pid=79834
workload_pgid=79834
script_pgid=79823
AUTOPLAY at 2026-05-03T21:50:37Z (took 5 s)
kwin_wayland_pid=77581

Reviewer brief

Per Phase 5 of the dev process: Claude is forbidden from curating this artifact. The reviewer is asked to read the raw documents above and surface anything the active session may have rationalised away — methodology gaps, threshold-setting bias, confounds the campaign acknowledged but didn't quantify (e.g. Brave-vs-Chromium-149 version delta in cell B), and any path from this matrix to a useful next campaign.

Phase 5 reviewer response (Sonnet architect, 2026-05-04)

Engaged via Plan subagent, model: sonnet override, open-consultation mode. Active-session prompt forbade curating the artifact; reviewer was explicitly asked to argue with the verdicts.

Phase 5 Review — fourier_attribution 2026-05-03

Reviewer: Sonnet architect (subagent, fresh-model invocation 2026-05-04, open-consultation mode). Engaged via Plan subagent with model: sonnet override per the kwin_overlay_subsurface precedent. Review prompt explicitly forbade the active session from curating the artifact going to the reviewer; reviewer was given paths to local repo + DokuWiki page + asked to argue with the verdicts.

§1 Methodology concerns

The cell ordering and baseline recycling are the biggest structural problem.

Cell A (all-fourier-on baseline) was run in the phase0 evidence set (reps starting ~20:57Z), then those reps were discarded in favour of a *second* N=3 cell-A run in the phase4 evidence set (reps starting ~21:09Z, i.e. ~12 minutes later, same kwin PID 53655). The phase4 cell A has measurably different numbers from the phase0 cell A: drops_60s fell from {20,15,15} to {17,12,8}, and browser_cpu_median fell from {56.6,56.25,56.0} to {54.25,54.4,54.9}. Both moves are in the “looks better” direction. Thresholds were locked against the *phase0* cell A medians and ranges (drops_60s median=15, browser_cpu_median=56.0), but the phase4 cell A values used as the comparison baseline in phase4_findings.md are the *lower* set (median drops_60s=12, browser_cpu_median=54.4). The analysis then computes deltas against “A median = 12 drops / 54.4 browser CPU” while the pass/fail thresholds were set against a baseline of “15 drops / 56.0 browser CPU.” This is not called out anywhere in phase4_findings.md, and the arithmetic is self-inconsistent as a result.

This matters concretely for cell B: the “drops_60s = +4 vs A” is computed as B_median(16) − A_phase4_median(12) = +4. But the threshold was set as “phase0 median + 5 = 20.” If the threshold anchor had been re-locked at the phase4 cell A, B's drops_60s delta would still be +4 (which is fine, still below 5). For the fps verdict the delta -0.83 is large either way. So this inconsistency doesn't change the cell-B verdict in this case — but it should have been called out, and it opens the door to the question of which cell A is “the” baseline.
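
The anchor mismatch is easy to check by hand. The following is a minimal sketch that uses only the per-rep values quoted in this review; the helper name `delta_vs` and the constant names are illustrative and are not taken from the campaign's extractor.

```python
from statistics import median

# drops_60s per rep for cell A, as quoted in this review
phase0_a = [20, 15, 15]   # reps the thresholds were locked against
phase4_a = [17, 12, 8]    # reps used as the comparison baseline

B_MEDIAN = 16             # cell B drops_60s median, as quoted above
MAX_DELTA = 5             # locked tolerance on drops_60s

def delta_vs(baseline_reps, cell_median):
    """Delta of a cell median against the median of the baseline reps."""
    return cell_median - median(baseline_reps)

for name, reps in (("phase0", phase0_a), ("phase4", phase4_a)):
    d = delta_vs(reps, B_MEDIAN)
    print(f"{name}: A median={median(reps)}, B delta={d:+}, within={d <= MAX_DELTA}")
```

Either anchor leaves cell B inside the drops tolerance, which is the point made above: the verdict is unaffected, but the two tables do not share a baseline.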

The metric set has a blind spot: no per-frame GPU submit latency or fence wait time. The active session noted that wp_presentation_feedback would yield Δ_present; more directly useful would be a simple count of how many top -d 1 kwin samples are above, say, 20% CPU (i.e. a “high-load fraction” rather than just median). The median of 32.9% in cell D is so high it doesn't need better instrumentation — but for borderline cases like cell B's kwin_cpu_median of 13.0% vs threshold of 12.5%, per-frame feedback would distinguish “systematically higher” from “a few high spikes pulling up the median.” The campaign correctly excludes wp_presentation_feedback as out-of-scope, but it should have flagged that the kwin_cpu_median in cell B is at the threshold boundary and would benefit from more resolution.
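
A "high-load fraction" of the kind suggested above could be sketched as follows. The function name, the 20% cutoff default, and both sample lists are hypothetical; the lists are constructed for illustration and are not campaign data.

```python
from statistics import median

def high_load_fraction(samples, cutoff_pct=20.0):
    """Fraction of per-second CPU samples strictly above cutoff_pct."""
    if not samples:
        raise ValueError("no samples")
    return sum(1 for s in samples if s > cutoff_pct) / len(samples)

# Hypothetical samples (NOT campaign data): same median, different shape.
steady = [13.0] * 10
spiky = [11.0] * 4 + [13.0] * 3 + [25.0] * 3

for name, samples in (("steady", steady), ("spiky", spiky)):
    print(name, median(samples), high_load_fraction(samples))
```

Both lists have a median of 13.0, yet only the second shows any samples above the cutoff, so the extra number separates the two shapes that a median alone cannot.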

No cell E (all-fourier-off). The campaign acknowledges this. The combination matters because if kwin-fourier and chromium-fourier both being off produces a qualitatively different failure mode than kwin-fourier alone being off, the single-toggle matrix won't surface it. For practical maintenance decisions (can we drop all three packages?) this is the directly relevant cell and it's missing.

Execution order: A→B→D→C. The full sequence was phase0-A (20:57Z), phase4-A (21:09Z), B (21:14Z), D (21:50Z), C (22:01Z). Cell D started ~36 minutes after cell B, but not on the same KWin session: d_rep1_start.txt records kwin PID 77581, whereas A and B both ran under PID 53655. Cell C ran on yet another session (kwin PID 82420). The campaign notes this in phase4_findings “open questions” as potential ambient drift in cell C, but it doesn't ask the symmetric question: did running A and B back to back on one long-lived session introduce a drift pattern into those two cells that wouldn't apply to D or C?

§2 Threshold concerns

The kwin_cpu_median threshold bump from 0 to 0.5 is defensible as a decision but not as stated. The phase0 cell A had kwin_cpu_median = 12.0 in all three reps (range 0.0). The justification for inflating the threshold to ±0.5 is that top -d 1 rounds to integer percent. That is correct as far as it goes, but the implication is wrong: if top rounds to integers, then any non-zero difference between cells will be at least 1.0, not 0.5. Setting the threshold at 12.5 is operationally equivalent to setting it at 12.99 — the only values top will return are 12.0 or 13.0, never 12.5. So the 0.5 threshold is conservative in the right direction (avoids false positives) but its stated rationale (“floor because top rounds”) is imprecise. It effectively means “any cell with median ≥ 13.0 triggers the threshold.” Cell B's median is 13.0, which is exactly at the trigger. If the threshold had been stated correctly as “rounds-to-1%” it would have been ±1.0, and cell B's kwin_cpu delta of +2.0 would still trigger but you'd be clearer about what you're measuring.
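
The equivalence of the two thresholds can be shown directly. This sketch takes the review's premise at face value (medians land only on whole-percent values); the candidate list and the function name `triggers` are illustrative.

```python
def triggers(median_pct, threshold_pct):
    """True when a cell median exceeds the locked threshold."""
    return median_pct > threshold_pct

# Assumption from the paragraph above: only whole-percent medians occur.
possible_medians = [11.0, 12.0, 13.0, 14.0]

flagged = {
    thr: [m for m in possible_medians if triggers(m, thr)]
    for thr in (12.5, 12.99)
}
print(flagged)
```

Under that premise both thresholds flag exactly the same medians, so the 0.5 figure carries no information beyond "13.0 or higher".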

The cell-A median used in phase4_findings.md is from the phase4 A reps, but the IQR thresholds were locked from the phase0 A reps. This means the anchor point (what “A median” means) is different in the threshold table vs the delta computation table. Specifically: drops_60s threshold was “≥21 to trigger” (phase0 median 15 + delta 5 = 20; but the wording says ≥21). Cell D's drops_60s median is 24, which clears it either way. Cell B's drops_60s delta of +4 is computed as B(16) − A_phase4(12), giving +4, which is below the ±5 threshold. But B(16) − A_phase0(15) = +1, even more clearly below threshold. So this inconsistency doesn't change any verdict, but the pass/fail table in phase4_findings.md cites “+4 = within ±5” while the drops floor it's compared against is the phase4 median (12), not the threshold anchor (15). The documentation is subtly self-contradictory and would mislead anyone trying to verify the math.

panfrost_mean_freq threshold of ±10 MHz is derived from the phase0 range of 8.9 MHz, rounded up to 10. Reasonable. The phase4 cell A range is 607.2 − 596.4 = 10.8 MHz, slightly wider than the phase0 range. If the phase4 range had been used as the threshold, it would have rounded up to ±11 MHz, and cell C's −11.2 delta would overshoot it by only 0.2 MHz instead of 1.2 MHz (still “marginal wrong direction,” but by a hair). This has no verdict impact since cell C is chaff by all other metrics, but it illustrates the phase0-vs-phase4 baseline confusion again.

§3 Confounds the campaign missed

The “daily Brave ambient” was stable across A and B but vanished for D and C. The phase0 state snapshot notes “operator's daily Brave (PID 58105 etc.) running in background, ~13% CPU idle. Documented as stable ambient confound across all three [phase0 A] reps.” In phase4_findings.md, the README says “daily Brave killed for clean ambient” — so by the time the phase4 matrix was run, the ambient Brave was gone. This means the phase4 A reps are running in a cleaner ambient than the phase0 A reps, which is probably why browser_cpu_median dropped from ~56 to ~54 (the background Brave is no longer consuming CPU that could have been attributed to the workload browser's process group, depending on how aggregation works). This is explicitly tracked in phase0 as a “stable confound” and it was apparently resolved before phase4 — but the two A-rep runs are never reconciled or explained. Why was cell A run twice?

Session age asymmetry within cell D. This is the one the active session noticed. But it didn't note the directional implication: cell D reps started at 21:50Z, roughly 41 minutes after the first phase4 cell A rep (21:09:44Z to 21:50:32Z per the start.txt files). The kwin_wayland PID (77581) for cell D is a fresh post-revert session. If a fresh KWin session has a warm-up cost (dbus listeners registering, initial frame pipeline setup, shader compilation) that settles within a few minutes, and all three D reps showed near-identical kwin CPU of ~32.9%, then this is a steady state, not a fresh-session artifact. The signal is too large and too stable across reps to be a warm-up artifact. But the campaign missed the inverse question: are cells A and B *benefiting from an older, warmed-up compositor state* (kwin PID 53655 had been running since before the phase0 runs, which started at ~20:57Z, i.e. kwin was at least ~35 minutes old before any data was collected)? An older KWin session will have shader caches warm, GL state cached, etc. Cell D's fresh KWin might genuinely be doing more work because it hasn't compiled/cached what kwin-fourier-patched KWin had already compiled by the time cell A ran. This is a concrete alternative explanation for part of the cell D vs A delta that is not the kwin-fourier patch itself.

However, the shader-cache explanation for cell D's 32.9% vs 11% kwin CPU is implausible: shader compilation is a one-time cost per boot/profile, not a 60-second continuous burn. The sustained 95% GPU peak-freq residency in cell D is not consistent with shader compilation overhead; it's consistent with the watchDmaBuf spin diagnosis. So this confound is unlikely to explain cell D's result, but it should have been named and dismissed explicitly.

Cell D ran before cell C, so when the kwin revert happened, qt6-base was still the fourier version. Cell C ran with stock qt6-base but fourier kwin. The sequence was: all-on → chromium-off (B) → kwin-off (D) → qt6-off (C). After cell C finished, both kwin and qt6 were presumably reverted; the open question is what happened to kwin before cell C. The c_rep1_start.txt shows kwin-fourier 1:6.6.4-3 is present, which means that between cell D (kwin-off) and cell C (qt6-off), kwin-fourier was reinstalled. The revert log only shows the kwin revert, with no kwin reinstall log, so there was a reinstall step between D and C that isn't in evidence. That's not a confound per se, since the start.txt confirms packages were verified per-rep, but the session order (D before C) means cell C ran on a KWin session that was created *after* the kwin-fourier reinstall + logout+login. The phase0 notes describe autologin via 99-autologin-fourier-attribution.conf, and the kwin PID changes confirm each logout+login created a fresh session. So cell C is the *second* fresh session (after D), and cells A/B are on the *original long-uptime session*. This makes A/B vs C/D a confounded comparison on session age independent of the package toggle.
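
The session grouping can be read straight off the representative start.txt excerpts. The sketch below copies the timestamps and kwin_wayland_pid values from those excerpts; the tuple layout and the helper `parse_utc` are illustrative choices, not part of the campaign tooling.

```python
from collections import defaultdict
from datetime import datetime

# (cell, rep1 start timestamp, kwin_wayland_pid) from the start.txt excerpts
starts = [
    ("A", "2026-05-03T21:09:44Z", 53655),
    ("B", "2026-05-03T21:14:21Z", 53655),
    ("D", "2026-05-03T21:50:32Z", 77581),
    ("C", "2026-05-03T22:01:37Z", 82420),
]

def parse_utc(stamp):
    """Parse the ISO-like UTC stamp used in start.txt (naive datetime)."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")

# Cells that share a kwin PID shared a compositor session.
sessions = defaultdict(list)
for cell, _, pid in starts:
    sessions[pid].append(cell)

# Minutes elapsed since the first phase4 cell A rep.
t0 = parse_utc(starts[0][1])
offsets_min = {
    cell: (parse_utc(ts) - t0).total_seconds() / 60 for cell, ts, _ in starts
}

print(dict(sessions))
print(offsets_min)
```

Grouping by PID makes the A/B versus D versus C split explicit, which is the session-age axis the matrix does not control for.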

§4 Verdict robustness per package

kwin-fourier: WHEAT, robust.

The signal is 3× kwin CPU (11→33%), 170 MHz mean GPU freq jump (~28% of scale), 93–95% peak-freq residency (vs 35%), and 2× drops. All three cell D reps are near-identical: kwin_cpu_median is exactly 32.9% in all three, and panfrost_mean_freq is 783/777/775. This is the tightest cell in the whole matrix and the largest delta. No plausible confound explains this away: the shader-cache alternative is dismissed above (it doesn't produce a 60-second continuous GPU burn at 93–95% peak residency). The session-age asymmetry (fresh D vs long-uptime A) would, if anything, help cell D by having a cleaner cache state, but it doesn't explain the sustained GPU saturation.
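
The cell D figures can be recomputed from the three d_rep summaries shown earlier. This is a minimal sketch, assuming the summary files are plain key=value text with `---` section markers as displayed; `parse_summary` is a hypothetical helper, and the embedded strings are abridged copies of the summaries.

```python
from statistics import median

def parse_summary(text):
    """Parse key=value lines; skip blanks and '--- section ---' markers."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("---") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        try:
            out[key] = float(value)
        except ValueError:
            out[key] = value  # non-numeric fields such as REP_ID or KIND
    return out

# Abridged copies of the three cell D summaries shown above
d_reps = [
    """REP_ID=d_rep1
--- drops ---
drops_60s=21
kwin_cpu_median=32.9
panfrost_mean_freq_mhz=783.2
panfrost_peak_freq_pct=95.2""",
    """REP_ID=d_rep2
--- drops ---
drops_60s=24
kwin_cpu_median=32.9
panfrost_mean_freq_mhz=777.2
panfrost_peak_freq_pct=93.7""",
    """REP_ID=d_rep3
--- drops ---
drops_60s=25
kwin_cpu_median=32.9
panfrost_mean_freq_mhz=774.5
panfrost_peak_freq_pct=93.3""",
]

parsed = [parse_summary(t) for t in d_reps]
keys = ("drops_60s", "kwin_cpu_median", "panfrost_mean_freq_mhz",
        "panfrost_peak_freq_pct")
cell_d = {k: median(p[k] for p in parsed) for k in keys}
print(cell_d)
```

The same helper would work on the other cells' summaries, which makes it straightforward for a reader to verify any median cited in this review against the raw files.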

Confidence: very high. If the verdict here were wrong, you'd need to argue that kwin PID 77581's fresh session caused the GPU to continuously max-freq for the entire 60-second window for a reason unrelated to the kwin packages, which is not credible.

chromium-fourier: WHEAT-but-fragile verdict.

The claimed evidence is: fps −0.83 (Brave 22.91–23.43 vs chromium-fourier 24.0–24.01), browser_cpu_median +82.75pp (137 vs 54.4), kwin_cpu_median +2.0 (13.0 vs 11.0). All three are confounded by the Brave-vs-Chromium-149 version gap.

The fps delta is the clearest problem. Cell B fps values are {23.43, 23.18, 22.91} — a declining trend across the three reps (B1→B2→B3, each about 0.25 fps lower). Cell A's fps is locked at 24.00–24.01. This declining trend across Brave reps suggests something is drifting within the Brave session (progressive video decoder stall? buffer pressure?), not a stable property of the browser. The effective_fps metric in the extractor is computed as (frames_60s - frames_5s) / (s60[0] - s5[0]), and cell B's frames_60s are {1414, 1392, 1367} — genuinely delivering fewer frames, not just a measurement artifact. Whether this is a Chromium-147 decoder limitation, a Brave-specific regression vs Chromium-149, or a chromium-fourier patch effect cannot be separated.
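
For reference, the effective_fps expression quoted above can be sketched like this. The (timestamp, frame count) tuple layout is an assumption inferred from the `s60[0]` and `s5[0]` indexing, and the timestamps in the example are hypothetical round values; only the frame counts (d_rep1) and the cell B fps values come from this document.

```python
def effective_fps(s5, s60):
    """Frames delivered per second between the ~5 s and ~60 s samples.

    s5 and s60 are assumed to be (timestamp_s, total_frames) tuples.
    """
    (t5, f5), (t60, f60) = s5, s60
    if t60 <= t5:
        raise ValueError("samples out of order")
    return (f60 - f5) / (t60 - t5)

# Frame counts from d_rep1 above; timestamps are hypothetical round values.
fps = effective_fps((5.0, 124), (60.0, 1448))

# Rep-to-rep fps steps for cell B, using the values quoted in the review
b_fps = [23.43, 23.18, 22.91]
b_steps = [round(later - earlier, 2) for earlier, later in zip(b_fps, b_fps[1:])]

print(fps, b_steps)
```

Because the denominator uses the actual sample timestamps, a lower effective_fps with fewer frames_60s reflects frames that were not delivered, not a shorter measurement window.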

The browser_cpu_median of 137% vs 54.4% is a 2.5× gap. This is real. Brave with Chromium-147 base is consuming 2.5× more CPU for the same workload. But Chromium-147 vs 149 is a two-major-version delta, which can easily explain CPU differences of this magnitude in a decoder-heavy workload (codec path changes, VA-API usage patterns, zero-copy buffer handling). The chromium-fourier patches (Step 1 = libva-v4l2-request port, Step 2 = WaylandConnection overlay-route) are precisely the kind of changes that would reduce browser CPU by enabling hardware decode paths, but you cannot tell from this matrix whether those paths are also present in Brave-147, absent in Brave-147, or partially present with different efficiency.

The kwin_cpu_median of +2.0 (11→13%) is, as noted above, right at the rounding threshold. It's suggestive that Brave presents frames less efficiently to KWin, but at N=3 with integer-rounded values, it's barely more than a 1-sample wide signal.

My independent verdict: call it WHEAT-suspected-but-unconfirmed. The direction is clear, the magnitude is large, but the control comparison is the wrong browser at the wrong version. The campaign's own caveat is correct — you cannot call this “chromium-fourier delivers benefit” cleanly; you can only call it “chromium-fourier + chromium-149 base is substantially better than Brave-1.89/Chromium-147 on this workload.” The confound is load-bearing. I would not ship the chromium-fourier conclusion to anyone making a package maintenance decision without the Chromium-149 vanilla control.

qt6-fourier: CHAFF on this workload, verdict sound.

Zero of five metrics moved beyond threshold when qt6-base was reverted to stock. The panfrost mean freq delta is −11.2 MHz (slightly *lower* GPU usage without qt6-fourier), which is the wrong direction for “the patch helps.” Cell C reps have wider variance than cell A (drops_60s 5/10/14, browser_cpu 52.2/54.15/60.2), which the campaign correctly flags. However, c_rep1's browser_cpu_median is 60.2, which is 5.8 above the cell A baseline of 54.4, and the threshold is “+1.” If all three C reps had been like c_rep1, cell C would have been a false-positive wheat verdict on browser_cpu. Only the fact that c_rep2 (54.15) and c_rep3 (52.2) both sit below baseline keeps the cell median under the threshold. The variance in cell C is real and should be noted as a reliability concern for the qt6-fourier verdict, not just flagged as “wider than expected.” Had the campaign run N=5 for cell C, or had the other two reps landed closer to c_rep1, the verdict might have been uncertain rather than confidently chaff.
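
The per-rep picture for cell C's browser CPU can be laid out explicitly. This sketch uses only the values quoted in the paragraph above; the constant and variable names are illustrative.

```python
from statistics import median

A_BASELINE = 54.4   # phase4 cell A browser_cpu_median, as quoted
THRESHOLD = 1.0     # the "+1" trigger, as quoted

c_browser_cpu = {"c_rep1": 60.2, "c_rep2": 54.15, "c_rep3": 52.2}

# Would each rep, taken alone, exceed the trigger?
per_rep_trigger = {
    rep: (value - A_BASELINE) > THRESHOLD
    for rep, value in c_browser_cpu.items()
}

# The verdict uses the cell median, which hides the c_rep1 excursion.
cell_delta = median(c_browser_cpu.values()) - A_BASELINE

print(per_rep_trigger, round(cell_delta, 2))
```

One of three reps exceeds the trigger on its own while the cell median sits slightly below baseline, which is why the verdict holds at N=3 but rests on a thin margin.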

The workload-specificity caveat is well-stated. “CHAFF on bbb 1080p H.264 Chromium-149” is correct. “CHAFF generally” is not supported.

§5 Cheapest next campaign

Run a Chromium-149 vanilla control cell (called cell E in this section, not to be confused with the all-fourier-off cell that §1 also labels E) to de-confound the chromium-fourier verdict.

This is the single highest-value next step. The action is: obtain or build a stock Chromium-149 binary (without the Step 1 libva-v4l2-request port and without the Step 2 WaylandConnection overlay-route patches) and run it as cell E with all three fourier packages on. Compare against cell A (chromium-fourier on) and cell B (Brave-147).

Cost: the main effort is building or obtaining Chromium-149 vanilla for aarch64. The predecessor ohm_gl_fix campaign built chromium-fourier from source, so a same-version vanilla build is feasible — it's the same build without applying the fourier patches. If the build artifacts from the ohm_gl_fix campaign are still around, a stripped binary might be constructible faster. Alternatively, checking whether the Arch aarch64 Chromium package (not fourier-patched) is at version 149 would give a zero-effort control — if it's already 149 in the repo, pacman -S chromium may be sufficient.

If cell E shows vanilla-149 performs close to cell A (chromium-fourier), the verdict becomes “the benefit was version-level, not patch-level.” If cell E is close to cell B (Brave-147), the verdict strengthens to “patches matter, not version.” If cell E is somewhere in between, you have a partial attribution.
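
The three-way reading above could be turned into a rule that is written down before the data arrives. The sketch below is one hypothetical way to do that: the function name, the 20% closeness band `tol`, and every cell E value in the example are assumptions; only the A and B medians (54.4 and 137) come from this review.

```python
def attribute_cell_e(e_value, a_value, b_value, tol=0.2):
    """Place cell E on the A-to-B axis and name the reading.

    tol is a hypothetical closeness band, as a fraction of the A-to-B gap.
    """
    if b_value == a_value:
        raise ValueError("A and B coincide; nothing to attribute")
    position = (e_value - a_value) / (b_value - a_value)
    if position <= tol:
        return "version-level"  # E close to A: the patches add little
    if position >= 1 - tol:
        return "patch-level"    # E close to B: the patches carry the benefit
    return "partial"

A_CPU, B_CPU = 54.4, 137.0      # browser_cpu_median for cells A and B, as quoted

# Hypothetical cell E medians, for illustration only
for e in (60.0, 95.0, 130.0):
    print(e, attribute_cell_e(e, A_CPU, B_CPU))
```

Fixing the band before running cell E would avoid repeating the threshold-anchor ambiguity discussed in §2.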

This needs only one additional cell at N=3, targeting exclusively the browser_cpu_median and fps metrics (kwin_cpu and GPU freq were not the primary indicators for cell B). The campaign infrastructure (orchestrator script, test rig) is already in place; the only new work is producing the binary.

Nothing else in the open-question list is cheaper or higher-signal for the stated campaign question.
