ohm_gl_fix:phase4_2026-04-30
Differences
This shows you the differences between two versions of the page.
| ohm_gl_fix:phase4_2026-04-30 [2026/05/01 12:39] – Backfill original Phase 4 content (audit trail) markus_fritsche | ohm_gl_fix:phase4_2026-04-30 [2026/05/01 13:08] (current) – rewrap paragraphs (DokuWiki single-newline fix) markus_fritsche | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | > **2026-05-01: | + | > **2026-05-01: |
| - | > [[ohm_gl_fix: | + | |
| - | > below for audit trail; the live Phase 4 plan is the new page. | + | |
| ====== Phase 4 — The Gap (ohm_gl_fix iteration 1) ====== | ====== Phase 4 — The Gap (ohm_gl_fix iteration 1) ====== | ||
| - | > **Status — replaces the SUPERSEDED 2026-04-30 libplacebo-cache | + | > **Status — replaces the SUPERSEDED 2026-04-30 libplacebo-cache draft. The earlier plan is in git history; it was retracted after '' |
| - | > draft. The earlier plan is in git history; it was retracted after | + | |
| - | > '' | + | |
| - | > the patched code path was not being executed.** | + | |
| - | This page is the campaign' | + | This page is the campaign' |
| - | identification, | + | |
| - | fix-surface assessment. It is **not** a player recommendation, | + | |
| - | a per-player patch plan, and not a workaround proposal. Per | + | |
| - | Markus' | + | |
| - | which makes (a) VLC unusable and (b) facilitates all programs | + | |
| - | outputting video to run efficiently, | + | |
| - | campaign' | + | |
| ---- | ---- | ||
| Line 23: | Line 11: | ||
| ===== 1. Goal (essence) and use-case scope ===== | ===== 1. Goal (essence) and use-case scope ===== | ||
| - | The Phase 1 measurable target ("on '' | + | The Phase 1 measurable target ("on '' |
| - | '' | + | |
| - | '' | + | |
| - | here. The campaign' | + | |
| - | > Identify the structural gap such that filling it would lift VLC | + | > Identify the structural gap such that filling it would lift VLC out of unusability **and** would let any video-displaying program on Mali-G52 + KWin Wayland (web browsers especially) run with efficient HW-accelerated playback against stock libraries. |
| - | > out of unusability **and** would let any video-displaying program | + | |
| - | > on Mali-G52 + KWin Wayland (web browsers especially) run with | + | |
| - | > efficient HW-accelerated playback against stock libraries. | + | |
| - | The deeper framing, accepted 2026-04-30: **the predicament is the | + | The deeper framing, accepted 2026-04-30: **the predicament is the buffer-to-display path without CPU copy.** Decode is not the issue — hantro-VPU on RK3566 (and rkvdec on RK3588) can decode H.264 1080p with substantial headroom, and libva can produce ≈300 fps of decoded buffers with '' |
| - | buffer-to-display path without CPU copy.** Decode is not the | + | |
| - | issue — hantro-VPU on RK3566 (and rkvdec on RK3588) can decode | + | |
| - | H.264 1080p with substantial headroom, and libva can produce ≈300 | + | |
| - | fps of decoded buffers with '' | + | |
| - | post-decode handoff: when the decoded dmabuf needs to land on | + | |
| - | the screen and there is no zero-copy path, every consumer in the | + | |
| - | chain breaks. | + | |
| - | **In-scope use cases** (this informs how fix-surface rows are | + | **In-scope use cases** (this informs how fix-surface rows are ranked in §6): |
| - | ranked in §6): | + | |
| - | * YouTube in Brave (Chromium-based browser video, the highest-traffic | + | * YouTube in Brave (Chromium-based browser video, the highest-traffic workload on this device class). |
| - | | + | * General web browsing in Brave (compositor-side video / animations / WebGL). |
| - | * General web browsing in Brave (compositor-side video / animations | + | * VS Code (Electron + Chromium under the hood; same compositor pipeline as Brave). |
| - | | + | |
| - | * VS Code (Electron + Chromium under the hood; same compositor | + | |
| - | | + | |
| **Explicitly out of scope:** | **Explicitly out of scope:** | ||
| - | * 3D games / Tux Racer / Doom / GTA / Proton/DXVK / general-purpose | + | * 3D games / Tux Racer / Doom / GTA / Proton/DXVK / general-purpose Vulkan workloads. Software-emulated mandatory-1.2 entry points with poor performance characteristics are acceptable as long as they don't degrade the in-scope use cases. |
| - | | + | * mpv / ffplay / VLC as primary daily players. They appear as symptoms in §4 because they' |
| - | | + | |
| - | | + | |
| - | * mpv / ffplay / VLC as primary daily players. They appear as | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | Phase 1 metric, refined: a single gap is named; every observed | + | Phase 1 metric, refined: a single gap is named; every observed symptom (mpv, ffplay, VLC, Chromium-via-VAAPI, |
| - | symptom (mpv, ffplay, VLC, Chromium-via-VAAPI, | + | |
| - | regression) is traced to it; a fix-surface assessment names the | + | |
| - | shape of work that would actually close it, ranked by impact on | + | |
| - | the in-scope use cases. Anything that closes fewer than all | + | |
| - | listed symptoms is a workaround for the out-of-scope symptoms; | + |
| - | //for the in-scope use cases// a partial fix may still be a | + | |
| - | proper fix. | + | |
| ===== 2. The gap, in one paragraph ===== | ===== 2. The gap, in one paragraph ===== | ||
| - | There is no completed integration of "V4L2 stateless decode → | + | There is no completed integration of "V4L2 stateless decode → GPU-displayable surface" |
| - | GPU-displayable surface" | + | |
| - | SBCs running mainline Wayland — //outside// the GStreamer | + | |
| - | '' | + | |
| - | other client of the V4L2 stateless decoder (libavcodec hwaccels, | + | |
| - | libva-v4l2-request, | + | |
| - | hwdec) inherits a different incompleteness on its specific chain, | + | |
| - | and each chain' | + | |
| - | absence of //any one// completed end-to-end path through libavcodec or | + | |
| - | libva is the structural gap. There is no single missing function or | + | |
| - | typo'd condition that, if fixed, would lift every symptom — what is | + | |
| - | missing is one **completed integration story** that the libavcodec | + | |
| - | and libva ecosystems can both navigate without depending on | + | |
| - | infrastructure (Vulkan, DRM master) that aarch64 + Wayland clients | + | |
| - | do not have. | + | |
| ===== 3. Why this is one gap and not N independent bugs ===== | ===== 3. Why this is one gap and not N independent bugs ===== | ||
| - | A naive read says "four players failed for four different reasons" | + | A naive read says "four players failed for four different reasons" |
| - | That is true at the file:line level — see the symptom inventory | + | |
| - | below — but every chain, when traced upward, terminates at the | + | |
| - | same architectural decision: //the assumption that a hardware | + | |
| - | video-decode pipeline ends in either Vulkan or DRM-master access.// | + | |
| - | Both assumptions match desktop-class hardware (Intel/ | + | |
| - | on a TTY-anchored X11 or KMS direct-scanout client). | + | |
| - | Both assumptions break on: | + | |
| - | * **Mali-G52 / Bifrost gen 2** — panvk Vulkan implementation gap. | + | * **Mali-G52 / Bifrost gen 2** — panvk Vulkan implementation gap. Not "no '' |
| - | | + | * **KWin Wayland** (and Mutter, sway, river — every Wayland compositor, by Wayland-spec design) — clients do not get DRM master. Anything in the stack that reaches for '' |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | * **KWin Wayland** (and Mutter, sway, river — every Wayland | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | The "one path that works" — '' | + | The "one path that works" — '' |
| - | '' | + | |
| - | own pipeline-level dmabuf negotiation (the `caps = | + | |
| - | video/ | + | |
| - | during Finding 6 probing) that bypasses both libavcodec hwaccels | + | |
| - | //and// libva entirely. The compositor accepts the dmabuf via the | + | |
| - | linux-dmabuf-v1 protocol. No Vulkan, no DRM master, no library | + | |
| - | chain involving libavcodec hwaccel display. **That path was | + | |
| - | designed for this hardware class. The libavcodec/ | + | |
| - | not.** | + | |
| ===== 4. Symptom inventory ===== | ===== 4. Symptom inventory ===== | ||
| - | All measured 2026-04-30 on ohm (PineTab2, RK3566, Mali-G52, hantro | + | All measured 2026-04-30 on ohm (PineTab2, RK3566, Mali-G52, hantro VPU, kernel 6.19.10, mesa 26.0.5, KWin 6.6.4, Plasma 6.6.4) playing '' |
| - | VPU, kernel 6.19.10, mesa 26.0.5, KWin 6.6.4, Plasma 6.6.4) playing | + | |
| - | '' | + | |
| ^ # ^ Client ^ Decode path attempted ^ What broke ^ Evidence ^ | ^ # ^ Client ^ Decode path attempted ^ What broke ^ Evidence ^ | ||
| Line 156: | Line 54: | ||
| | S5 | gst v4l2slh264dec → waylandsink (the " | | S5 | gst v4l2slh264dec → waylandsink (the " | ||
| - | S5 is included not as a failure of the " | + | S5 is included not as a failure of the " |
| - | evidence that even that path is fragile under the marfrit-packages | + | |
| - | custom-stack drift Markus already maintains (mesa, ffmpeg, alsa, | + | |
| - | libdrm-pinebookpro). The gap analysis below does not attempt to | + | |
| - | explain S5; it is recorded here as a known follow-up. | + | |
| ===== 5. Trace from each symptom to the gap ===== | ===== 5. Trace from each symptom to the gap ===== | ||
| - | * **S1 (mpv).** mpv assumed "if libavcodec produces drm_prime | + | * **S1 (mpv).** mpv assumed "if libavcodec produces drm_prime frames, the VO can ingest them via the drmprime hwdec interop, whose loader can get a DRM fd from the native display." |
| - | | + | * **S2 (ffplay).** libavcodec n8.x's v4l2request hwaccel was wired to require libplacebo' |
| - | | + | * **S3 (VLC).** VLC's Arch package was built '' |
| - | | + | * **S4 (Chromium).** libva-v4l2-request was written for sunxi-cedrus and never completed a multiplanar port for the kernel-mainline V4L2 stateless decoders that ship on Rockchip / NXP / RK35xx hardware. The integration assumption //" |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | * **S2 (ffplay).** libavcodec n8.x's v4l2request hwaccel was wired | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | * **S3 (VLC).** VLC's Arch package was built '' | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | * **S4 (Chromium).** libva-v4l2-request was written for sunxi-cedrus | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | The four assumptions are independent at the code level; they are | + | The four assumptions are independent at the code level; they are unified at the // |
| - | unified at the // | + | |
| - | projects involved (libavcodec, | + | |
| - | VLC) carries primary responsibility for the integration as a whole. | + | |
| - | That is the gap. | + | |
| ===== 6. Fix surface candidates ===== | ===== 6. Fix surface candidates ===== | ||
| - | Each row below describes a //direction in which a fix could live//. | + | Each row below describes a //direction in which a fix could live//. None is proposed by this campaign — Markus' |
| - | None is proposed by this campaign — Markus' | + | |
| - | specifically tasked" | + | |
| - | this Phase 4 documents rather than picks. Tractability assessments | + | |
| - | are rough. | + | |
| ^ Direction ^ What it would lift ^ Tractability ^ Where the work lives ^ | ^ Direction ^ What it would lift ^ Tractability ^ Where the work lives ^ | ||
| Line 219: | Line 76: | ||
| | **D. Compositor-level DRM-shim for Wayland clients** | S1 (mpv specifically — drmprime-overlay would get its drm_params_v2) | Medium-low. Would need a Wayland protocol extension that grants clients enough KMS view to satisfy drmprime-overlay without granting full DRM master. KWin or wlroots would have to participate, | | **D. Compositor-level DRM-shim for Wayland clients** | S1 (mpv specifically — drmprime-overlay would get its drm_params_v2) | Medium-low. Would need a Wayland protocol extension that grants clients enough KMS view to satisfy drmprime-overlay without granting full DRM master. KWin or wlroots would have to participate, | ||
| - | **No row above lifts every listed symptom.** A fix that lifts S3 | + | **No row above lifts every listed symptom.** A fix that lifts S3 specifically requires a downstream packaging change at the distro level (rebuild VLC against current ffmpeg with libplacebo enabled, or wait for VLC 4.x to land in stable Arch), which none of the upstream projects behind rows A-D would deliver. This is part of the gap's shape: the symptom set is //not// uniformly fixable from any single location, because the integration that's missing was always going to require coordination across libavcodec, libva, the libplacebo chain, the compositor, and downstream packagers. |
| - | specifically requires a downstream packaging change at the distro | + |
| - | level (rebuild VLC against current ffmpeg with libplacebo enabled, | + |
| - | or wait for VLC 4.x to land in stable Arch), which none of the | + |
| - | upstream projects behind rows A-D would deliver. This is part of the gap's | + |
| - | shape: the symptom set is //not// uniformly fixable from any single | + | |
| - | location, because the integration that's missing was always going | + | |
| - | to require coordination across libavcodec, libva, the libplacebo | + | |
| - | chain, the compositor, and downstream packagers. | + | |
| ==== Ranking against the §1 in-scope use cases ==== | ==== Ranking against the §1 in-scope use cases ==== | ||
| - | The four rows lift different symptoms; "most symptoms" | + | The four rows lift different symptoms; "most symptoms" |
| - | right ranking metric for the in-scope use cases. Brave / Chromium | + | |
| - | video decode goes through `Chromium VaapiVideoDecoder → libva → | + | |
| - | libva-v4l2-request`, | + | |
| - | + Vulkan/GL. So the libplacebo-chain fixes (B, C1, C2) lift mpv and | + | |
| - | ffplay, //which Markus does not use//, and do not touch Brave' | + | |
| - | decode pipeline. | + | |
| Use-case-ranked: | Use-case-ranked: | ||
| - | - **Row A (libva-v4l2-request multiplanar port)** is the only | + | - **Row A (libva-v4l2-request multiplanar port)** is the only row that lifts S4 (browser HW video decode). YouTube in Brave is in S4. Without A, browser HW decode does not engage; Brave / VS Code / Chromium fall to libavcodec SW decode, which leaves the buffer-to-display predicament unresolved for the highest-traffic workload on this device class. |
| - | | + | - **Row C2 (Vulkan layer, in-tree-of-its-own-repo)** is a smaller, self-contained engineering effort that lifts S1+S2. Worth a feasibility test (build the layer, see whether ffplay completes a 60 s playback) because the cost is bounded and the result informs whether C-class fixes are tractable in general. **Does not address the in-scope use cases directly**, since browsers don't traverse this chain — but useful as a vehicle for characterising what panvk-v7 actually does and doesn' |
| - | | + | - **Row B (libavcodec drm_prime → linux-dmabuf-v1 path)** is architecturally cleanest and would generalise across consumers, but the work lives in FFmpeg upstream and would not be in-scope-impactful unless Brave eventually changed its decode pipeline to consume libavcodec hwaccels (it currently does not). |
| - | | + | - **Row C1 (Mesa upstream panvk/v7 promotion)** is the same scope as C2 but at higher cost and with longer wall-clock until it lands in stock packages. Lower priority than C2 for the same symptom set. |
| - | | + | - **Row D (compositor DRM-shim)** is brittle, narrow, and lifts only S1. |
| - | | + | |
| - | - **Row C2 (Vulkan layer, in-tree-of-its-own-repo)** is a smaller, | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | - **Row B (libavcodec drm_prime → linux-dmabuf-v1 path)** is | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | - **Row C1 (Mesa upstream panvk/v7 promotion)** is the same scope | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | - **Row D (compositor DRM-shim)** is brittle, narrow, and lifts | + | |
| - | | + | |
| - | **The campaign as documented does not pick.** This ranking | + | **The campaign as documented does not pick.** This ranking informs which row(s) would be worth the next engineering investment //if// a fix is to be enacted; that decision is Markus' |
| - | informs which row(s) would be worth the next engineering investment | + | |
| - | //if// a fix is to be enacted; that decision is Markus' | + | |
| - | document' | + | |
| ===== 7. What this campaign deliberately does NOT do ===== | ===== 7. What this campaign deliberately does NOT do ===== | ||
| - | * **Does not pick or propose a patch.** A and B in §6 are both | + | * **Does not pick or propose a patch.** A and B in §6 are both reasonable; neither is enacted here. |
| - | | + | * **Does not recommend a player.** "Use gst-play-1.0" |
| - | * **Does not recommend a player.** "Use gst-play-1.0" | + | * **Does not patch any single player as '' |
| - | | + | * **Does not investigate the S5 regression.** Stack drift between fourier 2026-04-24 (0/62) and ohm_gl_fix 2026-04-30 (~0.3 drops/sec) is a separate concern. Likely candidates within marfrit-packages' |
| - | | + | |
| - | * **Does not patch any single player as '' | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | * **Does not investigate the S5 regression.** Stack drift between | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| ===== 8. Phase 1 metric — refined ===== | ===== 8. Phase 1 metric — refined ===== | ||
| - | * **Original** (locked 2026-04-30 morning): "on | + | * **Original** (locked 2026-04-30 morning): "on '' |
| - | | + | * **Refined** (locked 2026-04-30 evening, after the perf invalidation and Markus reframe): "the structural gap is named; every Phase 6 symptom (mpv, ffplay, VLC, browser HW decode, the gst regression) is traced to the gap with file:line evidence; a fix-surface assessment names what work would actually close it; the campaign ships documentation, |
| - | | + | |
| - | * **Refined** (locked 2026-04-30 evening, after the perf invalidation | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | '' | + | '' |
| - | that opened the campaign. '' | + | |
| - | historical context but no longer drives the campaign' | + | |
| - | criterion. The success criterion now is qualitative — the gap | + | |
| - | identification — and is verified at Phase 7 by review of this | + | |
| - | document against a second pair of eyes. | + | |
| ===== 9. References ===== | ===== 9. References ===== | ||
| - | * '' | + | * '' |
| - | | + | * '' |
| - | * '' | + | * '' |
| - | | + | |
| - | * '' | + | |
| - | | + | |
| * '' | * '' | ||
| - | * '' | + | * '' |
| - | | + | * fourier '' |
| - | | + | * DokuWiki: '' |
| - | * fourier '' | + | |
| - | | + | |
| - | * DokuWiki: '' | + | |
| - | | + | |
| * Dev process: '' | * Dev process: '' | ||
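
The §6 argument that "most symptoms lifted" is the wrong ranking metric can be sketched in a few lines of Python. This is an illustration only, not part of the campaign's tooling. The lift sets below are transcribed from the §6 ranking prose (A lifts S4; B, C1 and C2 lift mpv and ffplay, i.e. S1 and S2; D lifts only S1). The table cells are truncated in this view, so a row may lift more than is listed here. Treating S4 as the only in-scope symptom is an assumption drawn from "YouTube in Brave is in S4". The names ''LIFTS'', ''IN_SCOPE'' and the three helper functions are hypothetical.

```python
# Sketch under stated assumptions: lift sets are read off the §6 prose,
# not taken from the (truncated) fix-surface table itself.
LIFTS = {
    "A": {"S4"},          # libva-v4l2-request multiplanar port
    "B": {"S1", "S2"},    # libavcodec drm_prime -> linux-dmabuf-v1 path
    "C1": {"S1", "S2"},   # Mesa upstream panvk/v7 promotion
    "C2": {"S1", "S2"},   # standalone Vulkan layer
    "D": {"S1"},          # compositor-level DRM shim
}
ALL_SYMPTOMS = {"S1", "S2", "S3", "S4", "S5"}
IN_SCOPE = {"S4"}  # assumption: browser HW decode is the only in-scope symptom


def rank_by_symptom_count(lifts):
    """Naive ranking: rows that lift the most symptoms come first."""
    return sorted(lifts, key=lambda row: (-len(lifts[row]), row))


def rank_by_in_scope(lifts, in_scope):
    """Use-case ranking: in-scope coverage first, raw count as tie-breaker."""
    return sorted(
        lifts,
        key=lambda row: (-len(lifts[row] & in_scope), -len(lifts[row]), row),
    )


def unlifted(lifts, symptoms):
    """Symptoms that no row covers."""
    covered = set().union(*lifts.values())
    return sorted(symptoms - covered)


if __name__ == "__main__":
    print("by symptom count:", rank_by_symptom_count(LIFTS))
    print("by in-scope use :", rank_by_in_scope(LIFTS, IN_SCOPE))
    print("lifted by no row:", unlifted(LIFTS, ALL_SYMPTOMS))
```

Under these assumptions the naive metric puts a libplacebo-chain row first, while the in-scope metric puts Row A first, and S3 and S5 are covered by no row. The sketch does not model tractability or cost, which is why it does not reproduce the full A, C2, B, C1, D order given in §6.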
ohm_gl_fix/phase4_2026-04-30.1777639179.txt.gz · Last modified: by markus_fritsche
