| Next revision | Previous revision |
| ohm_gl_fix:phase4_2026-05-01 [2026/05/01 11:52] – Phase 4 from scratch — libva-v4l2-request multiplanar port plan markus_fritsche | ohm_gl_fix:phase4_2026-05-01 [2026/05/01 18:00] (current) – Step 0 finding folded in: Step 2 confirmed needed (KWin advertises wp_fractional_scale_manager_v1 → ShouldUseOverlayDelegation forced false) markus_fritsche |
|---|
| ====== ohm_gl_fix — Phase 4, 2026-05-01 ====== | ====== ohm_gl_fix — Phase 4, 2026-05-01 ====== |
| |
| This page replaces both prior Phase 4 drafts: the original | This page replaces both prior Phase 4 drafts: the original [[ohm_gl_fix:phase4_2026-04-30|libplacebo fd-cache plan]] (retracted after ''perf record'' showed libplacebo at 0.41 % of CPU and the patched code path not on the hot path) and its in-place revision into a "documentation of the gap" page. Phase 4 is now a //plan//, not an enumeration. It picks one fix surface, names the implementation, states what gets measured at Phase 7, and identifies the loopback edges. |
| [[ohm_gl_fix:phase4_2026-04-30|libplacebo fd-cache plan]] (retracted | |
| after ''perf record'' showed libplacebo at 0.41 % of CPU and the patched | |
| code path not on the hot path) and its in-place revision into a | |
| "documentation of the gap" page. Phase 4 is now a //plan//, not an | |
| enumeration. It picks one fix surface, names the implementation, states | |
| what gets measured at Phase 7, and identifies the loopback edges. | |
| |
| The driver of this rewrite: Phase 1 was refined on 2026-05-01 with | The driver of this rewrite: Phase 1 was refined on 2026-05-01 with machine-readable criteria ([[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] §4 — C1 drops, C2 LLC-load-misses, C3 DRM_IOCTL/sec, C4 boundary fd-passing) and Phase 3 was rebuilt on the same day with empirically-grounded boundary characterisation ([[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] §3, §4). With both anchors in place, Phase 4 can commit. |
| machine-readable criteria | |
| ([[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] §4 — C1 | > **2026-05-01 amendment** (post-[[ohm_gl_fix:phase5_review_2026-05-01|Phase 5 review]]): Q1 (Brave's V4L2VideoDecoder reachability) closed by the ''strings /opt/brave-bin/brave'' deep-dive. ''UseChromeOSDirectVideoDecoder'' / ''V4L2FlatStatelessVideoDecoder'' / ''V4L2StatelessVideoDecoder'' / ''V4L2H264Decoder'' all return **0 matches** in this Brave build (Arch Linux ARM brave-bin, 2026-04-30). The single ''V4L2VideoDecoder'' string match is vestigial; all actual V4L2 source-line strings in the binary are camera-capture (''v4l2_capture_delegate.cc'', ''libtegrav4l2.so''), not video-decode. The V4L2 direct-decode path is **not compiled in** for this build, so fix surface A (libva-v4l2-request multiplanar) stands. Q2 (Step 0 methodology fix) and Q3 (Step 0.5 kernel UAPI surface audit + R1 trigger revision) are folded into §3 and §6 below. Q4 (test corpus extension) lives in [[ohm_gl_fix:phase4_step1_test_corpus_2026-05-01]]. |
| drops, C2 LLC-load-misses, C3 DRM_IOCTL/sec, C4 boundary fd-passing) and | |
| Phase 3 was rebuilt on the same day with empirically-grounded boundary | > **2026-05-01 Step 0 finding** (Phase 6 dipping): Step 2 is **confirmed needed**, not conditional. Chromium M138's overlay-delegation gate at ''ui/ozone/platform/wayland/host/wayland_connection.cc'' ''ShouldUseOverlayDelegation()'' lines 495-509 includes the conjunct ''!fractional_scale_manager_v1()''. KWin advertises ''wp_fractional_scale_manager_v1'' (verified empirically in mpv's verbose log), so that conjunct is always false and the gate returns false on KWin Wayland regardless of feature flag. The Step 2 patch site is now named with file:line. Step 0 details: [[ohm_gl_fix:phase6_step0_chromium_wayland_routing_2026-05-01|phase6/step0_chromium_wayland_routing_2026-05-01]] (companion: [[ohm_gl_fix:phase6_step0_5_uapi_audit_2026-05-01|phase6/step0_5_uapi_audit_2026-05-01]]). |
| characterisation | |
| ([[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] §3, §4). With | |
| both anchors in place, Phase 4 can commit. | |
| |
| ===== 1. What this Phase 4 is targeting ===== | ===== 1. What this Phase 4 is targeting ===== |
| |
| [[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] §2 named the | [[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] §2 named the in-scope workloads: |
| in-scope workloads: | |
| |
| * YouTube / HTML5 ''<video>'' in Brave | * YouTube / HTML5 ''<video>'' in Brave |
| ''VaapiVideoDecoder → libva → libva-v4l2-request → V4L2 stateless'' | ''VaapiVideoDecoder → libva → libva-v4l2-request → V4L2 stateless'' |
| |
| This is //not// the libavcodec hwaccel chain that mpv, ffplay, and VLC | This is //not// the libavcodec hwaccel chain that mpv, ffplay, and VLC use. Browsers vendor their own ffmpeg fork and gate hardware video decode through libva. Therefore: the fix surfaces from the prior Phase 4 enumeration that touch libavcodec (B "''libavcodec drm_prime → linux-dmabuf-v1''") or libplacebo (C2 "''panvk-1.2-fakeshim''") **do not lift the in-scope use cases**, however structurally clean they look in isolation. The empirical entrypoint for Brave is libva, and libva on this hardware fails at ''vaInitialize'' ([[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] §1, §8; also fourier ''README'' L236-281). |
| use. Browsers vendor their own ffmpeg fork and gate hardware video | |
| decode through libva. Therefore: the fix surfaces from the prior | |
| Phase 4 enumeration that touch libavcodec (B "''libavcodec drm_prime → | |
| linux-dmabuf-v1''") or libplacebo (C2 "''panvk-1.2-fakeshim''") **do | |
| not lift the in-scope use cases**, however structurally clean they | |
| look in isolation. The empirical entrypoint for Brave is libva, and | |
| libva on this hardware fails at ''vaInitialize'' | |
| ([[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] §1, §8; | |
| also fourier ''README'' L236-281). | |
| |
| Phase 4 commits to **fix surface A: libva-v4l2-request multiplanar port** | Phase 4 commits to **fix surface A: libva-v4l2-request multiplanar port** as the primary direction, with an explicit pre-implementation research step (Step 0) that may discover the campaign needs a follow-up Chromium-side patch. |
| as the primary direction, with an explicit pre-implementation research | |
| step (Step 0) that may discover the campaign needs a follow-up | |
| Chromium-side patch. | |
| |
| ===== 2. Decision rationale ===== | ===== 2. Decision rationale ===== |
| Three reasons to commit to A specifically: | Three reasons to commit to A specifically: |
| |
| - **It is the only fix surface that touches Brave's actual chain.** | - **It is the only fix surface that touches Brave's actual chain.** B (libavcodec) and C2 (libplacebo Vulkan layer) target consumers Markus does not use. D (compositor DRM-shim) is a Wayland-protocol proposal that does not exist upstream and would not survive a Phase 5 review. |
| B (libavcodec) and C2 (libplacebo Vulkan layer) target consumers | - **Substantial groundwork exists.** fourier's local [[https://github.com/bootlin/libva-v4l2-request|libva-v4l2-request]] patches (on ohm at ''~/fourier-test/libva-patches/fourier-local.patch'') already get the bootlin source past format enumeration on the multiplanar hantro device (fourier ''README'' L240-256). The starting point is not "from zero" — it is "from probe-passing, multiplanar buffer setup still single-plane". |
| Markus does not use. D (compositor DRM-shim) is a Wayland-protocol | - **It addresses the structural gap, not the symptom.** Phase 1 revised's criteria all hold globally for libva consumers once A is delivered, not just for one application. fourier already flagged this as the right axis ("//browser HW video decode on ohm is parked until a multiplanar libva-v4l2-request rework exists, either ours or someone else's//", fourier ''README'' L276-281). |
| proposal that does not exist upstream and would not survive a | |
| Phase 5 review. | |
| - **Substantial groundwork exists.** fourier's local | |
| [[https://github.com/bootlin/libva-v4l2-request|libva-v4l2-request]] | |
| patches (on ohm at ''~/fourier-test/libva-patches/fourier-local.patch'') | |
| already get the bootlin source past format enumeration on the | |
| multiplanar hantro device (fourier ''README'' L240-256). The | |
| starting point is not "from zero" — it is "from probe-passing, | |
| multiplanar buffer setup still single-plane". | |
| - **It addresses the structural gap, not the symptom.** Phase 1 | |
| revised's criteria all hold globally for libva consumers once A | |
| is delivered, not just for one application. fourier already | |
| flagged this as the right axis ("//browser HW video decode on | |
| ohm is parked until a multiplanar libva-v4l2-request rework | |
| exists, either ours or someone else's//", fourier ''README'' | |
| L276-281). | |
| |
| Note explicitly: A alone may not suffice. Once the libva chain | Note explicitly: A alone may not suffice. Once the libva chain produces an NV12 dmabuf for Brave's ''VaapiVideoDecoder'', the **display side** (Chromium's GPU-process compositor) still has to present that dmabuf without per-frame Mesa GL+DRM round-trips (Phase 1 revised's C3, ≤ 100 DRM_IOCTL/sec). Whether Chromium does this on Wayland today, or needs an additional patch, is the open question Step 0 below answers before code is written. |
| produces a NV12 dmabuf for Brave's ''VaapiVideoDecoder'', the | |
| **display side** — Chromium's GPU-process compositor — still has to | |
| present that dmabuf without per-frame Mesa GL+DRM round-trips | |
| (Phase 1 revised's C3, ≤100 DRM_IOCTL/sec). Whether Chromium does | |
| this on Wayland today, or needs an additional patch, is the open | |
| question Step 0 below answers before code is written. | |
| |
| ===== 3. Implementation plan ===== | ===== 3. Implementation plan ===== |
| ==== Step 0 — Research: characterise Chromium's Wayland video presentation path ==== | ==== Step 0 — Research: characterise Chromium's Wayland video presentation path ==== |
| |
| **Duration:** 3–7 days. **Output:** decision document attached to | **Duration:** 3–7 days. **Output:** decision document attached to this Phase 4 plan, naming whether Step 2 is required. |
| this Phase 4 plan, naming whether Step 2 is required. | |
| |
| Question to answer: when ''VaapiVideoDecoder'' produces a | Question to answer: when ''VaapiVideoDecoder'' produces a ''NativePixmap'' (= dmabuf-backed VA-API surface) on ''chrome --ozone-platform=wayland'', does Chromium's GPU process present it via ''zwp_linux_dmabuf_v1'' subsurface (Wayland direct overlay) or via Skia GL composite onto the page's main surface? |
| ''NativePixmap'' (= dmabuf-backed VA-API surface) on | |
| ''chrome --ozone-platform=wayland'', does Chromium's GPU process | |
| present it via ''zwp_linux_dmabuf_v1'' subsurface (Wayland direct | |
| overlay) or via Skia GL composite onto the page's main surface? | |
| |
| Concrete sub-tasks: | Concrete sub-tasks: |
| |
| - **Source archaeology** in Chromium (current Brave-bin's underlying | - **Source archaeology** in Chromium (current Brave-bin's underlying Chromium version, likely M138-class): |
| Chromium version, likely M138-class): | * ''ui/ozone/platform/wayland/host/wayland_buffer_manager_host.cc'' and surrounding files — Wayland buffer attachment. |
| * ''ui/ozone/platform/wayland/host/wayland_buffer_manager_host.cc'' | * ''components/viz/service/display_embedder/'' — overlay candidate surface processing. |
| and surrounding files — Wayland buffer attachment. | |
| * ''components/viz/service/display_embedder/'' — overlay candidate | |
| surface processing. | |
| * ''media/gpu/vaapi/'' — VA-API surface to native-pixmap conversion. | * ''media/gpu/vaapi/'' — VA-API surface to native-pixmap conversion. |
| * ''gpu/ipc/service/gpu_video_decode_accelerator_helpers.cc'' — | * ''gpu/ipc/service/gpu_video_decode_accelerator_helpers.cc'' — dmabuf flow from decoder to compositor. |
| dmabuf flow from decoder to compositor. | - **Static source trace** (replaces the SW-decode synthesis test that was here in the pre-Phase-5 draft — Phase 5 reviewer flagged it as broken-by-design: SW-decode produces shmem buffers not ''NativePixmap'' dmabufs, so the test cannot validate whether a hardware-decode ''NativePixmap'' would be routed via ''zwp_linux_dmabuf_v1''). Trace the path ''VaapiPicture / VaapiPictureNativePixmapOzone → NativePixmap → GpuMemoryBuffer → SharedImageBacking → wayland_buffer_manager_host'' in Chromium M138-class. Determine **statically** whether the subsurface path is gated on ''GpuMemoryBufferType == NATIVE_PIXMAP'' or on some other condition. Cite source ''file:line'' in the decision document. |
| - **Empirical synthesis test:** with current Brave (libva broken), | - **Stub libva driver test (optional, only if static analysis is inconclusive).** Build a stub libva backend that exports VA surfaces backed by a linear dma-heap allocation (no hantro needed), which Chromium can wrap as a ''NativePixmap''. Run Brave with ''LIBVA_DRIVER_NAME'' set to the stub's name (and ''LIBVA_DRIVERS_PATH'' pointing at its directory if it is not installed system-wide). Observe whether the GPU process imports the dmabuf into GL (''PRIME_FD_TO_HANDLE'' ioctl on the DRM fd) or hands it to the compositor (''SCM_RIGHTS'' fd-passing on the Wayland socket). This isolates the compositor routing question from the decode question. |
| can we coax Chromium into the dmabuf-overlay path using a | - **Feature flag inventory:** check ''chrome://flags'' and ''--enable-features='' for relevant entries: ''VaapiVideoDecoder'', ''VaapiVideoDecodeLinuxGL'', ''UseChromeOSDirectVideoDecoder'', ''UseDelegatedCompositing'', ''DelegatedCompositingLimitToUi'', ''AcceleratedVideoDecodeLinuxGL'', ''wayland-screen-coordinates'', ''ozone-overlay-priority-hint''. |
| different content source — e.g. WebGL canvas, or a video element | |
| with software decode where the decoded YUV is uploaded once to | |
| a GL texture and we observe whether composite uses the texture | |
| via Wayland subsurface or via Skia main-surface compositing? | |
| Look at ''DRM_IOCTL_*'' rate and ''SCM_RIGHTS'' fd-passing on the | |
| GPU process (already instrumented in | |
| [[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] §3). | |
| - **Feature flag inventory:** check ''chrome://flags'' and | |
| ''--enable-features='' for relevant entries: | |
| ''VaapiVideoDecoder'', ''VaapiVideoDecodeLinuxGL'', | |
| ''UseChromeOSDirectVideoDecoder'', ''UseDelegatedCompositing'', | |
| ''DelegatedCompositingLimitToUi'', ''AcceleratedVideoDecodeLinuxGL'', | |
| ''wayland-screen-coordinates'', ''ozone-overlay-priority-hint''. | |
| |
| **Output gate:** decision document records whether Chromium's GPU | **Output gate:** decision document records whether Chromium's GPU process under default flags will route a working VA-API dmabuf to ''zwp_linux_dmabuf_v1'' (Step 2 not needed) or composite via Skia GL (Step 2 needed) — **with the source ''file:line''** that creates the Wayland buffer for a VA-API ''NativePixmap'' explicitly cited (per Phase 5 review Q2 output gate). The decision document attaches to this Phase 4 page after Step 0 completes. |
| process under default flags will route a working VA-API dmabuf to | |
| ''zwp_linux_dmabuf_v1'' (Step 2 not needed) or composite via Skia GL | ==== Step 0.5 — Kernel UAPI surface audit ==== |
| (Step 2 needed). The decision document attaches to this Phase 4 | |
| page after Step 0 completes. | **Duration:** 1–2 days. **Output:** documented control-structure layout that the hantro driver actually consumes. Inserted post-Phase-5-review per Q3 — the V4L2 stateless request-API control payload format on hantro G1/G2 (RK3566) is poorly documented in UAPI headers alone, and a control-payload mismatch produces silent black-frame failures rather than ''EINVAL''. fourier's local libva-v4l2-request patches were validated against the GStreamer codepath's buffer-management model, not libva's allocation model, so they don't pre-empt the question. |
| | |
| | Concrete sub-tasks: |
| | |
| | - ''strace -f -e trace=ioctl -e signal=none -o /tmp/gst_h264.strace gst-launch-1.0 -q filesrc location=bbb_1080p30_h264.mp4 \! qtdemux \! h264parse \! v4l2slh264dec \! fakesink''. If strace truncates the embedded payload-data field, fall back to ''ftrace'' tracepoints on ''vidioc_*'' for fuller capture. |
| | - Extract the exact byte payload of ''VIDIOC_S_EXT_CTRLS'' calls for one I-frame and one P-frame. |
| | - Compare byte-for-byte against the control definitions in the kernel header ''include/uapi/linux/v4l2-controls.h'': the ''struct v4l2_ctrl_h264_*'' payloads behind ''V4L2_CID_STATELESS_H264_DECODE_PARAMS'', ''V4L2_CID_STATELESS_H264_SLICE_PARAMS'', ''V4L2_CID_STATELESS_H264_PRED_WEIGHTS'', and ''V4L2_CID_STATELESS_H264_SCALING_MATRIX'', plus the menu controls ''V4L2_CID_STATELESS_H264_DECODE_MODE'' and ''V4L2_CID_STATELESS_H264_START_CODE''. |
| | - Document the actual hantro driver control-structure layout: field ordering, padding, reference-frame DPB array conventions, ''VIDIOC_STREAMON'' sequencing relative to request fd lifecycle. |
| | |
| | **Output gate:** the documented control-structure layout serves as the per-byte template for the Step 1 ''src/picture.c'' / ''src/h264.c'' work. If the layout diverges from a naive reading of the kernel headers (highly likely on hantro), Step 1 starts from the observed layout, not the header layout. |
| |
| ==== Step 1 — libva-v4l2-request multiplanar port ==== | ==== Step 1 — libva-v4l2-request multiplanar port ==== |
| |
| **Duration:** 4–8 weeks of focused work; the lower end if fourier's | **Duration:** 4–8 weeks of focused work; the lower end if fourier's local patches and Phase 2 §3 substrate (9-fd capture pool, NV12 single-plane 1920×1088 ''sizeimage = 3 655 712'') generalise. The upper end if hantro's request-API control set turns out to need additional reverse-engineering against the kernel driver (''drivers/staging/media/rkvdec/'' / ''drivers/staging/media/hantro/''). |
| local patches and Phase 2 §3 substrate (9-fd capture pool, NV12 | |
| single-plane 1920×1088 ''sizeimage = 3 655 712'') generalise. The | |
| upper end if hantro's request-API control set turns out to need | |
| additional reverse-engineering against the kernel driver | |
| (''drivers/staging/media/rkvdec/'' / ''drivers/staging/media/hantro/''). | |
| |
| **Source basis:** | **Source basis:** |
| |
| * Upstream fork: [[https://github.com/bootlin/libva-v4l2-request]] | * Upstream fork: [[https://github.com/bootlin/libva-v4l2-request]] (last meaningful commit ~years ago per fourier; confirm at Step 1 start). |
| (last meaningful commit ~years ago per fourier; confirm at | * fourier local patches: ''~/fourier-test/libva-patches/fourier-local.patch'' — HEVC stripped (RK3566 has no HEVC HW), missing ''#include "utils.h"'' in ''src/h264.c'' restored, ''src/config.c'' format-enumeration extended to try both ''V4L2_BUF_TYPE_VIDEO_OUTPUT'' and ''V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE'' (fourier ''README'' L240-256). |
| Step 1 start). | |
| * fourier local patches: | |
| ''~/fourier-test/libva-patches/fourier-local.patch'' — HEVC | |
| stripped (RK3566 has no HEVC HW), missing ''#include "utils.h"'' | |
| in ''src/h264.c'' restored, ''src/config.c'' format-enumeration | |
| extended to try both ''V4L2_BUF_TYPE_VIDEO_OUTPUT'' and | |
| ''V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE'' (fourier ''README'' | |
| L240-256). | |
| |
| **Concrete work surface, in order:** | **Concrete work surface, in order:** |
| |
| - **Fork + import groundwork.** Set up | - **Fork + import groundwork.** Set up ''marfrit-packages/libva-v4l2-request-ohm-gl-fix/''. Apply fourier's patches as the patch-zero baseline. In the PKGBUILD, set ''pkgname=libva-v4l2-request-ohm-gl-fix'' and list ''libva-v4l2-request'' under ''provides'', ''conflicts'', and ''replaces''. Build via fermi (Gitea Actions runner, archlinuxarm aarch64). |
| ''marfrit-packages/libva-v4l2-request-ohm-gl-fix/''. Apply | - **Multiplanar buffer setup in ''src/v4l2.c''.** Replace single-plane ''v4l2_buffer'' / ''v4l2_format'' usage with MPLANE variants (''VIDIOC_S_FMT'' on ''V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE'' for bitstream input, ''V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE'' for NV12 output; ''VIDIOC_QBUF'' / ''VIDIOC_DQBUF'' with ''planes[]'' arrays). The Phase 2 §3 strace evidence (''ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime'' producing 9 ''VIDIOC_EXPBUF''s with NV12 single-plane ''sizeimage = 3 655 712'') is the per-buffer template. |
| fourier's patches as the patch-zero baseline. ''pkgname= | - **Multiplanar context lifecycle in ''src/context.c''.** Replace the single-plane buffer-pool setup in ''vaCreateContext'' with a multiplanar pool that mirrors the ''VIDIOC_REQBUFS+CREATE_BUFS, count=1''-loop pattern Phase 2 captured. Capture ring depth = 9 (per Phase 2 §3). Output ring (bitstream input) depth = 4. |
| libva-v4l2-request-ohm-gl-fix'', ''provides+conflicts+replaces= | - **Multiplanar slice submission in ''src/picture.c'' and ''src/h264.c''.** Adapt request-API frame submission: build the ''V4L2_CID_STATELESS_H264_*'' control payloads (SPS, PPS, decode params, slice params, scaling matrix) and attach them to the request fd, ''VIDIOC_QBUF'' the bitstream input MPLANE buffer with the request fd, then ''VIDIOC_DQBUF'' the capture MPLANE NV12 buffer after decode. The kernel UAPI is in ''include/uapi/linux/v4l2-controls.h'' (note: these controls carried the staging names ''V4L2_CID_MPEG_VIDEO_H264_*'' before the stateless H.264 UAPI was stabilised and renamed to ''V4L2_CID_STATELESS_H264_*''; the HEVC controls were renamed in a separate, later change). |
| libva-v4l2-request''. Build via fermi (Gitea Actions runner | - **NativePixmap export.** Ensure each capture-side dmabuf fd flows out of libva to the caller (Chromium's ''VaapiPicture'') as a NativePixmap with the right DRM format (''DRM_FORMAT_NV12'') and modifier (''DRM_FORMAT_MOD_LINEAR'' per Phase 3 Finding 1). Verify the modifier matches what Chromium will accept. |
| archlinuxarm aarch64). | |
| - **Multiplanar buffer setup in ''src/v4l2.c''.** Replace | |
| single-plane ''v4l2_buffer'' / ''v4l2_format'' usage with | |
| MPLANE variants (''VIDIOC_S_FMT'' on | |
| ''V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE'' for bitstream input, | |
| ''V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE'' for NV12 output; | |
| ''VIDIOC_QBUF'' / ''VIDIOC_DQBUF'' with ''planes[]'' arrays). | |
| The Phase 2 §3 strace evidence (''ffmpeg -hwaccel | |
| v4l2request -hwaccel_output_format drm_prime'' producing | |
| 9 ''VIDIOC_EXPBUF''s with NV12 single-plane ''sizeimage = | |
| 3 655 712'') is the per-buffer template. | |
| - **Multiplanar context lifecycle in ''src/context.c''.** | |
| Replace ''vaCreateContext'' single-plane buffer-pool setup | |
| with multiplanar pool that mirrors the | |
| ''VIDIOC_REQBUFS+CREATE_BUFS, count=1''-loop pattern Phase 2 | |
| captured. Capture ring depth = 9 (per Phase 2 §3). Output | |
| ring (bitstream input) depth = 4. | |
| - **Multiplanar slice submission in ''src/picture.c'' and | |
| ''src/h264.c''.** Adapt request-API frame submission: build | |
| ''V4L2_CTRL_*_HEADER'' control payloads (SPS, PPS, decode | |
| params, slice params, scaling matrix) attached to the request | |
| fd, ''VIDIOC_QBUF'' the bitstream input MPLANE buffer with the | |
| request fd, ''VIDIOC_DQBUF'' the capture MPLANE NV12 buffer | |
| after decode. The kernel UAPI is in | |
| ''include/uapi/linux/v4l2-controls.h'' | |
| ''V4L2_CID_STATELESS_H264_*'' (note: the older | |
| ''V4L2_CID_MPEG_VIDEO_HEVC_*'' was renamed; H264 was renamed | |
| to ''V4L2_CID_STATELESS_H264_*'' on the same wave). | |
| - **NativePixmap export.** Ensure each capture-side dmabuf fd | |
| flows out of libva to the caller (Chromium's | |
| ''VaapiPicture'') as a NativePixmap with the right DRM format | |
| (''DRM_FORMAT_NV12'') and modifier (''DRM_FORMAT_MOD_LINEAR'' | |
| per Phase 3 Finding 1). Verify the modifier matches what | |
| Chromium will accept. | |
| - **Test corpus.** Run against: | - **Test corpus.** Run against: |
| * ''bbb_1080p30_h264.mp4'' (the campaign's reference clip). | * ''bbb_1080p30_h264.mp4'' (the campaign's reference clip). |
| * ''vainfo'' (libva self-test) on | * ''vainfo'' (libva self-test) on ''/dev/dri/renderD128'' equivalent. |
| ''/dev/dri/renderD128'' equivalent. | * Any failure cases noted by fourier (''README'' L319-340, "test corpus" — pull list at Step 1 start). |
| * Any failure cases noted by fourier (''README'' L319-340, | - **Package + publish.** PKGBUILD finalised, builds on fermi, pushes to marfrit-packages pacman repo. |
| "test corpus" — pull list at Step 1 start). | |
| - **Package + publish.** PKGBUILD finalised, builds on fermi, | ==== Step 2 — Chromium display-side patch (confirmed needed by Step 0 finding 2026-05-01) ==== |
| pushes to marfrit-packages pacman repo. | |
| | **Status:** Step 0 found that Chromium M138's overlay-delegation system is force-disabled on KWin Wayland by a single predicate. Step 2 is no longer conditional. Trigger met. |
| | |
| | **Patch site:** ''chromium/ui/ozone/platform/wayland/host/wayland_connection.cc'' ''WaylandConnection::ShouldUseOverlayDelegation()'' lines 495-509: |
| | |
| | <code c> |
| | bool WaylandConnection::ShouldUseOverlayDelegation() const { |
| | bool should_use_overlay_delegation = |
| | IsWaylandOverlayDelegationEnabled() && !fractional_scale_manager_v1(); |
| | should_use_overlay_delegation &= !!single_pixel_buffer(); |
| | return should_use_overlay_delegation; |
| | } |
| | </code> |
| | |
| | The ''!fractional_scale_manager_v1()'' conjunct is the load-bearing failure. KWin advertises ''wp_fractional_scale_manager_v1'', so the conjunct evaluates to false and overlay delegation is force-disabled regardless of feature flag. |
| |
| ==== Step 2 (conditional) — Chromium display-side patch ==== | **Patch shape (recommended, minimal blast radius):** surface-state-gated relaxation. Replace ''!fractional_scale_manager_v1()'' with a check that returns true when the surface's currently applied scale is an integer (1.0, 2.0, etc.). The protocol may still be advertised; the check only requires that the relevant surface is not //using// a fractional scale right now. This preserves correctness when fractional scale //is// in fact active for the surface. |
| |
| **Trigger:** Step 0 finds Chromium does not auto-route VA-API | Two alternative shapes were considered and parked: (a) drop the gate entirely and let the Viz ''OverlayCandidate'' validators reject candidates needing viewport-subpixel destinations (bigger refactor, touches Viz code); (b) add a feature-flag bypass (crudest, relies on the user knowing the trade-off). See [[ohm_gl_fix:phase6_step0_chromium_wayland_routing_2026-05-01|Step 0 doc §"Patch shape"]] for full reasoning. |
| NativePixmaps through ''zwp_linux_dmabuf_v1'' on Wayland under the | |
| default feature flags — i.e. it composites via Skia GL and Phase 1 | |
| revised's C3 (≤ 100 DRM_IOCTL/sec) cannot be reached from Step 1 | |
| alone. | |
| |
| **Shape (deferred — exact scope set by Step 0):** patch Chromium | **Open Step 2 sub-task:** characterise the Viz-side per-buffer filtering (''OverlayCandidate'' validation in ''components/viz/service/display/overlay_processor*.cc'') that becomes the next-level gate once the ''ShouldUseOverlayDelegation()'' gate is lifted. Not blocking Step 2 implementation; needed before Phase 7 can predict whether C3 is met by the patch alone or also requires a Viz tweak. |
| to route VAAPI NativePixmaps as Wayland subsurfaces for video | |
| elements; or enable a feature flag set that does this. Build as | |
| ''chromium-ohm-gl-fix'' (or ''brave-ohm-gl-fix'') on | |
| marfrit-packages. | |
| |
| If Step 0 finds Step 2 is //not// needed, Phase 4 implementation | **Build target:** ''chromium-ohm-gl-fix'' or ''brave-ohm-gl-fix'' on marfrit-packages. ABI-compatible patch (small change to one .cc); no soname change. Substantial build cost (Chromium full rebuild on aarch64 takes hours-to-days; consider building on a beefier ARM host or distcc). |
| ends at Step 1 + Step 3. | |
| |
| ==== Step 3 — Verification (Phase 7 prep) ==== | ==== Step 3 — Verification (Phase 7 prep) ==== |
| After Step 1 (and conditionally Step 2) lands on ohm: | After Step 1 and Step 2 land on ohm: |
| |
| - Reinstall: ''sudo pacman -U | - Reinstall: ''sudo pacman -U libva-v4l2-request-ohm-gl-fix-*.pkg.tar.zst'' and ''chromium-ohm-gl-fix-*'' (Step 2, confirmed needed by the Step 0 finding). |
| libva-v4l2-request-ohm-gl-fix-*.pkg.tar.zst'' | - Re-run [[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] §3 v2 strace (''ioctl,mmap,munmap,sendmsg,recvmsg'') and §4 perf-stat (''cache-misses,LLC-load-misses,cycles,instructions'') on Brave + ''bbb_1080p30_h264.mp4'' over a 60 s steady-state window. Capture renderer + GPU-process targets. |
| (and conditionally ''chromium-ohm-gl-fix-*''). | |
| - Re-run [[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] | |
| §3 v2 strace (''ioctl,mmap,munmap,sendmsg,recvmsg'') and §4 | |
| perf-stat (''cache-misses,LLC-load-misses,cycles,instructions'') | |
| on Brave + ''bbb_1080p30_h264.mp4'' over a 60 s steady-state | |
| window. Capture renderer + GPU-process targets. | |
| - Check Phase 1 revised C1-C4: | - Check Phase 1 revised C1-C4: |
| * **C1** drops ≤ 10 over 60 s, drops_post_warmup = 0 | * **C1** drops ≤ 10 over 60 s, drops_post_warmup = 0 |
| * **C2** LLC-load-misses ≤ 9 M / 10 s | * **C2** LLC-load-misses ≤ 9 M / 10 s |
| * **C3** DRM_IOCTL/sec ≤ 100 | * **C3** DRM_IOCTL/sec ≤ 100 |
| * **C4** at least one of (a) ''VIDIOC_EXPBUF'' + ''SCM_RIGHTS'' | * **C4** at least one of (a) ''VIDIOC_EXPBUF'' + ''SCM_RIGHTS'' OR (b) ''PRIME_FD_TO_HANDLE'' from V4L2 dmabuf observed |
| OR (b) ''PRIME_FD_TO_HANDLE'' from V4L2 dmabuf observed | |
| - Append result row(s) to ''metrics.csv'' as ''phase7_verify_*''. | - Append result row(s) to ''metrics.csv'' as ''phase7_verify_*''. |
| |
| **Touched:** | **Touched:** |
| |
| * libva-v4l2-request — substantial multiplanar rewrite of | * libva-v4l2-request — substantial multiplanar rewrite of ''src/v4l2.c'', ''src/context.c'', ''src/picture.c'', ''src/h264.c''. Public ABI preserved (libva-driver entrypoints unchanged); internal restructuring only. |
| ''src/v4l2.c'', ''src/context.c'', ''src/picture.c'', | * marfrit-packages — new ''libva-v4l2-request-ohm-gl-fix/'' tree. Conditionally: ''chromium-ohm-gl-fix/'' (Step 2 only). |
| ''src/h264.c''. Public ABI preserved (libva-driver entrypoints | * ohm system — ''pacman -U'' replaces stock libva-v4l2-request (and conditionally Chromium/Brave) with the campaign packages. |
| unchanged); internal restructuring only. | |
| * marfrit-packages — new ''libva-v4l2-request-ohm-gl-fix/'' tree. | |
| Conditionally: ''chromium-ohm-gl-fix/'' (Step 2 only). | |
| * ohm system — ''pacman -U'' replaces stock libva-v4l2-request | |
| (and conditionally Chromium/Brave) with the campaign packages. | |
| |
| **Not touched:** | **Not touched:** |
| |
| * mpv, ffplay, VLC, gst-* — these remain on their current paths. | * mpv, ffplay, VLC, gst-* — these remain on their current paths. Their users will not benefit from Phase 4. Out of campaign scope. |
| Their users will not benefit from Phase 4. Out of campaign scope. | * Mesa / panfrost / panvk / libplacebo — their state is unchanged. The ''panvk-1.2-fakeshim'' option from prior Phase 4 drafts is not pursued in this iteration. |
| * Mesa / panfrost / panvk / libplacebo — their state is unchanged. | * libavcodec / ffmpeg — Chromium statically vendors its own; the system ''ffmpeg-v4l2-request-git'' package is unchanged. |
| The ''panvk-1.2-fakeshim'' option from prior Phase 4 drafts is | * Kernel drivers (hantro-vpu, panfrost). Step 1 builds against the existing UAPI surface; no kernel work. |
| not pursued in this iteration. | * KWin / Wayland protocol. Step 1 produces dmabuf fds; existing KWin ''zwp_linux_dmabuf_v1'' implementation consumes them. No compositor work. |
| * libavcodec / ffmpeg — Chromium statically vendors its own; the | * The S5 regression ([[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] §6 / §8 — gst-launch waylandsink ~0.3 drops/sec on today's stack vs. fourier 2026-04-24's 0/62). Separate iteration if pursued. |
| system ''ffmpeg-v4l2-request-git'' package is unchanged. | |
| * Kernel drivers (hantro-vpu, panfrost). Step 1 builds against the | |
| existing UAPI surface; no kernel work. | |
| * KWin / Wayland protocol. Step 1 produces dmabuf fds; existing | |
| KWin ''zwp_linux_dmabuf_v1'' implementation consumes them. | |
| No compositor work. | |
| * The S5 regression | |
| ([[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] | |
| §6 / §8 — gst-launch waylandsink ~0.3 drops/sec on today's | |
| stack vs. fourier 2026-04-24's 0/62). Separate iteration if | |
| pursued. | |
| |
| ===== 5. Predicted outcome (against Phase 1 revised C1-C4) ===== | ===== 5. Predicted outcome (against Phase 1 revised C1-C4) ===== |
| |
| If Step 0 + Step 1 deliver and Step 2 turns out unnecessary | If Step 0 + Step 1 deliver and Step 2 turns out unnecessary (optimistic case): |
| (optimistic case): | |
| |
| ^ Criterion ^ Current (Brave SW path) ^ Predicted (Phase 4 delivered) ^ How verified ^ | ^ Criterion ^ Current (Brave SW path) ^ Predicted (Phase 4 delivered) ^ How verified ^ |
| | **C4** boundary fd-passing | NO (libva fails, no V4L2 path engaged) | **YES** — ''VIDIOC_EXPBUF'' from libva, then either ''SCM_RIGHTS'' to KWin or ''PRIME_FD_TO_HANDLE'' to GL (depending on Step 2 outcome) | strace v2 boundary inspection | | | **C4** boundary fd-passing | NO (libva fails, no V4L2 path engaged) | **YES** — ''VIDIOC_EXPBUF'' from libva, then either ''SCM_RIGHTS'' to KWin or ''PRIME_FD_TO_HANDLE'' to GL (depending on Step 2 outcome) | strace v2 boundary inspection | |
| |
| If Step 2 //is// required, the same outcome but reached via Step 1 + | If Step 2 //is// required, the same outcome but reached via Step 1 + Step 2 in sequence, with Step 1's standalone result being C1+C2 met and C3+C4 partially met (Level 1 zero-copy at the decode boundary; Level 2 still not at the compositor boundary). |
| Step 2 in sequence, with Step 1's standalone result being C1+C2 met | |
| and C3+C4 partially met (Level 1 zero-copy at the decode boundary; | |
| Level 2 still not at the compositor boundary). | |
| |
| ===== 6. Risks and mitigations ===== | ===== 6. Risks and mitigations ===== |
| |
| - **R1 — Multiplanar port takes longer than 8 weeks.** V4L2 | - **R1 — Multiplanar port takes longer than 8 weeks.** V4L2 stateless API + request-API + hantro-specific control set is intricate. //Mitigation:// scope to H.264 only initially. HEVC is moot (RK3566 hantro has no HEVC HW). VP8 / VP9 / AV1 follow only if H.264 lands cleanly. **Slip trigger (revised post-Phase 5 review Q3):** any sub-task in Step 1 produces silent black frames or no decoder output for **>3 days** — that is the observable early signal of a control-payload mismatch (the most likely failure mode), and it is materially earlier than calendar-slip detection. Calendar slip alone (>3 weeks) is insufficient as a trigger because silent corruption can disguise itself as a build/integration problem for a long time. Surface either trigger to Markus for re-scoping. |
| stateless API + request-API + hantro-specific control set is | - **R2 — Chromium routes VA-API NativePixmap through Skia GL on Wayland by default** — **realised, not just risked.** Step 0 (2026-05-01) found that the gating predicate ''WaylandConnection::ShouldUseOverlayDelegation()'' (lines 495-509) is forced to false on KWin because KWin advertises ''wp_fractional_scale_manager_v1''. Step 2 is now in scope unconditionally; see §3 Step 2 above for patch site + shape. //Mitigation status:// activated. If Step 2 itself looks likely to exceed 2 months (Chromium build cost dominates), reconsider whether to ship Step 1 alone with C1+C2 met and document C3 as still missing. |
| intricate. //Mitigation:// scope to H.264 only initially. HEVC | - **R3 — hantro's H.264 conformance is incomplete.** Some streams (interlaced, certain profile/level combinations, Hi10P) may fail. //Mitigation:// cross-check against fourier's ''gst v4l2slh264dec'' working output on the same clip — that path uses the same kernel driver and is a known-good reference. Use the test corpus from fourier ''README'' L319-340 once enumerated. |
| is moot (RK3566 hantro has no HEVC HW). VP8 / VP9 / AV1 follow | - **R4 — KWin's ''zwp_linux_dmabuf_v1'' modifier handling on the NV12 ''DRM_FORMAT_MOD_LINEAR'' that hantro produces.** Phase 3 Finding 1 already showed all panvk modifiers carry ''external_only=1''; that's a panvk-side property, but KWin's own modifier acceptance for NV12 is independent. //Mitigation:// cross-check by running ''gst-launch v4l2slh264dec → waylandsink'' on today's stack — that path produces the same modifier and is accepted by KWin (the S1 zero-copy reference). If S1 still works, KWin's acceptance is fine for the Step 1 output. |
| only if H.264 lands cleanly. If a single sub-task slips by >3 | - **R5 — fourier's libva-v4l2-request local patches were against an older bootlin tree.** May not apply cleanly to current upstream. //Mitigation:// start by rebasing fourier's patches on current upstream as the first sub-task of Step 1. If upstream has moved more than expected, fall back to fourier's snapshot. |
| weeks, surface to Markus for re-scoping. | - **R6 — Chromium's VAAPI gating** (''VaapiVideoDecoder'', ''VaapiIgnoreDriverChecks''). The driver-check path inspects the libva driver's reported profile set. fourier already saw ''vainfo'' enumerate H.264 profiles successfully with the probe patch; the multiplanar Step 1 should preserve that. //Mitigation:// after Step 1, re-run ''LIBVA_DRIVER_NAME=v4l2_request LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 vainfo'' to confirm profile enumeration still passes. Then Brave's ''--enable-features=VaapiVideoDecoder,VaapiIgnoreDriverChecks'' invocation should engage. |
| - **R2 — Chromium routes VA-API NativePixmap through Skia GL on | |
| Wayland by default** (Step 0 negative finding). //Mitigation:// | |
| Step 2 patches Chromium. Engineering cost goes up materially | |
| but campaign scope still tractable. If Step 2 itself looks | |
| >2 months, reconsider whether to ship Step 1 alone with C1+C2 | |
| met and document C3 as still missing. | |
| - **R3 — hantro's H.264 conformance is incomplete.** Some streams | |
| (interlaced, certain profile/level combinations, | |
| Hi10P) may fail. //Mitigation:// | |
| cross-check against fourier's ''gst v4l2slh264dec'' working | |
| output on the same clip — that path uses the same kernel | |
| driver and is a known-good reference. Use the test corpus from | |
| fourier ''README'' L319-340 once | |
| enumerated. | |
| - **R4 — KWin's ''zwp_linux_dmabuf_v1'' modifier handling on the | |
| NV12 ''DRM_FORMAT_MOD_LINEAR'' that hantro produces.** Phase 3 | |
| Finding 1 already showed all panvk modifiers carry | |
| ''external_only=1''; that's a panvk-side property, but KWin's | |
| own modifier acceptance for NV12 is independent. //Mitigation:// | |
| cross-check by running ''gst-launch v4l2slh264dec → waylandsink'' | |
| on today's stack — that path produces the same modifier and is | |
| accepted by KWin (the S1 zero-copy reference). If S1 still | |
| works, KWin's acceptance is fine for the Step 1 output. | |
| - **R5 — fourier's libva-v4l2-request local patches were against | |
| an older bootlin tree.** May not apply cleanly to current | |
| upstream. //Mitigation:// start by rebasing fourier's patches | |
| on current upstream as the first sub-task of Step 1. If | |
| upstream has moved more than expected, fall back to | |
| fourier's snapshot. | |
| - **R6 — Chromium's VAAPI gating** (''VaapiVideoDecoder'', | |
| ''VaapiIgnoreDriverChecks''). The driver-check path inspects | |
| the libva driver's reported profile set. fourier already saw | |
| ''vainfo'' enumerate H.264 profiles successfully with the | |
| probe patch; the multiplanar Step 1 should preserve that. | |
| //Mitigation:// after Step 1, re-run | |
| ''LIBVA_DRIVER_NAME=v4l2_request | |
| LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1 vainfo'' to confirm profile | |
| enumeration still passes. Then Brave's | |
| ''--enable-features=VaapiVideoDecoder,VaapiIgnoreDriverChecks'' | |
| invocation should engage. | |
| |
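The R6 mitigation (re-running ''vainfo'' against the ''v4l2_request'' driver and confirming that H.264 profiles still enumerate) could be scripted roughly as follows. This is a minimal sketch, not part of the campaign tooling: the helper names ''h264_decode_profiles'' and ''run_vainfo'' are hypothetical, and the parser assumes the usual ''VAProfile… : VAEntrypoint…'' line layout that ''vainfo'' prints.

```python
import os
import re
import subprocess

# Assumed vainfo capability line layout, e.g.
#   "      VAProfileH264Main               : VAEntrypointVLD"
PROFILE_LINE = re.compile(r"^\s*(VAProfile\w+)\s*:\s*(VAEntrypoint\w+)\s*$")


def h264_decode_profiles(vainfo_output: str) -> list[str]:
    """Return the H.264 profiles that expose the VLD (decode) entrypoint."""
    found = []
    for line in vainfo_output.splitlines():
        match = PROFILE_LINE.match(line)
        if not match:
            continue
        profile, entrypoint = match.groups()
        if profile.startswith("VAProfileH264") and entrypoint == "VAEntrypointVLD":
            found.append(profile)
    return found


def run_vainfo(video_path: str = "/dev/video1") -> str:
    """Run vainfo with the two environment variables named in R6 (hypothetical helper)."""
    env = dict(
        os.environ,
        LIBVA_DRIVER_NAME="v4l2_request",
        LIBVA_V4L2_REQUEST_VIDEO_PATH=video_path,
    )
    result = subprocess.run(
        ["vainfo"], env=env, capture_output=True, text=True, check=False
    )
    # vainfo splits its report across stdout and stderr; scan both.
    return result.stdout + result.stderr
```

On ohm, ''h264_decode_profiles(run_vainfo())'' returning an empty list would indicate that profile enumeration regressed after Step 1; a non-empty list only confirms enumeration, not that Brave's decoder actually engages.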
| ===== 7. Phase 5 hand-over ===== | ===== 7. Phase 5 hand-over ===== |
| |
| Per ''~/.claude/projects/-home-mfritsche-src/memory/feedback_dev_process.md'', | Per ''~/.claude/projects/-home-mfritsche-src/memory/feedback_dev_process.md'', Phase 5 is second-model review of all Phase 1-4 artefacts. Markus pastes the materials uncurated: |
| Phase 5 is second-model review of all Phase 1-4 artefacts. Markus | |
| pastes the materials uncurated: | |
| |
| * [[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] | * [[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] |
| * [[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] | * [[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] |
| * **This Phase 4 page** | * **This Phase 4 page** |
| * Companion CSVs: ''metrics.csv'', | * Companion CSVs: ''metrics.csv'', ''phase3/io_cache_2026-05-01/boundary_counts.csv'', ''phase3/io_cache_2026-05-01/perfstat.csv'' |
| ''phase3/io_cache_2026-05-01/boundary_counts.csv'', | |
| ''phase3/io_cache_2026-05-01/perfstat.csv'' | |
| |
| Specific questions for the second-model reviewer to challenge: | Specific questions for the second-model reviewer to challenge: |
| |
| - **Is fix surface A actually the right pick** given Phase 1 | - **Is fix surface A actually the right pick** given Phase 1 revised's use-case priority? In particular: does the reviewer see a path Phase 4 missed where Brave's chain could be lifted without rewriting libva-v4l2-request multiplanar? |
| revised's use-case priority? In particular: does the reviewer | - **Is Step 0's research scope sufficient** to commit to or rule out Step 2 with confidence, or does Step 0 itself need a Phase 4-internal sub-plan? |
| see a path Phase 4 missed where Brave's chain could be lifted | - **Risk R1 (slip) and R2 (Step 2 needed) — is the mitigation realistic** given a single-engineer-with-Claude-assistance capacity? |
| without rewriting libva-v4l2-request multiplanar? | - **Test corpus from fourier README L319-340 — is it adequate** for declaring Step 1 complete, or should we extend it? |
| - **Is Step 0's research scope sufficient** to commit to or rule | |
| out Step 2 with confidence, or does Step 0 itself need a | |
| Phase 4-internal sub-plan? | |
| - **Risk R1 (slip) and R2 (Step 2 needed) — is the mitigation | |
| realistic** given a single-engineer-with-Claude-assistance | |
| capacity? | |
| - **Test corpus from fourier README L319-340 — is it adequate** | |
| for declaring Step 1 complete, or should we extend it? | |
| |
| ===== 8. Phase 6 (implementation) and Phase 7 (verification) order ===== | ===== 8. Phase 6 (implementation) and Phase 7 (verification) order ===== |
| |
| Phase 6 = "execute Step 0 → Step 1 → conditionally Step 2". | Phase 6 = "execute Step 0 → Step 1 → Step 2" (Step 2 is no longer conditional after the Step 0 finding; see R2 in §6). Phase 7 = "Step 3" above. ''metrics.csv'' rows ''phase7_verify_brave_*'' will hold the binding numbers. |
| Phase 7 = "Step 3" above. ''metrics.csv'' rows | |
| ''phase7_verify_brave_*'' will hold the binding numbers. | |
| |
| Phase 6 is //long// (weeks-to-months in elapsed wall time, not | Phase 6 is //long// (weeks-to-months in elapsed wall time, not full-time). Sub-step boundaries inside Phase 6 are Phase-4-internal; no need to re-enter Phase 4 unless a step-level surprise demands re-planning (e.g. Step 0 turns up something that invalidates Step 1's direction). |
| full-time). Sub-step boundaries inside Phase 6 are Phase-4-internal; | |
| no need to re-enter Phase 4 unless a step-level surprise demands | |
| re-planning (e.g. Step 0 turns up something that invalidates Step 1's | |
| direction). | |
| |
| The three loopback edges | The three loopback edges ([[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] §5): |
| ([[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] §5): | |
| |
| * C1 ✓ + C2 ✗ + C3 ✓ → flag, investigate. Surfaces a measurement | * C1 ✓ + C2 ✗ + C3 ✓ → flag, investigate. Surfaces a measurement classification issue. |
| classification issue. | * C1 ✓ + C2 ✓ + C3 ✗ → Level-1 fixed, Level-2 missing. **This is the expected post-Step-1 state, since Step 0 confirmed that Step 2 is needed (see R2 in §6).** Re-enter Phase 4 with Step 2 spec'd. |
| * C1 ✓ + C2 ✓ + C3 ✗ → Level-1 fixed, Level-2 missing. **This is | * C1 ✗ at Phase 7 → drops still happen. Re-enter Phase 4 with new perf evidence. |
| the expected post-Step-1 state if Step 0 said Step 2 is | |
| needed.** Re-enter Phase 4 with Step 2 spec'd. | |
| * C1 ✗ at Phase 7 → drops still happen. Re-enter Phase 4 with | |
| new perf evidence. | |
| |
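The three loopback edges above amount to a small decision table over the Phase 7 pass/fail results for C1 to C3. A minimal sketch of that mapping, assuming boolean inputs: the ''Outcome'' names are hypothetical, and treating "all three pass" as "no loopback" is an assumption, since the edges as listed do not mention C4 or the C2 ✗ + C3 ✗ combination.

```python
from enum import Enum


class Outcome(Enum):
    NO_LOOPBACK = "C1-C3 met; none of the listed edges fires (assumption)"
    FLAG_MEASUREMENT = "flag, investigate measurement classification"
    REENTER_STEP2 = "Level-1 fixed, Level-2 missing: re-enter Phase 4 with Step 2 spec'd"
    REENTER_PERF = "drops still happen: re-enter Phase 4 with new perf evidence"
    UNLISTED = "combination not covered by the three edges"


def classify(c1: bool, c2: bool, c3: bool) -> Outcome:
    """Map Phase 7 results for C1 (drops), C2 (LLC-load-misses), C3 (DRM_IOCTL/sec)."""
    if not c1:
        # Third edge: C1 failing dominates, whatever C2 and C3 say.
        return Outcome.REENTER_PERF
    if c2 and c3:
        return Outcome.NO_LOOPBACK
    if c2 and not c3:
        # Second edge: the expected state after Step 1 alone.
        return Outcome.REENTER_STEP2
    if c3:
        # First edge: C2 fails while C1 and C3 pass.
        return Outcome.FLAG_MEASUREMENT
    return Outcome.UNLISTED
```

Checking C1 first mirrors the third edge, which is stated without reference to C2 or C3; the ''UNLISTED'' branch keeps the sketch from silently inventing a rule for a case the plan does not cover.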
| ===== 9. Deferred / out of scope ===== | ===== 9. Deferred / out of scope ===== |
| |
| * **Other libva consumers** (mpv-via-vaapi, VLC-via-vaapi) — same | * **Other libva consumers** (mpv-via-vaapi, VLC-via-vaapi) — same Step 1 lifts them indirectly. Verification is Brave-only; gains on other libva consumers are documented at Phase 7 but not required for closure. |
| Step 1 lifts them indirectly. Verification is Brave-only; gains | * **libavcodec hwaccel consumers** (mpv ''gpu-next'', ffplay, VLC ''qt'') — fix surface B from prior Phase 4 enumeration. Separate campaign. |
| on other libva consumers are documented at Phase 7 but not | * **Vulkan-anchored consumers** (libplacebo Vulkan backend on Mali-G52). Fix surface C2 (''panvk-1.2-fakeshim''). Separate campaign. |
| required for closure. | * **HEVC, VP8, VP9, AV1.** RK3566 hantro has H.264 + MPEG2 + VP8 HW only. AV1 / VP9 / HEVC are SW even after Step 1. Out of scope for this campaign's verification. |
| * **libavcodec hwaccel consumers** (mpv ''gpu-next'', ffplay, | * **The S5 zero-drop regression** ([[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] §6 + §8). Side investigation if pursued. |
| VLC ''qt'') — fix surface B from prior Phase 4 enumeration. | * **Other Mali-Bifrost-v7 hardware** (G31 / G51 / G76 — same panvk arch, different SBC stacks). Out of scope; Phase 1's "Mali-G52" framing is hardware-specific. |
| Separate campaign. | * **General-purpose Vulkan workloads.** Phase 1 revised §6 explicit out-of-scope. SW-emulated mandatory-1.2 entry points in any future panvk-fakeshim are tolerated. |
| * **Vulkan-anchored consumers** (libplacebo Vulkan backend on | |
| Mali-G52). Fix surface C2 (''panvk-1.2-fakeshim''). Separate | |
| campaign. | |
| * **HEVC, VP8, VP9, AV1.** RK3566 hantro has H.264 + MPEG2 + VP8 | |
| HW only. AV1 / VP9 / HEVC are SW even after Step 1. Out of scope | |
| for this campaign's verification. | |
| * **The S5 zero-drop regression** | |
| ([[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] §6 + | |
| §8). Side investigation if pursued. | |
| * **Other Mali-Bifrost-v7 hardware** (G31 / G51 / G76 — same | |
| panvk arch, different SBC stacks). Out of scope; Phase 1's | |
| "Mali-G52" framing is hardware-specific. | |
| * **General-purpose Vulkan workloads.** Phase 1 revised §6 | |
| explicit out-of-scope. SW-emulated mandatory-1.2 entry points | |
| in any future panvk-fakeshim are tolerated. | |
| |
| ===== 10. References ===== | ===== 10. References ===== |
| |
| * [[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] — | * [[ohm_gl_fix:phase1_revised_2026-05-01|Phase 1 revised]] — measurable success criteria. |
| measurable success criteria. | * [[ohm_gl_fix:phase2_2026-04-30|Phase 2 (substrate)]] — versions, V4L2 9-fd buffer pool, panvk gates, panfrost modifier surface. |
| * [[ohm_gl_fix:phase2_2026-04-30|Phase 2 (substrate)]] — | * [[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] — six-contender empirical bucket-attribution + boundary characterisation; the basis for §1's "Brave is libva, not libavcodec" pivot. |
| versions, V4L2 9-fd buffer pool, panvk gates, panfrost modifier | * [[ohm_gl_fix:phase4_2026-04-30|Original Phase 4]] — superseded by this page; preserved for audit trail. |
| surface. | * fourier ''README'' L236-281 — prior libva-v4l2-request investigation and partial multiplanar probe patches that form Step 1's starting point. |
| * [[ohm_gl_fix:phase3_revised_2026-05-01|Phase 3 revised]] — | * Bootlin libva-v4l2-request: [[https://github.com/bootlin/libva-v4l2-request]] |
| six-contender empirical bucket-attribution + boundary | * Local artefact: ''~/fourier-test/libva-patches/fourier-local.patch'' (HEVC-stripped, missing-include fixed, format-enumeration extended for MPLANE). |
| characterisation; the basis for §1's "Brave is libva, not | * marfrit-packages parallel: ''ffmpeg-v4l2-request-git/'' is the template for the new ''libva-v4l2-request-ohm-gl-fix/'' package layout. |
| libavcodec" pivot. | |
| * [[ohm_gl_fix:phase4_2026-04-30|Original Phase 4]] — superseded | |
| by this page; preserved for audit trail. | |
| * fourier ''README'' L236-281 — prior libva-v4l2-request | |
| investigation and partial multiplanar probe patches that form | |
| Step 1's starting point. | |
| * Bootlin libva-v4l2-request: | |
| [[https://github.com/bootlin/libva-v4l2-request]] | |
| * Local artefact: ''~/fourier-test/libva-patches/fourier-local.patch'' | |
| (HEVC-stripped, missing-include fixed, format-enumeration | |
| extended for MPLANE). | |
| * marfrit-packages parallel: ''ffmpeg-v4l2-request-git/'' is the | |
| template for the new ''libva-v4l2-request-ohm-gl-fix/'' package | |
| layout. | |
| |
| ---- | ---- |
| |
| //Phase 4 ends here. Phase 6 (implementation) begins with Step 0, | //Phase 4 ends here. Phase 6 (implementation) begins with Step 0, which produces a small attached decision document on this page. The first ''pacman -U'' on ohm marks Phase 6's first deliverable. Phase 7 is the metrics.csv ''phase7_verify_*'' row(s).// |
| which produces a small attached decision document on this page. The | |
| first ''pacman -U'' on ohm marks Phase 6's first deliverable. | |
| Phase 7 is the metrics.csv ''phase7_verify_*'' row(s).// | |
| |