Phase 2 — KWin source archaeology
This file is the synthesised source-read for the kwin_overlay_subsurface campaign. It opens with the Phase 1 leading question answer (per worklist), follows with the architectural diagram of the per-frame path, and ends with file-level findings ordered by the priority list in worklist.
Discipline guard: no patches are written before this file is committed. Re-scoping is documented honestly with the deferral target named.
Phase 1 — leading question answer
Question (from worklist):
On what condition does KWin promote a wp_linux_dmabuf_v1 surface to direct scanout versus falling back to GPU composite, and does the hantro NV12 DRM_FORMAT_MOD_LINEAR output satisfy those conditions on this DRM driver (rockchip-drm on RK3568, Mesa 26.0.5)?
Short answer — NO
Neither of KWin v6.6.4's two scanout-promotion paths can place the hantro NV12 LINEAR buffer on a DRM plane on this hardware in the windowed Brave case, for two distinct structural reasons. The Phase 4 design space narrows to the import-caching hypothesis only. This aligns with — does not contradict — the architect's prior from the A2 trajectory hint.
KWin's two scanout-promotion paths
KWin v6.6.4 has two distinct paths that could in principle promote a wp_linux_dmabuf_v1 surface to scanout. Both pass through the same per-layer feasibility check (OutputLayer::importScanoutBuffer) but differ in how the candidate is chosen.
Path A — single-plane direct scanout
Entry: Compositor::prepareDirectScanout (src/compositor.cpp:379, prepareDirectScanout(view, logicalOutput, backendOutput, frame)).
- view->scanoutCandidates(1) (compositor.cpp:385). On WorkspaceScene this calls WorkspaceScene::scanoutCandidates (src/scene/workspacescene.cpp:281), which walks m_containerItem->sortedChildItems() top-to-bottom via the recursive helper addCandidates (workspacescene.cpp:197). addCandidates produces up to maxCount + 1 = 2 candidate SurfaceItems. The walk requires every traversed item to have opacity == 1.0 and no effects (workspacescene.cpp:202-203).
- After the walk, the back element of the candidate list must be either absent OR a 1×1 single-pixel black-background buffer per checkForBlackBackground (workspacescene.cpp:263-279). Otherwise scanoutCandidates returns {} (workspacescene.cpp:306-308).
- If a candidate is returned, prepareDirectScanout requires it to be a SurfaceItemWayland with a valid surface, valid buffer, and dmabuf attributes (compositor.cpp:391-402). It then takes the format/modifier intersection: layer->supportedDrmFormats() for non-tearing or supportedAsyncDrmFormats() for tearing must contain attrs->format AND attrs->modifier (compositor.cpp:404-409).
- If the intersection passes, layer->importScanoutBuffer(buffer, frame) is invoked (compositor.cpp:416). For the DRM backend that resolves to EglGbmLayer::importScanoutBuffer (src/backends/drm/drm_egl_layer.cpp:81).
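The selection rule in the first two steps can be sketched as a small model. This is not KWin's code: Item and the flat top-to-bottom vector are hypothetical stand-ins for the recursive addCandidates walk, and dropping the black-background element from the returned list is an assumption about what happens when the check passes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a scene item; not a KWin type.
struct Item {
    double opacity = 1.0;
    bool hasEffects = false;
    bool isSinglePixelBlack = false; // models what checkForBlackBackground accepts
};

// Model of the Path A rule: gather up to maxCount + 1 items top-to-bottom,
// then require the extra (back) element to be a black background.
std::vector<const Item *> scanoutCandidates(const std::vector<Item> &topToBottom,
                                            std::size_t maxCount)
{
    std::vector<const Item *> ret;
    for (const Item &item : topToBottom) {
        if (ret.size() == maxCount + 1) {
            break; // enough gathered to decide
        }
        if (item.opacity != 1.0 || item.hasEffects) {
            return {}; // any translucent or effect-carrying item blocks scanout
        }
        ret.push_back(&item);
    }
    if (ret.size() == maxCount + 1) {
        if (!ret.back()->isSinglePixelBlack) {
            return {}; // a real second surface sits below the candidate
        }
        ret.pop_back(); // assumption: the background itself is not a candidate
    }
    return ret;
}
```

With maxCount = 1, a two-item stack of video subsurface over an ordinary parent surface yields an empty result in this model, while a video surface over a single-pixel black parent yields one candidate.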
Path B — multi-overlay plane scanout
Entry: Compositor::repaint per-layer loop (compositor.cpp:680+), through Scene::overlayCandidates and assignOverlays. Candidate filtering for an overlay is findOverlayCandidates (workspacescene.cpp:335), which accepts an item iff:
- it is a SurfaceItem,
- non-empty rect,
- frameTimeEstimation < 50 ms (≥ 20 fps source frame cadence),
- surfaceItem->buffer()->dmabufAttributes() is non-null,
- opacity == 1.0 (TODO comment on line 381 says item-opacity is not yet handled),
- not entirely covered by other opaque windows,
- if the region is occupied or rounded-corner-clipped, the item must be fully opaque to qualify as an underlay (workspacescene.cpp:386-389).
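The filter list above can be restated as a single predicate. This is a sketch under stated assumptions, not the findOverlayCandidates implementation: OverlayInputs and OverlayVerdict are hypothetical names, each field stands for one bullet, and treating an unoccupied, unclipped region as a plain overlay is an interpretation of the last bullet.

```cpp
#include <chrono>

// Hypothetical flattened view of the properties the filter inspects.
struct OverlayInputs {
    bool isSurfaceItem = true;
    bool rectIsEmpty = false;
    std::chrono::milliseconds frameTimeEstimation{33}; // roughly 30 fps
    bool hasDmabufAttributes = true;
    double opacity = 1.0;
    bool entirelyCoveredByOpaqueWindows = false;
    bool regionOccupiedOrCornerClipped = false;
    bool fullyOpaque = true;
};

enum class OverlayVerdict { Rejected, Overlay, Underlay };

OverlayVerdict classifyOverlayCandidate(const OverlayInputs &in)
{
    using namespace std::chrono_literals;
    if (!in.isSurfaceItem || in.rectIsEmpty) {
        return OverlayVerdict::Rejected;
    }
    if (in.frameTimeEstimation >= 50ms) {
        return OverlayVerdict::Rejected; // source updates too slowly
    }
    if (!in.hasDmabufAttributes || in.opacity != 1.0) {
        return OverlayVerdict::Rejected;
    }
    if (in.entirelyCoveredByOpaqueWindows) {
        return OverlayVerdict::Rejected;
    }
    if (in.regionOccupiedOrCornerClipped) {
        // Only a fully opaque item can sit beneath the rest of the scene.
        return in.fullyOpaque ? OverlayVerdict::Underlay : OverlayVerdict::Rejected;
    }
    return OverlayVerdict::Overlay;
}
```

The defaults model the 30 fps NV12 video subsurface, which passes every filter here; the rejection on this hardware happens later, at the plane's format table.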
Per-layer feasibility is then the same EglGbmLayer::importScanoutBuffer gate, with the layer being a non-Primary OutputLayerType.
EglGbmLayer::importScanoutBuffer — the conjunct list
(src/backends/drm/drm_egl_layer.cpp:81-127, top-to-bottom)
- Env var KWIN_DRM_NO_DIRECT_SCANOUT unset.
- Layer is Primary OR drmOutput()->shouldDisableNonPrimaryPlanes() is false. The latter is only true in PresentationMode::Async or AdaptiveAsync (drm_output.cpp:112-117), i.e. tearing modes, so this conjunct is inactive for Brave's default 30 fps playback.
- gpu()->needsModeset() is false (no pending modeset).
- drmOutput()->needsShadowBuffer() is false (no display-side shadow buffer required, e.g. for HDR/colour conversion).
- gpu() == gpu()->platform()->primaryGpu() (no cross-GPU scanout).
- Color pipeline is identity OR colorPowerTradeoff != PreferAccuracy.
- sourceRect() == sourceRect().toRect(): the source rect must be integer-aligned. Sub-pixel cropping → reject. Comment cites the kernel doc note that “devices that don't support subpixel plane coordinates can ignore the fractional part.”
- If offloadTransform() is non-identity, the plane must support that transform via m_plane->supportsTransformation.
- gpu()->importBuffer(buffer, FileDescriptor{}) returns non-null (gbm import succeeds for this dmabuf format/modifier/stride).
The doc comment on OutputLayer::importScanoutBuffer (src/core/outputlayer.h:101-106) notes that even when this returns true, “a presentation request on the output must however be used afterwards to find out if it's actually successful” — i.e. the final filter is the kernel's DRM atomic-test.
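Because the gate is a top-to-bottom conjunction, it is often more useful to know which conjunct fails first than to know only that the result is false. The following C++17 sketch models that; ScanoutInputs, RectF and firstFailingConjunct are hypothetical names, the fields are stand-ins for the KWin calls in the list above, and the kernel atomic test that follows a successful result is not modelled.

```cpp
#include <cmath>
#include <optional>
#include <string_view>

struct RectF { double x = 0, y = 0, width = 0, height = 0; };

// Hypothetical snapshot of the state each conjunct reads.
struct ScanoutInputs {
    bool noDirectScanoutEnvSet = false;
    bool layerIsPrimary = false;
    bool disableNonPrimaryPlanes = false;
    bool needsModeset = false;
    bool needsShadowBuffer = false;
    bool onPrimaryGpu = true;
    bool colorPipelineIsIdentity = true;
    bool prefersAccuracy = false;
    RectF sourceRect{0, 0, 1280, 720};
    bool transformIsIdentity = true;
    bool planeSupportsTransform = false;
    bool importSucceeds = true;
};

static bool isWhole(double v) { return std::floor(v) == v; }

// Returns a description of the first failing conjunct, or nullopt if all pass.
std::optional<std::string_view> firstFailingConjunct(const ScanoutInputs &in)
{
    if (in.noDirectScanoutEnvSet) return "KWIN_DRM_NO_DIRECT_SCANOUT is set";
    if (!in.layerIsPrimary && in.disableNonPrimaryPlanes) return "non-primary planes disabled";
    if (in.needsModeset) return "modeset pending";
    if (in.needsShadowBuffer) return "shadow buffer required";
    if (!in.onPrimaryGpu) return "cross-GPU scanout";
    if (!in.colorPipelineIsIdentity && in.prefersAccuracy) return "color pipeline not identity";
    const RectF &r = in.sourceRect;
    if (!isWhole(r.x) || !isWhole(r.y) || !isWhole(r.width) || !isWhole(r.height)) {
        return "source rect is not integer-aligned";
    }
    if (!in.transformIsIdentity && !in.planeSupportsTransform) return "transform unsupported by plane";
    if (!in.importSucceeds) return "buffer import failed";
    return std::nullopt;
}
```

Logging the returned string once per rejected frame would be one way to confirm on-device which conjunct is load-bearing for a given client.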
Where supportedDrmFormats() comes from
(src/backends/drm/drm_plane.cpp:84-142)
DrmPlane::updateProperties() reads the kernel's IN_FORMATS blob via drmModeFormatModifierBlobIterNext. Each (fmt, mod) pair the kernel advertises is added to m_supportedFormats. EglGbmLayer returns this dictionary verbatim from supportedDrmFormats().
So whatever the kernel's rockchip-drm driver advertises in IN_FORMATS for a given DRM plane is what KWin treats as scanout-eligible for that layer. There is no further KWin-side filter on top.
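For readers unfamiliar with the IN_FORMATS blob, the sketch below decodes one by hand into a format-to-modifier-set table. KWin itself goes through libdrm's iterator as noted above; here the two structs are local mirrors of the kernel UAPI layouts (drm_format_modifier_blob and drm_format_modifier in drm_mode.h) as I understand them, and FormatTable and parseInFormats are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>
#include <map>
#include <set>
#include <vector>

// Local mirrors of the UAPI layouts, redeclared so the sketch is self-contained.
struct FormatModifierBlobHeader {
    uint32_t version, flags;
    uint32_t count_formats, formats_offset;
    uint32_t count_modifiers, modifiers_offset;
};
struct FormatModifierEntry {
    uint64_t formats;  // bitmask over format indices [offset, offset + 64)
    uint32_t offset;
    uint32_t pad;
    uint64_t modifier;
};

using FormatTable = std::map<uint32_t, std::set<uint64_t>>;

FormatTable parseInFormats(const std::vector<uint8_t> &blob)
{
    FormatTable table;
    FormatModifierBlobHeader hdr;
    if (blob.size() < sizeof(hdr)) {
        return table;
    }
    std::memcpy(&hdr, blob.data(), sizeof(hdr));
    const uint64_t formatsEnd = uint64_t(hdr.formats_offset)
        + uint64_t(hdr.count_formats) * sizeof(uint32_t);
    const uint64_t modifiersEnd = uint64_t(hdr.modifiers_offset)
        + uint64_t(hdr.count_modifiers) * sizeof(FormatModifierEntry);
    if (formatsEnd > blob.size() || modifiersEnd > blob.size()) {
        return table; // malformed blob, report nothing
    }
    for (uint32_t m = 0; m < hdr.count_modifiers; ++m) {
        FormatModifierEntry entry;
        std::memcpy(&entry, blob.data() + hdr.modifiers_offset + m * sizeof(entry), sizeof(entry));
        for (uint32_t bit = 0; bit < 64; ++bit) {
            if (!(entry.formats & (uint64_t(1) << bit))) {
                continue;
            }
            const uint64_t index = uint64_t(entry.offset) + bit;
            if (index >= hdr.count_formats) {
                continue;
            }
            uint32_t format;
            std::memcpy(&format, blob.data() + hdr.formats_offset + index * sizeof(format), sizeof(format));
            table[format].insert(entry.modifier); // this (format, modifier) pair is advertised
        }
    }
    return table;
}
```

A format that appears in the blob's format array but is not covered by any modifier entry never enters the table, which matches the pair-wise reading described above.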
Hardware: rockchip-drm plane format/modifier table on ohm
Raw evidence: ohm_drm_info_2026-05-02.json (inlined), ohm_modetest_planes_2026-05-02.txt (inlined).
DRM driver: rockchip-drm (RockChip Soc DRM, 1.0.0). Active connector: DSI-1 (the PineTab2's internal panel), 800×1280 mode preferred. Two CRTCs visible (51 inactive, 52 active, fb=60).
Three planes (full set on the SoC):
| Plane ID | DRM type | possible_crtcs | KWin OutputLayerType | NV12 LINEAR? | Notes |
|---|---|---|---|---|---|
| 33 | Primary | 0x01 (CRTC 51 only — inactive) | Primary | No (RGB-only LINEAR) | This CRTC has no display attached |
| 39 | Primary | 0x02 (CRTC 52 only — active) | Primary | YES (LINEAR(0x0)) | Currently driving fb=60 (the GL framebuffer) |
| 45 | Overlay | 0x03 (either CRTC) | GenericLayer | No | XR30/XB30/XR/XB/AR/AB 24/RG/BG 24/16, YU08/YU10/YUYV/Y210, all in AFBC modifiers (ARM_BLOCK_SIZE=16×16 family). No NV12 in any modifier. |
CRTC index mapping is positional: CRTC ID 51 = index 0 (bit 0), CRTC ID 52 = index 1 (bit 1). Plane 39 is restricted to CRTC 52; Plane 45 can drive either CRTC. KWin's planeToLayerType (drm_layer.cpp:34-49) maps DRM Primary→OutputLayerType::Primary and DRM Overlay→OutputLayerType::GenericLayer directly.
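The positional mapping can be made concrete with a few lines. This is an illustrative helper, not part of KWin or libdrm: crtcsForPlane is a hypothetical name, and it assumes the CRTC IDs are supplied in the order the kernel reports them in the mode resources.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode a plane's possible_crtcs bitmask: bit i refers to the CRTC at
// position i in the resource list, not to the CRTC object ID itself.
std::vector<uint32_t> crtcsForPlane(uint32_t possibleCrtcs,
                                    const std::vector<uint32_t> &crtcIdsInResourceOrder)
{
    std::vector<uint32_t> reachable;
    for (std::size_t i = 0; i < crtcIdsInResourceOrder.size() && i < 32; ++i) {
        if (possibleCrtcs & (uint32_t(1) << i)) {
            reachable.push_back(crtcIdsInResourceOrder[i]);
        }
    }
    return reachable;
}
```

With the resource order {51, 52} from the table above, 0x01 decodes to CRTC 51, 0x02 to CRTC 52, and 0x03 to both.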
So on the active CRTC 52, the OutputLayer set KWin sees is:
- 1 × OutputLayerType::Primary from Plane 39: supports NV12 LINEAR.
- 1 × OutputLayerType::GenericLayer from Plane 45: does not support NV12 in any modifier.
Why the answer is NO — the failing conjunct, named
For Brave's windowed parent + wp_subsurface case:
Path A is rejected at the scene-walk stage
addCandidates (workspacescene.cpp:197-261) walks the Brave window top-to-bottom. The walk would produce two candidates: the wp_subsurface (video) — added first because it has higher z than its parent — and the parent surface (chrome UI). With maxCount=1, WorkspaceScene::scanoutCandidates calls addCandidates with maxCount + 1 = 2, so two candidates are gathered before the inner size check rejects.
After the walk, workspacescene.cpp:306-308 checks ret.size() == maxCount + 1 && !checkForBlackBackground(ret.back()). The back of the list is the parent surface (Brave UI), which is not a 1×1 single-pixel black buffer. checkForBlackBackground therefore returns false, the condition holds, and the function returns {}. Path A returns empty for windowed Brave by construction.
The “black background” idiom is from 8473b90a20 (Xaver Hugl, 2025-09-03, “compositor: move the 'black background' check to workspacescene”) which moved the check from compositor.cpp into the scene. The check exists for fullscreen-on-black-window patterns (some games / video players render a 1×1 black parent window with their actual content as a child surface, to bypass compositor work) — Brave does not use that pattern.
Path B is rejected at the format/modifier intersection
The wp_subsurface (video) clears every findOverlayCandidates filter at 30 fps with NV12 LINEAR dmabufs. The candidate makes it to prepareDirectScanout for a non-Primary OutputLayer. On CRTC 52, the only non-Primary OutputLayer is Plane 45 (GenericLayer). Plane 45 advertises no NV12 modifier in its IN_FORMATS blob.
Therefore compositor.cpp:404-409:
    const auto formats = ... layer->supportedDrmFormats();
    if (auto it = formats.find(attrs->format); it == formats.end() || !it->contains(attrs->modifier)) {
        layer->setScanoutCandidate(candidate);
        candidate->setScanoutHint(layer->scanoutDevice(), formats);
        return false;
    }
returns false: formats.find(DRM_FORMAT_NV12) == formats.end() for Plane 45 → reject. Path B is rejected at the format/modifier intersection. No further conjunct in EglGbmLayer::importScanoutBuffer is even evaluated.
The Primary plane (39) does support NV12 LINEAR, but it is in use as the GL framebuffer surface (OutputLayerType::Primary is the single-framebuffer canonical role in KWin). KWin v6.6.4 does not have logic to swap plane roles dynamically (move the GL framebuffer to Plane 45 in AFBC, free Plane 39 for video). That would be a substantial KWin design change.
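The per-plane outcome can be reproduced with a small stand-alone model of the format/modifier lookup. This is a sketch, not KWin's types: PlaneFormats and planeAccepts are hypothetical names, and any table passed in is expected to be filled from the plane dump rather than invented. The NV12 fourcc and the LINEAR modifier value (0) follow drm_fourcc.h.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Same packing as the kernel's fourcc_code macro: first character in the low byte.
constexpr uint32_t fourcc(char a, char b, char c, char d)
{
    return uint32_t(static_cast<unsigned char>(a))
        | (uint32_t(static_cast<unsigned char>(b)) << 8)
        | (uint32_t(static_cast<unsigned char>(c)) << 16)
        | (uint32_t(static_cast<unsigned char>(d)) << 24);
}

constexpr uint32_t kFormatNV12 = fourcc('N', 'V', '1', '2');
constexpr uint64_t kModifierLinear = 0; // DRM_FORMAT_MOD_LINEAR
static_assert(kFormatNV12 == 0x3231564Eu, "NV12 fourcc packing");

// Hypothetical model of a plane's table: format -> set of accepted modifiers.
using PlaneFormats = std::map<uint32_t, std::set<uint64_t>>;

// True only if the plane advertises this exact (format, modifier) pair.
bool planeAccepts(const PlaneFormats &formats, uint32_t format, uint64_t modifier)
{
    const auto it = formats.find(format);
    return it != formats.end() && it->second.count(modifier) != 0;
}
```

A table with an NV12 entry containing modifier 0 (the Plane 39 situation) accepts the hantro buffer; a table with no NV12 key at all (the Plane 45 situation) rejects it at the find step, before the modifier is ever compared.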
Implications for Phase 4 design space
Per worklist Phase 1 contract — “yes/no plus a paragraph naming the specific conjunct(s) that pass or fail”:
- Architect's hypothesis (a): cache the dmabuf-to-GL-texture import. Remains the primary candidate. Aligns with the A2 trajectory data (drops in three bursts during ~30 s warmup, then steady 0/sec). Phase 2 source-read prioritises: src/wayland/linuxdmabufv1clientbuffer.cpp, src/scene/surfaceitem_wayland.cpp, src/scene/itemrenderer_opengl.cpp.
- Architect's hypothesis (b): promote single-color-plane subsurface video to direct scanout via wp_drm_lease_v1. STRUCTURALLY UNREACHABLE on this hardware/driver combo. Two reasons:
  - wp_drm_lease_v1 is the wrong protocol for this case: it leases an entire connector/output to a client (typical consumer: VR HMDs). It is not the mechanism for putting a subsurface on its own DRM plane within a managed Plasma session. The protocol-correct mechanism would be KWin's existing multi-overlay path (Path B above), which fails at the format/modifier intersection on rockchip-drm.
  - Even if KWin gained dynamic plane-role swapping, the rockchip-drm overlay plane (Plane 45) does not advertise NV12 in any modifier; that is a kernel-side gap, out of this campaign's scope per README.
Bug-report shape — narrowed
Per worklist: “Either answer also informs the bug-report shape … Different messages, different audiences.”
- The “missed scanout-promotion” framing has two possible audiences, neither well-suited to this campaign:
  - KWin maintainers: would require a design-discussion patch (dynamic plane-role swap). Out of scope.
  - rockchip-drm maintainers: kernel patch to expose NV12 on the overlay plane (if the VOP2 hardware actually supports it on the overlay window; this needs separate VOP2 archaeology). Out of scope per README.
- The “your subsurface composite is slow” framing (Phase 4 hypothesis a — import-caching) has one audience: KWin maintainers, with a measurement-grounded patch description. This is the Phase 5 bug-report shape this campaign should pursue.
Caveats and Phase 1-step-3 deferral
- This answer rests on the assumption that “windowed Brave with chrome UI visible” is the in-scope case (per Phase 1 lock). Fullscreen Brave (F11) would change Path A's outcome — the parent surface might fill the viewport with no second candidate, in which case Path A could potentially succeed if Plane 39 is available. Not measured in Phase 0 / not in scope.
- Phase 1 step 3 from worklist (“does KWin require the subsurface to be the only damageable region of a given plane”) is partially answered by the conjunct list above (rounded-corner clipping is in findOverlayCandidates, opacity == 1.0 is required, not entirely-covered is required). The deeper question, whether Brave's parent renders content behind the video subsurface region, is deferred to Phase 2 source-read per the proposal accepted on 2026-05-02. It does not change Phase 1's answer because Path B is already disqualified at the format intersection upstream of any geometric considerations.
- The integer-source-rect requirement (drm_egl_layer.cpp:117) is noted but not load-bearing for Phase 1's answer. It would be a load-bearing conjunct if Path B reached importScanoutBuffer, which it does not on this hardware. Banked for Phase 2.
Architectural map — to be written during Phase 2 file-level read
(Stub. Phase 2 reading list per worklist: src/wayland/linuxdmabufv1clientbuffer.cpp, src/scene/surfaceitem_wayland.cpp, src/scene/itemrenderer_opengl.cpp, scanout entry points already read for Phase 1, src/backends/drm/.)
Architectural inputs already established:
- KWin 6.6.4 negotiates wp_linux_drm_syncobj_v1 explicit sync with Chromium-class clients. Brave's dmabuf attaches go through Transaction::watchSyncObj (src/wayland/transaction.cpp:244-249), not Transaction::watchDmaBuf. (Source: kwin-fourier MR body in ~/src/marfrit-packages/upstream-submissions/kwin-fourier/kde-mr-body.md, zero-EXPORT_SYNC_FILE strace finding over 60 s playback.)
- Fourier patches 0001 and 0002 only touch watchDmaBuf, so they are irrelevant to Phase 1's scanout-promotion path.
File-level findings — Phase 2 reading list
(To be filled in during Phase 2 source-read.)
- [ ] src/wayland/linuxdmabufv1clientbuffer.cpp: dmabuf import, fd → GL texture lifetime, cache hit/miss surface.
- [ ] src/scene/surfaceitem_wayland.cpp: per-subsurface state, slow paths on first sight or fd cycle.
- [ ] src/scene/itemrenderer_opengl.cpp: parent+subsurface render path special cases.
- [x] src/scene/composite.cpp + scene scheduling: promotion predicate. Done in Phase 1.
- [x] src/backends/drm/: DRM atomic plane-probe, format/modifier acceptance per output. Done in Phase 1 to the depth needed for the leading question.
