CIRD — Can It Run Doom

“Four Cortex-M0 cores. One SoC. One very old FPS. How hard could it be.”

Status: design draft, parked (2026-04-22).

Umbrella: Coulomb (RK3588 stack) — adjacent, not a prerequisite.

The question

RK3588 has (at least) four in-SoC Cortex-M0 cores:

  • PMU0_MCU — always-on, ~8 KB SRAM, PMU0-local peripherals only
  • PMU1_MCU — always-on, ~64 KB SRAM, PMU peripherals + bridge into main bus
  • DDR_MCU — nannies DDR PHY/CTRL, ~32 KB SRAM, narrow outside view
  • BUS_MCU — ~32 KB SRAM, full AXI interconnect view, general-purpose offload

Rules of the game:

  1. The AP does display: a VOP2 overlay plane DMA'd from a DDR carveout.
  2. The AP relays input via mailbox.
  3. Everything else — game tick, BSP traversal, software rasterization — happens on the M0.
  4. “Cheating” is allowed: DDR may hold code and state, as long as the M0 does the significant work.

Candidate scoring

Core     | SRAM  | DDR access           | Mailbox to AP | Background duties    | Verdict
PMU0_MCU | 8 KB  | none                 | indirect      | sleep-state          | pause button only
PMU1_MCU | 64 KB | narrow bridge        |               | PMIC / thermal / S2R | runner-up: jitter risk
DDR_MCU  | 32 KB | direct (but owns it) | no            | DDR training + DFS   | disqualified: stutters on DFS
BUS_MCU  | 32 KB | full AXI             |               | none                 | winner

Architecture (straw draft)

  • BUS_MCU runs Doom: game tick, renderer, all logic.
  • DDR carveout: code (~1 MB), WAD (~4 MB shareware), two framebuffers (320×200 at 1 byte per pixel, palette-indexed: 64,000 bytes, roughly 64 KB, each).
  • AP sets up VOP2 overlay once to scan out from ddr_fb[idx]. Flip idx on mailbox doorbell. One MMIO per frame.
  • Input: AP forwards keyboard/gamepad events via a second mailbox channel (ring buffer in shared SRAM).
  • PMU0_MCU (optional cheek): watchdogs BUS_MCU. If it stops kicking, display “you died” and reset. Completely unnecessary and therefore mandatory.
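
The carveout layout and the per-frame flip above can be sketched in a few lines of C. This is a host-runnable sketch under assumptions, not firmware: the 64 KiB slot padding, every identifier, and the ring_doorbell() stub (standing in for the single mailbox MMIO write) are made up for illustration and come from neither the TRM nor any existing driver.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes are the rough figures from the list above; the slot padding and all
 * names are illustrative assumptions, not values from the TRM. */
#define FB_BYTES      (320u * 200u)   /* one byte (palette index) per pixel */
#define FB_SLOT_BYTES 0x10000u        /* assumed: pad each buffer to 64 KiB */
#define CODE_BYTES    (1u << 20)      /* ~1 MB of code */
#define WAD_BYTES     (4u << 20)      /* ~4 MB shareware WAD */

/* Offsets from the carveout base. */
#define OFF_CODE       0u
#define OFF_WAD        (OFF_CODE + CODE_BYTES)
#define OFF_FB0        (OFF_WAD + WAD_BYTES)
#define OFF_FB1        (OFF_FB0 + FB_SLOT_BYTES)
#define CARVEOUT_BYTES (OFF_FB1 + FB_SLOT_BYTES)

_Static_assert(FB_BYTES <= FB_SLOT_BYTES, "framebuffer must fit its slot");

/* Hypothetical stand-in for the mailbox doorbell. On hardware this would be
 * one MMIO write; here it only records the index so the sketch runs on a host. */
static uint32_t doorbell_last = UINT32_MAX;
static void ring_doorbell(uint32_t idx) { doorbell_last = idx; }

struct flipper {
    uint8_t *base;   /* carveout base as seen by the M0 */
    uint32_t back;   /* buffer the renderer is currently drawing into: 0 or 1 */
};

/* Address of framebuffer idx inside the carveout. */
static uint8_t *fb_ptr(const struct flipper *f, uint32_t idx)
{
    assert(idx < 2u);
    return f->base + (idx ? OFF_FB1 : OFF_FB0);
}

/* Hand the finished back buffer to the AP and return the next draw target. */
static uint8_t *flip(struct flipper *f)
{
    ring_doorbell(f->back);   /* the AP should now scan out this buffer */
    f->back ^= 1u;            /* the other buffer becomes the draw target */
    return fb_ptr(f, f->back);
}
```

On real hardware the doorbell would have to come after whatever cache maintenance the framebuffer mapping needs, and the AP side would repoint the overlay at the signalled buffer.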

Ramp-up — what to verify before writing code

  1. Reachability of BUS_MCU's SRAM and reset-vector latch from the AP. Check whether mainline Linux carries a Rockchip remoteproc driver and, if it does, whether BUS_MCU is one of the supported instances. The TRM chapter on “MCU Subsystem” is the source of truth.
  2. Mailbox channels not already claimed by ATF / BL31 / PSCI. Pick one bidirectional pair.
  3. DDR carveout reservation. memory-region in the DT with no-map, handed to the M0 via a known base address.
  4. Cache coherence. BUS_MCU is almost certainly non-coherent to the AP L3. Either use a non-cacheable mapping on the AP side for the framebuffer, or explicit clean/invalidate around every flip.
  5. VOP2 overlay setup. One plane, 8-bit indexed color, scan-out from our carveout. Drop into KMS as an overlay plane; let the kernel composite (or take the CRTC outright).
  6. Doom port. Chocolate Doom or the older id release. Strip SDL. Replace the video backend with “write to framebuffer + signal mailbox”. Replace the input backend with “read from mailbox ring”. No sound (or mailbox-to-AP-PCM later).
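
The “read from mailbox ring” input backend in step 6 could look roughly like the single-producer/single-consumer ring below. It is a sketch under assumptions: the event encoding, the slot count, and all names are hypothetical, and it runs on a host with none of the barriers or cache maintenance a real shared-SRAM ring would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 64u   /* assumed size; must be a power of two */

_Static_assert((RING_SLOTS & (RING_SLOTS - 1u)) == 0u, "power of two");

/* Hypothetical event encoding: a key code and a pressed/released flag. */
struct input_event {
    uint8_t key;
    uint8_t pressed;
};

/* head is only written by the producer (AP), tail only by the consumer (M0).
 * Both are free-running counters; unsigned wraparound keeps head - tail valid. */
struct input_ring {
    volatile uint32_t head;
    volatile uint32_t tail;
    struct input_event slot[RING_SLOTS];
};

/* AP side: queue one event. Returns false if the ring is full. */
static bool ring_push(struct input_ring *r, struct input_event ev)
{
    uint32_t head = r->head;
    if (head - r->tail == RING_SLOTS)
        return false;
    r->slot[head % RING_SLOTS] = ev;
    /* Real hardware: barrier (and cache clean) before publishing head. */
    r->head = head + 1u;
    return true;
}

/* M0 side: fetch one event. Returns false if the ring is empty. */
static bool ring_pop(struct input_ring *r, struct input_event *out)
{
    uint32_t tail = r->tail;
    if (tail == r->head)
        return false;
    *out = r->slot[tail % RING_SLOTS];
    r->tail = tail + 1u;
    return true;
}
```

The game tick would drain the ring with ring_pop() once per tick; because each index has exactly one writer, no lock is needed between the AP and the M0.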

Chicken-and-egg notes

  • The AP bringing up BUS_MCU is fine — the reverse would require PMU1/DDR_MCU to load it, which is silly.
  • No deep-sleep support. When the AP sleeps, BUS_MCU loses power → game over (literal).
  • DVFS on the AP is fine; it doesn't touch BUS_MCU. DFS on DDR, however, stalls every master reading from DDR, BUS_MCU included, for the duration of retraining. Doom will stutter whenever the DDR frequency changes. Pin DDR to a single OPP while playing.

Cheek options (for later)

  • PMU1_MCU variant — Doom that survives an AP kernel panic. echo c > /proc/sysrq-trigger mid-frag, keep playing. Novelty only.
  • Multiplayer M0 — BUS_MCU and PMU1_MCU as two networked players, mailbox as network. Splitscreen via two VOP2 overlay planes.
  • Render on DDR_MCU during idle training windows — do not attempt.

Open questions

  • Is BUS_MCU clocked high enough (a few hundred MHz) to sustain playable Doom? For a Cortex-M0 at 200 MHz software-rasterizing 320×200, rough math says “probably single-digit FPS, acceptable for the bit, not for actual play”. Need a cycle estimate before committing.
  • Is the DDR latency from BUS_MCU's master port comparable to AP's, or is it routed through a throttled path? TRM will have a block diagram; actual numbers need measurement.
  • Does mainline Linux already expose BUS_MCU as a remoteproc node, or is a DT patch required?
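
As a starting point for the cycle estimate asked for above, the budget side of the arithmetic is easy to pin down. The sketch below uses the 200 MHz figure from the first question; the function names are hypothetical, and the per-pixel cost is an input you supply, not a measurement.

```c
#include <stdint.h>

#define FB_PIXELS (320u * 200u)   /* 64,000 pixels per frame */

/* Cycles available per output pixel at a given clock and target frame rate.
 * fps must be non-zero. */
static uint32_t cycles_per_pixel(uint32_t clock_hz, uint32_t fps)
{
    return clock_hz / (FB_PIXELS * fps);
}

/* Highest whole frame rate if every pixel costs cost_cycles all-in
 * (game tick, BSP traversal and DDR stalls amortised over the frame).
 * cost_cycles must be non-zero. */
static uint32_t max_fps(uint32_t clock_hz, uint32_t cost_cycles)
{
    return clock_hz / (FB_PIXELS * cost_cycles);
}
```

At 200 MHz this gives a budget of 89 cycles per pixel for Doom's native 35 Hz and 312 cycles per pixel for 10 FPS, so “single-digit FPS” corresponds to an all-in cost above roughly 312 cycles per pixel. Whether the M0 lands above or below that line depends on DDR latency and on how much of the renderer leans on division, which the Cortex-M0 has no hardware instruction for.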

Status / next step

Parked. Prerequisites before even starting: finish MegabitChip (DDR blob RE) and get at least one clean boot on ampere with our own TPL. Then this becomes “write an M0 firmware and a small kernel driver”, which is a weekend.

Linked from start page.
