Analysis of the closed-source Rockchip RK3588 DDR initialization binary blobs, decompiled with Ghidra on oppenheimer (Proxmox CT131 on data).
rk3588_ddr_lp4_2112MHz_lp5_2400MHz_v1.19.bin (76,704 bytes)

The “fast” (2112/2400 MHz) and “conservative” (1848/2112 MHz) blobs have identical code. Only 6 bytes of timing data and 8 bytes of version string differ:

| Offset | Fast (2112/2400) | Conservative (1848/2112) | Purpose |
|---|---|---|---|
| 0x11B8C | 0x0840 | 0x0738 | LP4 frequency parameter |
| 0x11BC0 | 0x0840 | 0x0738 | LP4 frequency (channel 2) |
| 0x11BF4 | 0x6960 | 0x6840 | LP5 frequency parameter |
Because the code is otherwise identical, custom DDR frequencies can in principle be set by patching just these three 16-bit fields.
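As a rough illustration of such a patch, the following Python sketch rewrites the three fields. It assumes the fields are 16-bit little-endian, that the low 12 bits hold the frequency in MHz (0x840 = 2112, 0x738 = 1848, and 0x6960 & 0xFFF = 0x960 = 2400), and that the upper nibble is a flag that must be preserved. None of these assumptions is confirmed by the analysis. `decode_mhz` and `patch_frequencies` are hypothetical names, and the sketch ignores any checksum or other timing values that might need to change with frequency.

```python
import struct

# Offsets taken from the diff table; the size is that of the v1.19 fast blob.
LP4_OFFSETS = (0x11B8C, 0x11BC0)
LP5_OFFSET = 0x11BF4
EXPECTED_SIZE = 76_704


def decode_mhz(field: int) -> int:
    """Return the low 12 bits of a 16-bit field (assumed to be the MHz value)."""
    return field & 0x0FFF


def patch_frequencies(blob: bytes, lp4_mhz: int, lp5_mhz: int) -> bytes:
    """Return a copy of blob with the three frequency fields rewritten.

    Assumption: little-endian 16-bit fields whose upper nibble is preserved.
    """
    if len(blob) != EXPECTED_SIZE:
        raise ValueError(f"unexpected blob size: {len(blob)} bytes")
    for mhz in (lp4_mhz, lp5_mhz):
        if not 0 < mhz <= 0x0FFF:
            raise ValueError(f"frequency {mhz} does not fit in 12 bits")

    out = bytearray(blob)
    targets = [(off, lp4_mhz) for off in LP4_OFFSETS] + [(LP5_OFFSET, lp5_mhz)]
    for offset, mhz in targets:
        (old,) = struct.unpack_from("<H", out, offset)
        struct.pack_into("<H", out, offset, (old & 0xF000) | mhz)
    return bytes(out)
```

Under these assumptions, `patch_frequencies(fast_blob, 1848, 2112)` should reproduce the conservative values from the table (0x0738, 0x0738, 0x6840). That makes a useful sanity check before trying untested frequencies.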
Default frequencies by blob version:

| Version | LP4 Freq | LP5 Freq | LP5 Data Rate | Status |
|---|---|---|---|---|
| v1.09 - v1.15 | 2112 MHz | 2736 MHz | 5472 MT/s | Dropped after v1.15 |
| v1.16 - v1.19 | 2112 MHz | 2400 MHz | 4800 MT/s | Current default |
| v1.19 (conservative) | 1848 MHz | 2112 MHz | 4224 MT/s | Safe/stable |

LP5 clock options and per-channel bandwidth:

| LP5 Clock | Data Rate | BW/channel | Source | Stability |
|---|---|---|---|---|
| 2112 MHz | 4224 MT/s | 8.4 GB/s | Official conservative | Rock solid |
| 2400 MHz | 4800 MT/s | 9.6 GB/s | Official default | Stable |
| 2736 MHz | 5472 MT/s | 10.9 GB/s | Old official (v1.15) | Dropped by Rockchip, works on good modules |
| 3200 MHz | 6400 MT/s | 12.8 GB/s | Community (rkddr tool) | Requires SK Hynix rated modules |
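
The bandwidth column can be reproduced with a short calculation. This sketch assumes the table's convention that the data rate is twice the listed clock, and a 16-bit channel, which is inferred from the table (4224 MT/s giving 8.4 GB/s) rather than stated in it. The function name is hypothetical.

```python
def channel_bandwidth_gbs(clock_mhz: float, bus_width_bits: int = 16) -> float:
    """Peak bandwidth of one channel in GB/s (decimal units)."""
    data_rate_mts = 2 * clock_mhz                 # two transfers per clock
    bytes_per_transfer = bus_width_bits / 8
    return data_rate_mts * bytes_per_transfer / 1000  # MB/s -> GB/s


for clock in (2112, 2400, 2736, 3200):
    print(f"{clock} MHz: {2 * clock} MT/s, {channel_bandwidth_gbs(clock):.1f} GB/s")
```

These are theoretical peaks. Sustained bandwidth will be lower because of refresh, bank conflicts, and read/write turnaround.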

JEDEC LPDDR5 speed grades for comparison:

| Speed Grade | Data Rate | Clock | Notes |
|---|---|---|---|
| LPDDR5-3200 | 3200 MT/s | 1600 MHz | Minimum spec |
| LPDDR5-4267 | 4267 MT/s | 2133 MHz | ≈ conservative blob |
| LPDDR5-4800 | 4800 MT/s | 2400 MHz | = default blob |
| LPDDR5-5500 | 5500 MT/s | 2750 MHz | ≈ 2736 blob, TRM “optimized” |
| LPDDR5-6400 | 6400 MT/s | 3200 MHz | Max JEDEC, community OC |
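
To decide whether a module is rated for a given blob, it helps to find the lowest speed grade that covers the blob's data rate. The sketch below does this using only the grades listed in the table. The function name is hypothetical.

```python
from typing import Optional

# Speed grades (MT/s) from the table above, in ascending order.
JEDEC_GRADES_MTS = (3200, 4267, 4800, 5500, 6400)


def lowest_sufficient_grade(data_rate_mts: int) -> Optional[int]:
    """Return the slowest listed grade that is >= data_rate_mts, or None."""
    for grade in JEDEC_GRADES_MTS:
        if grade >= data_rate_mts:
            return grade
    return None


for rate in (4224, 4800, 5472, 6400):
    print(f"{rate} MT/s needs at least LPDDR5-{lowest_sufficient_grade(rate)}")
```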
The blob accesses 79 unique hardware registers across 9 blocks:
| Address Range | Block | Registers | Purpose |
|---|---|---|---|
| 0xFD588xxx | PMU1_GRF | 1 | DDR training status |
| 0xFD598xxx | DDR_GRF_CH2 | 1 | Channel 2 config |
| 0xFD5F4/8xxx | BUS_GRF | 27 | DDR bus interconnect, AXI routing, QoS |
| 0xFD8C8xxx | SCRU | 4 | DDR PLL (DPLL) clock gate/reset/config |
| 0xFE010xxx | DDRC_CH0 | 4 | Synopsys UMCTL2 controller |
| 0xFE030xxx | FIREWALL_DDR | 1 | Memory access control |
| 0xFE050xxx | SGRF | 9 | Security - DDR region permissions |
| 0xFECC0xxx | Unknown | 4 | Possibly DDR scramble/ECC |
| 0xFF000xxx | SRAM | 1 | Boot mailbox |
Base addresses verified against RK3588 TRM Part 2 and Linux kernel DT sources.
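
When reading the decompiled output, it is handy to map a raw MMIO address, or a Ghidra data symbol such as `_DAT_fe030040`, back to its block. The sketch below uses the 4 KiB page prefixes from the table. It reads "0xFD5F4/8xxx" as the two pages 0xFD5F4xxx and 0xFD5F8xxx, which is an assumption, and both function names are hypothetical.

```python
import re
from typing import Optional

# Page prefix (address >> 12) -> block name, from the register table.
BLOCKS = {
    0xFD588: "PMU1_GRF",
    0xFD598: "DDR_GRF_CH2",
    0xFD5F4: "BUS_GRF",
    0xFD5F8: "BUS_GRF",
    0xFD8C8: "SCRU",
    0xFE010: "DDRC_CH0",
    0xFE030: "FIREWALL_DDR",
    0xFE050: "SGRF",
    0xFECC0: "Unknown",
    0xFF000: "SRAM",
}


def classify(addr: int) -> Optional[str]:
    """Return the block containing addr, or None if it is not in the table."""
    return BLOCKS.get(addr >> 12)


def address_from_ghidra(symbol: str) -> int:
    """Parse a Ghidra data symbol such as 'DAT_fe030040' or '_DAT_fe030040'."""
    match = re.fullmatch(r"_?DAT_([0-9a-fA-F]+)", symbol)
    if match is None:
        raise ValueError(f"not a Ghidra data symbol: {symbol!r}")
    return int(match.group(1), 16)


print(classify(0xFE0500E0))
print(classify(address_from_ghidra("_DAT_fe030040")))
```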
FUN_000000e4 polls SGRF status (0xFE0500E0) in a tight loop. If SGRF doesn't respond, the system hangs permanently during boot.

_DAT_fe030040 |= 0xffff opens all DDR firewall masters during init and never re-restricts them.

rockchip-rk3588-dmc-oc-3500mhz enables frequency steps up to 3200 MHz for devfreq.

Check your DRAM module:
cat /sys/bus/platform/drivers/rockchip-dmc/dmc/devfreq/dmc/available_frequencies
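
The same file can be read from a script. This sketch assumes the usual devfreq format: a single line of space-separated frequencies in Hz. The helper name is hypothetical, and the sysfs path is the one from the command above.

```python
from pathlib import Path

SYSFS_PATH = Path(
    "/sys/bus/platform/drivers/rockchip-dmc/dmc/devfreq/dmc/available_frequencies"
)


def parse_available_frequencies(text: str) -> list[int]:
    """Convert space-separated Hz values into a sorted list of MHz integers."""
    return sorted(int(token) // 1_000_000 for token in text.split())


# Only touches sysfs when run on a board that exposes the file.
if __name__ == "__main__" and SYSFS_PATH.exists():
    print(parse_available_frequencies(SYSFS_PATH.read_text()))
```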
Recommendation: try the old v1.15 blob (2736 MHz) first. If stable, use rkddr for 3200 MHz with stress testing (stressapptest).
Impact for LLM inference: going from 2400 to 3200 MHz gives ~33% more memory bandwidth, and memory-bound workloads should see a roughly proportional tok/s improvement.
All analysis files are on boltzmann:~/src/rk3588-ddr-decompiled/:

- ddr_decompiled.c: decompiled C (fast blob, 118 functions, 11,923 lines)
- ddr_conservative_decompiled.c: decompiled C (conservative blob)
- ddr_diff.txt: diff between fast and conservative
- ddr_fast_asm.s / ddr_conservative_asm.s: full disassembly (17,308 lines each)
- rk3588_ddr.h: register definitions header (TRM-verified)
- rk3588_regs_annotated.h: all 79 MMIO registers with block annotations
- DDR_FREQUENCY_TABLE.md: complete frequency table
- ANALYSIS.md: full analysis report

oppenheimer (CT131 on data): /opt/work/ghidra_project/

Generated by Claude Code, 2026-04-03

DDR training is the calibration process where the memory controller and PHY find the optimal timing window to reliably communicate with DRAM chips. At 2400-3200 MHz (4800-6400 MT/s), signal integrity is the primary challenge.
Electrical signals on PCB traces experience:
Training compensates by finding the “eye” — the timing/voltage window where data is reliably captured — for each signal individually.
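
The idea of finding the eye can be sketched as a sweep over delay taps. Each tap is tested with a known pattern, and the sample point is placed at the center of the widest passing window. This is a generic illustration of the technique, not the algorithm the PHY firmware actually runs. The function name and the pass/fail list are hypothetical.

```python
from typing import Optional, Sequence


def find_eye(passes: Sequence[bool]) -> Optional[tuple[int, int]]:
    """Return (center_tap, width) of the longest run of passing taps.

    passes[i] is True if the test pattern was captured correctly at delay
    tap i. Returns None if no tap passed.
    """
    best_start, best_len = 0, 0
    run_start: Optional[int] = None
    # A trailing False acts as a sentinel that closes a run ending at the last tap.
    for i, ok in enumerate(list(passes) + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            run_len = i - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    if best_len == 0:
        return None
    return best_start + (best_len - 1) // 2, best_len


sweep = [False, False, True, True, True, True, True, False, True, False]
print(find_eye(sweep))
```

Centering in the widest window leaves the most margin on both sides, which matters because the window drifts with temperature and voltage after training.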
The RK3588 uses a Synopsys DWC (DesignWare Core) LPDDR5/4X multiPHY. The training sequence in the blob:
- ZQ calibration: CalBusy (PHY offset 0x684, 11 uses in code).
- DFI ready / gate training: DfiStatus (offset 0xA24, 65 uses, the most-used register).
- Training pattern: 0xAA55AA55 / 0x55AA55AA (written to PHY offsets 0x93C-0x970).

Training results depend on temperature (shifts of ~1-2 ps/°C), voltage, DRAM internal state, and component aging. They are stored in SRAM (0x001FE000) and passed to the kernel via PMU GRF for DVFS.
The most serious bug class: do {} while loops polling hardware registers indefinitely. If hardware doesn't respond, the system hangs permanently during boot.
| Register | Address / PHY Offset | Polls | Waits For |
|---|---|---|---|
| SGRF_DDR_STATUS | 0xFE0500E0 | 1 | Security GRF ready |
| SGRF_DDR_CON21 | 0xFE050054 | 2 | SGRF config done |
| DfiStatus | +0xA24 | 4 | DFI interface ready |
| MicroContMuxSel | +0x10090 | 4 | PHY firmware mailbox |
| MicroReset | +0x10080 | 2 | PHY firmware reset |
| UctWriteProtShadow | +0x10514 | 5 | Training status |
| CalBusy | +0x684 | 1 | ZQ calibration |
Impact: boot hangs on cold boot, at extreme temperatures, or when the power supply is marginal during training.
Fix: Add timeout counters. The code already has error return paths (23 instances of return 0xFFFFFFFF) — the polls just don't use them.
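
The control flow of such a fix can be modeled in a few lines. This is a Python model of the logic only. The real change would have to be made in the blob's own code, and the mask, expected value, and poll budget here are hypothetical because the analysis does not give the bit definitions.

```python
from typing import Callable

ERR_TIMEOUT = 0xFFFFFFFF  # same error value the blob already returns elsewhere


def poll_until(
    read_reg: Callable[[], int],
    mask: int,
    expected: int,
    max_polls: int = 100_000,
) -> int:
    """Poll until (read_reg() & mask) == expected.

    Returns 0 on success, or ERR_TIMEOUT once the poll budget is exhausted,
    instead of spinning forever.
    """
    for _ in range(max_polls):
        if (read_reg() & mask) == expected:
            return 0
    return ERR_TIMEOUT


# Simulated register that becomes ready on the third read.
reads = iter([0, 0, 1])
print(poll_until(lambda: next(reads), mask=0x1, expected=0x1, max_polls=10))
```

A caller that receives `ERR_TIMEOUT` can then take the existing error path, for example to retry at a lower frequency or report the failure, rather than hanging.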
ddr_open_firewall() grants all bus masters DDR access (FW_DDR |= 0xFFFF). The matching close may not be called on all error paths.
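
One way to guarantee the matching close is to tie the restore to scope exit, so that every error path runs it. The sketch below models this with a dictionary standing in for MMIO. It assumes FW_DDR is the register at 0xFE030040 seen in the decompiled code, and the context manager name is hypothetical.

```python
from contextlib import contextmanager
from typing import Iterator

FW_DDR_ADDR = 0xFE030040  # assumed to be the FW_DDR register
ALL_MASTERS = 0xFFFF


@contextmanager
def ddr_firewall_open(regs: dict[int, int]) -> Iterator[None]:
    """Open the firewall for all masters, then restore the saved value on exit."""
    saved = regs[FW_DDR_ADDR]
    regs[FW_DDR_ADDR] = saved | ALL_MASTERS
    try:
        yield
    finally:
        # Runs on normal exit and on every error path.
        regs[FW_DDR_ADDR] = saved


regs = {FW_DDR_ADDR: 0x0001}
try:
    with ddr_firewall_open(regs):
        raise RuntimeError("training failed")
except RuntimeError:
    pass
print(hex(regs[FW_DDR_ADDR]))
```

In C, the equivalent is a single exit label that restores the register, with every error branch jumping to it.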
A training failure restarts the entire sequence from scratch. There is no selective retry (e.g., “only redo read gate training”), so each failure costs a full retrain (~100 ms).
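
A selective retry could look roughly like the sketch below, where each stage is retried on its own before the whole sequence is abandoned. Whether the PHY firmware allows a single stage to be rerun in isolation is not established by the analysis. The stage callables and the retry budget are hypothetical.

```python
from typing import Callable, Optional, Sequence

Stage = tuple[str, Callable[[], bool]]


def run_training(stages: Sequence[Stage], max_retries: int = 3) -> Optional[str]:
    """Run stages in order, retrying only the stage that failed.

    Returns None if every stage eventually passed, otherwise the name of
    the first stage that exhausted its retries.
    """
    for name, stage in stages:
        for _ in range(1 + max_retries):
            if stage():
                break  # stage passed, move on to the next one
        else:
            return name  # all attempts failed
    return None
```

With this structure, a transient failure in a late stage costs only that stage's time, because earlier stages are not repeated.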

Decompiled code metrics:

| Metric | Value |
|---|---|
| Total lines | 11,977 |
| Functions | 118 |
| Loops | ~341 |
| Branches | ~1,725 |
| MMIO registers | 79 |
| Error returns | 23 / 1,405 checks (1.6%) |
| PHY register uses | DfiStatus (0xA24): 65 uses (most frequent) |
The RK3588 uses a Synopsys DWC LPDDR5/4X multiPHY (DWC_LPDDR54_PHY). The training stages map to specific register offsets found in the decompiled code:
| PHY Offset | Synopsys Name | Stage | Uses in Code |
|---|---|---|---|
| +0x684 | CalBusy | ZQ Calibration | 11 |
| +0xA24 | DfiStatus | DFI ready / gate training | 65 |
| +0x600/608/60C | VrefDAC | VREF training | 67 |
| +0x10080 | MicroReset | PHY firmware control | 13 |
| +0x10090 | MicroContMuxSel | Firmware ↔ APB mux | many |
| +0x10180 | AcsmPlayback | CA training | 26 |
| +0x10514 | UctWriteProtShadow | Training complete status | 28 |
The 0xAA55AA55 training pattern, written to PHY offsets 0x93C-0x970, alternates bits to maximize switching noise and crosstalk, the worst case for signal integrity testing. Variations (0xAAAA5555, 0x55AA55AA) stress different inter-bit coupling scenarios on the PCB.
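
The stress these patterns create can be quantified with two simple counts: how many of the 32 bits flip between consecutive words, and how many neighboring bit pairs within a word carry opposite values. The sketch treats the word as a flat 32-bit bus, which is a simplification of the real lane mapping. The function names are hypothetical.

```python
def toggled_bits(a: int, b: int) -> int:
    """Number of bit positions that change when word a is followed by word b."""
    return bin((a ^ b) & 0xFFFFFFFF).count("1")


def opposite_neighbors(word: int) -> int:
    """Number of adjacent bit pairs (i, i+1) holding opposite values."""
    # XOR with the word shifted by one compares each bit to its neighbor;
    # the mask keeps only the 31 valid pairs.
    return bin((word ^ (word >> 1)) & 0x7FFFFFFF).count("1")


print(toggled_bits(0xAA55AA55, 0x55AA55AA))
for pattern in (0xAA55AA55, 0x55AA55AA, 0xAAAA5555):
    print(f"{pattern:#010x}: {opposite_neighbors(pattern)} of 31 neighbor pairs differ")
```

Alternating 0xAA55AA55 with 0x55AA55AA flips all 32 bits on every transfer, since the two words are bitwise complements.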
Full research with 40+ sources is in boltzmann:~/src/rk3588-ddr-decompiled/COMMUNITY_RESEARCH.md
Generated by Claude Code, 2026-04-03. Analysis performed on oppenheimer (Proxmox CT131).