Table of Contents

Claude-Assisted Development Process — 8-Phase Loop

Markus's standardized loop for our implementation work. Apply by default whenever a task is bigger than a one-liner. Skipping phases is a deliberate choice that should be flagged, not a default.

Mirror of claude-his-agent memory entry feedback_dev_process.md. This page is the canonical artifact for Phase 5 second-model handovers — paste measurements/plan in here, redact, then hand the URL to the reviewer.

Phase 1 — Goal Formulation

Define the objective in measurable terms. State what success looks like before touching anything. The chosen metric is a hypothesis about what to measure, not an axiom — Phase 3 may invalidate it.

Phase 2 — Situation Analysis

Document current state. Identify constraints, dependencies, known failure modes. Reset context here — do not carry assumptions from prior sessions; re-read CLAUDE.md, relevant memory files, run git status, re-verify reachability.

Phase 3 — Baseline Measurements

Take concrete measurements before any changes. Paste raw output into DokuWiki at capture time — verbatim, not paraphrased. The Phase 5 artifact is the raw data, not Claude's summary.

Real data, not theatre. Phase 3 exists to use AI capacity for absorbing wide, low-level instrumentation a human reader would skim past. Attaching strace / perf / ftrace / eBPF / custom tripwires to the process under test is real Phase 3; scraping mpv's stdout dropped-frame counter is not. Discriminator: if a human with bash and grep could produce the same baseline, it isn't Phase 3 yet — go down to the syscall / call-path / MMIO / register layer.

Anti-fabrication:

Measurements describe what the system does, not what it should do. Baseline data is evidence, not a specification. Do NOT derive API call sequences, struct layouts, or parameter values from observed behaviour (strace, perf, example output). Observable behaviour may reflect bugs, workarounds, or implementation accidents — anything you copy from it inherits those.

Phase 4 — Plan

Formulate the approach. Identify what will and will not be touched. State expected outcome of implementation in the same measurable terms used in Phase 1/3.

Phase 5 — Second Model Review

Goal, situation, measurements, plan get pasted into DokuWiki (this namespace, claude: subpages). Markus reviews and redacts, then initiates the handover to a fresh model instance. Claude does not curate the artifact going to the reviewer — that would re-introduce the blind-spot accumulation the review is meant to escape. Do not summarize when handing over; paste the actual artifacts.

Phase 6 — Implementation

Execute the plan. Scope strictly to what was planned — resist feature creep, refactor-creep, “while I'm here” cleanups, and over-eager scope expansion. If a plan revision is needed mid-implementation, surface it explicitly and re-enter Phase 4.

Contract before code. Before writing or modifying any call site:

Copying from baseline measurements is not implementation. It is transcription of potentially broken behaviour. A deliverable that matches baseline bytes but violates the API contract is not a deliverable — it is a deferred bug.

What "state the contract explicitly" looks like

Worked example: 0012-h264-omit-scaling-matrix-frame-based.patch in ~/src/ohm_gl_fix/phase6/step1/. The commit message opens with the contract before any code:

VAAPI signals "explicit scaling lists are present in the bitstream"
implicitly: the consumer (ffmpeg-vaapi, mpv, etc.) sends a
VAIQMatrixBufferH264 alongside RenderPicture iff
sps_scaling_matrix_present_flag || pps_scaling_matrix_present_flag.
When the bitstream uses default (flat) scaling, no IQMatrixBuffer
arrives [...]

Earlier draft of this patch unconditionally omitted SCALING_MATRIX in
FRAME_BASED. That's corpus-correct (bbb has no explicit scaling
lists) but the wrong predicate: the kernel-side gating is by
"matrix-supplied vs. not," not by decode mode. Streams that signal
explicit scaling lists must submit SCALING_MATRIX in either mode.

Contract verification (audit_0008_decode_params_2026-05-01.md +
hantro_h264.c::assemble_scaling_list): the kernel uses the supplied
matrix when SCALING_MATRIX is in the control batch and falls back
to spec-defined defaults when absent. Mode-independent.

What this gets right:

Mirror format anywhere reviewable: PR description, commit message body, plan section, or a header comment block. The shape is contract clauses with citations → code that maps 1:1 to those clauses.

Phase 7 — Verification Measurements

Repeat measurements from Phase 3. Compare explicitly against baseline.

Phase 8 — Memory Update

Loop terminates here. Distill the lesson into a memory entry — what was the mistake the loop caught, what's the rule that would shorten the next cycle. Do not let the lesson rot in chat history.

Loopback edges (summary)

Why this exists

Several recurring failures in prior work codify into individual rules — observer-first, simulate-before-flash, three-strikes-then-verify, “trust eyes not vibes,” scope-strictly-to-plan, no-fake-dry-run. Those are all symptoms; this loop is the structural fix. Use it as the spine and let those rules show up as rejection patterns inside the appropriate phases.