Visual golden widget pattern (headless compositor + screencopy)

Canonical recipe for shipping byte-exact visual regression goldens for any single widget in any Koder UI surface (GTK4/Adwaita today; Flutter / web / Android extensions follow the same shape). Established 2026-05-24 in `engines/sdk/koder_kit_gtk` across three iterations (KKGTK-002 R1+R2+R3a, registries #647-#653). Companion to `specs/develop/visual-regression-tdds.kmd § R1 Category C` (the normative test category) — this pattern is the **how**.

Specification body

Pattern — Visual golden widget

When to use

A new widget (or new variant of an existing one) ships in a Koder SDK or product, and the team wants byte-exact visual regression detection for it. Three things must already be in place:

  • Compositor: koder-x with WLR_BACKENDS=headless support (Pilot 1 R1 of RFC-005, shipped 2026-05-24).
  • Capture client: grim (wlr-screencopy-v1) reachable from the test host.
  • Container chain: for Adw widgets specifically, an AdwApplicationWindow → AdwPreferencesPage → AdwPreferencesGroup parent chain — discovered as load-bearing in KKGTK-002 R2 (registry #650).

If any of those is missing, the host is not yet wired for this pattern. See policies/test-host-isolation.kmd for the canonical test host (s.khost1.dev-linux-klinux LXC as of 2026-05).

The four artifacts

Per widget, ship four files. Names follow the slug convention <widget_kind> (e.g. adw_switch_row, adw_password_entry_row, adw_action_row):

1. The repro binary — tests/repro_<widget>.c

A tiny GTK application (~50 lines) that constructs the canonical container chain, parameterizes the widget under a single env var (KKGTK_<WIDGET>_STATE or similar), presents the window, and quits after 2 seconds via g_timeout_add_seconds.

Template:

#include <adwaita.h>
#include <gtk/gtk.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

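/* Timeout callback: quit the application, then remove the timeout source.
 * g_application_quit() returns void, so it must not be cast to GSourceFunc. */
static gboolean quit_cb(gpointer app) {
    g_application_quit(G_APPLICATION(app));
    return G_SOURCE_REMOVE;
}

static void on_activate(GtkApplication *app, gpointer ud) {
    (void)ud;
    const char *state = getenv("KKGTK_<WIDGET>_STATE");  /* NULL when unset */
    /* parse state to widget-specific knobs */

    GtkWidget *win = adw_application_window_new(app);
    gtk_window_set_default_size(GTK_WINDOW(win), 600, 200);

    /* Canonical container chain: window -> preferences page -> group. */
    AdwPreferencesPage *page = ADW_PREFERENCES_PAGE(adw_preferences_page_new());
    AdwPreferencesGroup *group = ADW_PREFERENCES_GROUP(adw_preferences_group_new());

    /* construct the widget under test, set state, add to group */

    adw_preferences_page_add(page, group);
    adw_application_window_set_content(ADW_APPLICATION_WINDOW(win), GTK_WIDGET(page));
    gtk_window_present(GTK_WINDOW(win));

    /* Leave the window up for 2 seconds so the capture client can grab it. */
    g_timeout_add_seconds(2, quit_cb, app);
}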

int main(int argc, char **argv) {
    AdwApplication *app = adw_application_new(
        "dev.koder.kkgtk.<widget>", G_APPLICATION_DEFAULT_FLAGS);
    g_signal_connect(app, "activate", G_CALLBACK(on_activate), NULL);
    int rc = g_application_run(G_APPLICATION(app), argc, argv);
    g_object_unref(app);
    return rc;
}

2. The goldens — tests/goldens/<widget>_<state>.png

One PNG per distinguishable state, captured once under the test host, then committed. Capture procedure:

  1. Spawn koder-x with WLR_BACKENDS=headless + KODER_X_HEADLESS_TEST_OUTPUTS=1 in a sandboxed XDG_RUNTIME_DIR.
  2. Wait for wayland-N socket.
  3. Run the repro binary with the desired state env var, pointing WAYLAND_DISPLAY + GDK_BACKEND=wayland at the spawned compositor.
  4. After ~1 second (window stabilization), invoke grim -o HEADLESS-1 <out.png>.
  5. Repeat for each state.
  6. Verify that the md5s differ across states before committing. Identical md5s across distinct states mean the capture is not reflecting the state (see KKGTK-002 R2: wrong container, wrong widget choice, etc.).
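
Steps 1 and 2 above could be scripted roughly as follows. This is a sketch under stated assumptions, not the koder_kit_gtk implementation: it assumes the compositor binary is on PATH as koder-x and needs no arguments, and the function names wait_for_wayland_socket and start_headless_compositor are hypothetical. Only the two environment variables come from the procedure itself.

```shell
# Poll DIR for a wayland-N socket; print its name, or fail after TIMEOUT seconds.
wait_for_wayland_socket() {
    local dir="$1" timeout="${2:-10}" waited=0 path
    while [ "$waited" -lt "$timeout" ]; do
        for path in "$dir"/wayland-*; do
            # -S is true only for sockets, so wayland-0.lock is skipped.
            if [ -S "$path" ]; then
                basename "$path"
                return 0
            fi
        done
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Start the compositor in a private runtime dir and record its socket name.
# Assumption: the binary is invoked as plain `koder-x`.
start_headless_compositor() {
    SANDBOX=$(mktemp -d) || return 1
    chmod 700 "$SANDBOX"
    XDG_RUNTIME_DIR="$SANDBOX" \
    WLR_BACKENDS=headless \
    KODER_X_HEADLESS_TEST_OUTPUTS=1 \
        koder-x >"$SANDBOX/compositor.log" 2>&1 &
    COMPOSITOR_PID=$!
    SOCK=$(wait_for_wayland_socket "$SANDBOX" 10) || {
        kill "$COMPOSITOR_PID" 2>/dev/null
        return 1
    }
}
```

After start_headless_compositor succeeds, SANDBOX and SOCK hold the values that the repro binary and grim need as XDG_RUNTIME_DIR and WAYLAND_DISPLAY.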

3. The check script — tests/headless/golden_check_<widget>.sh

Wraps the capture procedure into a re-runnable assertion. Inputs: no args → check all states; --update → accept current capture as the new golden. Failures save the diverging PNG to tests/goldens/_failures/<widget>_<state>_<ts>.png for investigation.

Template (per-state loop):

check_state() {
    local label="$1"
    local state_value="$2"
    local golden="$GOLDEN_DIR/<widget>_${label}.png"
    local current="$SANDBOX/current-${label}.png"
    local g_md5 c_md5

    # Launch the repro against the sandboxed compositor.
    KKGTK_<WIDGET>_STATE="$state_value" \
    XDG_RUNTIME_DIR="$SANDBOX" \
    WAYLAND_DISPLAY="$SOCK" \
    GDK_BACKEND=wayland \
        "$BUILD/repro_<widget>" >/dev/null 2>&1 &
    APP_PID=$!
    sleep 1  # window stabilization
    grim -o HEADLESS-1 "$current" 2>/dev/null
    wait "$APP_PID" 2>/dev/null || true

    g_md5=$(md5sum "$golden" | cut -d' ' -f1)
    c_md5=$(md5sum "$current" 2>/dev/null | cut -d' ' -f1)
    # An empty c_md5 means the capture failed; never let it match a missing golden.
    [ -n "$c_md5" ] && [ "$g_md5" = "$c_md5" ] || handle_mismatch
}

check_state state_a "value_a"
check_state state_b "value_b"
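
The template leaves handle_mismatch undefined. One possible shape is sketched below, assuming the caller passes the label, the golden path, and the current capture path as explicit arguments (the template's own call does not show any), and assuming an UPDATE=1 variable stands in for the script having parsed --update. Both are illustrative choices, not the shipped script.

```shell
# Hypothetical handle_mismatch: LABEL, GOLDEN path, CURRENT capture path.
# UPDATE=1 is an assumed stand-in for the --update flag.
handle_mismatch() {
    local label="$1" golden="$2" current="$3"
    local failures ts kept
    if [ "${UPDATE:-0}" = "1" ]; then
        # Accept the current capture as the new golden.
        cp "$current" "$golden"
        echo "updated: $golden"
        return 0
    fi
    # Keep the diverging capture next to the goldens for investigation.
    failures="$(dirname "$golden")/_failures"
    ts=$(date +%Y%m%d-%H%M%S)
    kept="$failures/$(basename "$golden" .png)_${ts}.png"
    mkdir -p "$failures"
    cp "$current" "$kept"
    echo "MISMATCH $label: capture kept at $kept" >&2
    return 1
}
```

Because the golden is named <widget>_<state>.png, the saved file lands at tests/goldens/_failures/<widget>_<state>_<ts>.png, matching the layout described above.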

4. Aggregate runner — auto-picked

tests/headless/run_all_goldens.sh, which already exists in koder_kit_gtk, globs every golden_check_*.sh in the same directory. No per-widget edit is needed; a new check is picked up automatically the next time the aggregate runs.
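
For readers without access to the repository, the globbing behaviour can be pictured with a sketch like the one below. It is not the actual run_all_goldens.sh; the function name run_all_goldens and the PASS/FAIL output format are assumptions, and the checks are invoked through sh here (swap in bash if a check relies on bash-only features).

```shell
# Run every golden_check_*.sh in DIR; return non-zero if any check failed.
run_all_goldens() {
    local dir="$1" failed=0 script
    for script in "$dir"/golden_check_*.sh; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$script" ] || continue
        if sh "$script"; then
            echo "PASS $(basename "$script")"
        else
            echo "FAIL $(basename "$script")"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

Calling run_all_goldens tests/headless would then exercise each widget's check in turn and fail the run if any single golden diverged.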

Anti-patterns

These were discovered the hard way during KKGTK-002 progression (registries #647 → #648 → #649 → #650). Avoid:

A1 — Single golden for state-changing widget

A widget with toggleable state (AdwSwitchRow on/off, password entry empty/filled) needs at least two goldens (one per state). Shipping just one and --update-ing it loses regression coverage for the un-captured state.

A2 — Identical md5 across "different" states

If two states produce the same md5, the capture is not catching what you think it is. Stop. Investigate before committing. Likely causes (in priority order):

  1. Wrong parent container — Adw widgets need the AdwPreferencesGroup chain; without it the paint vfunc bails (the gtk_list_box_row_grab_focus: assertion 'box != NULL' failed log line is the giveaway).
  2. Wrong widget choice — some widgets visually don't change for the state you're varying. Pick a state that actually paints differently.
  3. Capture timing — window not fully presented when grim ran. Bump the sleep or wait for a specific frame.
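
A guard against this anti-pattern can run before any golden is committed. The sketch below uses a hypothetical helper name, assert_distinct_md5, and simply fails when any two of the files it is given hash to the same md5.

```shell
# Fail if any two of the given files share an md5 (or if a file is missing).
assert_distinct_md5() {
    local f dupes
    for f in "$@"; do
        [ -f "$f" ] || { echo "missing file: $f" >&2; return 2; }
    done
    # md5sum prints "<hash>  <path>"; keep the hash and list repeated ones.
    dupes=$(md5sum "$@" | cut -d' ' -f1 | sort | uniq -d)
    if [ -n "$dupes" ]; then
        echo "identical md5 across states: $dupes" >&2
        return 1
    fi
    return 0
}
```

For example, assert_distinct_md5 tests/goldens/adw_switch_row_*.png would refuse a commit in which the off and on captures are byte-identical.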

A3 — meson test integration

Don't wire golden_check into meson test. Meson wants binaries present at build time, but goldens need a running compositor + grim + per-host env that varies. Keep the checks as shell scripts invoked from CI / /k-housekeep / release gates.

A4 — paintable-based capture

GTK4's gtk_widget_paintable_new + gdk_paintable_snapshot returns NULL in headless wayland (R3b finding, commit 8940450473). Use compositor screencopy via grim, not a widget-side paintable.

Registry

Each shipped widget under this pattern adds a row to registries/visual-regression-coverage.md. Use existing columns: A (overflow) / B (chrome) / C (proportion — this pattern) / D (sibling collision). Most Category C ✅ slots will come from this pattern.

Future surface kinds

Today this pattern is GTK/Adw-specific because the canonical container chain is Adw. Equivalent patterns for other surfaces:

  • Flutter: MaterialApp → Scaffold → <widget>. Use golden_toolkit or its koder_test_screencap Dart equivalent. Captured pixels still go through wlroots screencopy if running headless via koder-x.
  • Web (templ + HTMX, Flutter Web): Playwright / Puppeteer screenshot against the same headless koder-x + a real browser instance. The compositor screencopy path and the browser screenshot API are equivalent at the byte level once the browser's render is committed.
  • Android native (Compose): createComposeRule() + captureToImage(). The container chain is less prescriptive than Adw's, but Compose's preview infrastructure mostly handles realization automatically.

Ratification

Pattern ratified by working implementation across three widgets in engines/sdk/koder_kit_gtk:

  • AdwSwitchRow (off/on) — registry #650
  • AdwPasswordEntryRow (empty/filled) — registry #651
  • AdwActionRow (title_only/with_subtitle) — registry #652
  • Aggregate runner — registry #653
  • /k-housekeep Phase 2.6 wire — commands/k-housekeep.md

Recipe is reusable as-is for any new Adw widget in any Koder SDK; the four artifacts plus a registry row complete the contract.

References