Kover Capture & Data

kover specs/kover/capture.kmd

Defines Koder Kover capture (screen/video/audio) and the handling of the resulting data: capture engine (reuses kcap), metadata, data types and categories, file formats, storage organization (records in kdb + bytes on the object plane), tiered retention, and the mandatory consent + redaction gate before exporting to Koder Observability / Kortex. Implements the D-capture-* decisions of kover-RFC-001 §5.

When this spec applies

Primary triggers

Capturing or exporting media in Kover

All triggers

Capturing screen/video/audio in Kover
Storing or exporting a Kover capture bundle
Sending Kover data to Kortex via Observability

Spec — Kover Capture & Data (v0.1)

Capture in Kover produces evidence (not telemetry): the bytes and metadata an AI needs to see a behaviour/perf difference. This spec fixes the capture engine, the data model, the storage layout, and the privacy gate. Rules are labelled R*, test cases T*.

Scope

This spec covers the capture subsystem (L2) and the data plane it writes to. It does not cover the cockpit UI affordances (KOVER-008) or the connector wire (protocol.kmd).

R1 — Capture engine reuse

R1.1 — Screen/video/audio capture MUST use engines/media/capture (kcap). Kover MUST NOT add a second capture stack (reuse-first).

R1.2 — Capture sources: full screen, a pane/region, or a specific window (by the mirror handle, §scenario-dsl T2). Audio capture is opt-in and off by default.

R1.3 — For deterministic golden capture and reproducible timing, the capture path SHOULD reuse the koder_test_* Layer-1 primitives (koder_test_screencap for golden pixels, koder_test_clock for repro timing; stack-RFC-005) — especially under the headless runner (kover-RFC-001 §7.4), where golden capture gates the build.

R2 — Data categories & types

R2.1 — A capture session groups all artifacts of one Kover run under a session_id. Artifact kind ∈ { screenshot, video, audio, trace, resource-series, scenario, diff, summary }.

R2.2 — File formats (fixed, to keep bundles portable and Kortex-ingestible):

kind	format
screenshot	PNG
video	MP4 (H.264 baseline) / WebM (VP9) — kcap default
audio	Opus in WebM, or WAV for short clips
trace	OTLP JSON (as exported to `infra/observe`)
resource-series	newline-delimited JSON (`resource` payloads, `protocol.kmd` R5)
scenario	the DSL serialisation (`scenario-dsl.kmd`)
diff / summary	JSON + optional Markdown
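
The format table above can be enforced at write time with a simple lookup. The following is a minimal sketch, not the Kover implementation: the format identifiers (`"PNG"`, `"OTLP-JSON"`, and so on) and the `check_format` helper are hypothetical names chosen for illustration; only the kind-to-format pairing comes from R2.2.

```python
# Allowed formats per artifact kind, transcribed from the R2.2 table.
# The identifier strings are assumptions; the spec names formats, not tokens.
ALLOWED_FORMATS = {
    "screenshot": {"PNG"},
    "video": {"MP4/H.264", "WebM/VP9"},
    "audio": {"Opus/WebM", "WAV"},
    "trace": {"OTLP-JSON"},
    "resource-series": {"NDJSON"},
    "scenario": {"scenario-dsl"},
    "diff": {"JSON", "Markdown"},
    "summary": {"JSON", "Markdown"},
}


def check_format(kind: str, fmt: str) -> None:
    """Raise ValueError unless `fmt` is an allowed format for `kind`."""
    allowed = ALLOWED_FORMATS.get(kind)
    if allowed is None:
        raise ValueError(f"unknown artifact kind: {kind!r}")
    if fmt not in allowed:
        raise ValueError(
            f"format {fmt!r} not allowed for kind {kind!r}; "
            f"expected one of {sorted(allowed)}"
        )
```

Rejecting unknown kinds and formats at the point of capture keeps a bundle from accumulating artifacts that the downstream consumer cannot ingest.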

R3 — Metadata (required per artifact)

R3.1 — Every artifact record carries: session_id, artifact_id, kind, format, created_ts, ts_range (for time-based kinds), source (program slug/pid), byte_ref (object-plane key, R4), sha256, size_bytes, redaction_status ∈ { none, pending, applied }, consent_status ∈ { local-only, export-approved }.
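
As an illustration of R3.1, the record could be modelled as an immutable value type. This is a sketch under stated assumptions: the field names and the two status enumerations come from the rule, while the Python types, the Unix-seconds timestamps, the default `redaction_status`, and the `make_record` helper are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class RedactionStatus(Enum):
    NONE = "none"
    PENDING = "pending"
    APPLIED = "applied"


class ConsentStatus(Enum):
    LOCAL_ONLY = "local-only"
    EXPORT_APPROVED = "export-approved"


@dataclass(frozen=True)
class ArtifactRecord:
    # Field names follow R3.1; the Python types are assumptions.
    session_id: str
    artifact_id: str
    kind: str
    format: str
    created_ts: float  # assumed: Unix seconds
    source: str        # program slug/pid
    byte_ref: str      # object-plane key (R4)
    sha256: str
    size_bytes: int
    ts_range: Optional[Tuple[float, float]] = None  # time-based kinds only
    redaction_status: RedactionStatus = RedactionStatus.NONE  # assumed default
    consent_status: ConsentStatus = ConsentStatus.LOCAL_ONLY  # per R-consent.1


def make_record(payload: bytes, **fields) -> ArtifactRecord:
    """Build a record, deriving sha256 and size_bytes from the artifact bytes."""
    return ArtifactRecord(
        sha256=hashlib.sha256(payload).hexdigest(),
        size_bytes=len(payload),
        **fields,
    )
```

Deriving `sha256` and `size_bytes` from the payload, rather than accepting them from the caller, keeps the record consistent with the bytes it points at.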

R3.2 — A media-ref emitted over the connector (protocol.kmd R3) points at artifact_id — the connector never carries the bytes inline.

R4 — Storage layout (records vs bytes)

R4.1 — Metadata/records live in the Kover store (kdb): one row per artifact (R3.1) + the session index. Small, queryable, transactional.

R4.2 — Bytes live on the object-storage plane (stack-RFC-006: kdb-obj when it lands, kdrive/local until then). Video/audio in a transactional DB is a physics violation (RFC-006 §1) and is forbidden. The proposal's "store media in its own database" is satisfied by R4.1+R4.2, not by blobs-in-DB.

R4.3 — Object key layout: kover/<session_id>/<kind>/<artifact_id>.<ext>. Implementations SHOULD deduplicate bytes by content address (sha256).
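
A minimal sketch of the key layout and the sha256 dedupe, assuming identifiers are restricted to letters, digits, `_` and `-` (the spec does not state a character set) and that dedupe is tracked in an in-memory dict standing in for whatever index the real store uses. `object_key` and `byte_ref_for` are hypothetical helper names.

```python
import hashlib
import re

# Assumed character set for key segments; prevents "/" from breaking the layout.
_SEGMENT = re.compile(r"[A-Za-z0-9_-]+")


def object_key(session_id: str, kind: str, artifact_id: str, ext: str) -> str:
    """Return kover/<session_id>/<kind>/<artifact_id>.<ext> per R4.3."""
    for name, value in (
        ("session_id", session_id),
        ("kind", kind),
        ("artifact_id", artifact_id),
        ("ext", ext),
    ):
        if not _SEGMENT.fullmatch(value):
            raise ValueError(f"invalid {name}: {value!r}")
    return f"kover/{session_id}/{kind}/{artifact_id}.{ext}"


def byte_ref_for(index: dict, payload: bytes, candidate_key: str) -> str:
    """Content-addressed dedupe: reuse the key already stored for these bytes.

    `index` maps sha256 hex digest -> object key. If the digest is new, the
    candidate key is registered and returned; otherwise the existing key wins.
    """
    digest = hashlib.sha256(payload).hexdigest()
    return index.setdefault(digest, candidate_key)
```

For example, `object_key("s1", "video", "a9", "mp4")` returns `kover/s1/video/a9.mp4`, and two artifacts with identical bytes end up sharing one `byte_ref`.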

R5 — Retention (tiered)

R5.1 — Raw captures (screenshot/video/audio) default to a short TTL; derived artifacts (diff/summary/scenario) default to a long TTL — mirroring error-reporting-retention (raw short, aggregated long).

R5.2 — TTLs are per-session-overridable by the owner; a "pin" flag exempts a session from GC.
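
The retention rules can be sketched as a single eligibility check. The TTL values below are placeholders (the spec says only "short" and "long"), and the `SessionPolicy` and `is_collectable` names are hypothetical. R5.1 does not classify `trace` or `resource-series`, so this sketch conservatively never collects them; that choice is an assumption, not part of the spec.

```python
from dataclasses import dataclass
from typing import Optional

RAW_KINDS = {"screenshot", "video", "audio"}
DERIVED_KINDS = {"diff", "summary", "scenario"}

# Placeholder defaults; the spec does not give concrete durations.
DEFAULT_RAW_TTL_S = 7 * 24 * 3600
DEFAULT_DERIVED_TTL_S = 365 * 24 * 3600


@dataclass
class SessionPolicy:
    pinned: bool = False                  # R5.2: pin exempts the session from GC
    raw_ttl_s: Optional[int] = None       # R5.2: per-session override
    derived_ttl_s: Optional[int] = None   # R5.2: per-session override


def is_collectable(kind: str, created_ts: float, now: float,
                   policy: SessionPolicy) -> bool:
    """Return True if an artifact of `kind` may be garbage-collected at `now`."""
    if policy.pinned:
        return False
    if kind in RAW_KINDS:
        ttl = policy.raw_ttl_s if policy.raw_ttl_s is not None else DEFAULT_RAW_TTL_S
    elif kind in DERIVED_KINDS:
        ttl = (policy.derived_ttl_s if policy.derived_ttl_s is not None
               else DEFAULT_DERIVED_TTL_S)
    else:
        return False  # unclassified kinds are kept in this sketch (assumption)
    return now - created_ts > ttl
```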

R-redact — Redaction (mandatory)

R-redact.1 — Capture of a real session will contain PII/credentials. Before any artifact's consent_status may become export-approved, a redaction pass MUST run (redaction_status = applied): for video/screen this is at minimum a region-masking pass over fields flagged by the connector or the operator; for text artifacts it is the instrumentation-contract C5 allow-list redactor.
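
The region-masking pass for screen and video frames can be illustrated on a toy frame. This sketch assumes a frame is a list of pixel rows and a flagged region is an `(x, y, width, height)` rectangle in pixel coordinates; neither representation is defined by this spec, and a real pass would operate on kcap's frame buffers instead.

```python
from typing import List, Sequence, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height), assumed layout


def mask_regions(frame: Sequence[Sequence[int]],
                 regions: Sequence[Region],
                 fill: int = 0) -> List[List[int]]:
    """Return a copy of `frame` with every flagged region overwritten by `fill`.

    Regions are clipped to the frame bounds; the input frame is not modified.
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    masked = [list(row) for row in frame]
    for x, y, w, h in regions:
        for row in range(max(0, y), min(height, y + h)):
            for col in range(max(0, x), min(width, x + w)):
                masked[row][col] = fill
    return masked
```

Overwriting pixels, rather than blurring them, is the conservative choice for a sketch: the original content cannot be recovered from the masked output.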

R-redact.2 — observability-first §8 forbids PII in telemetry. Capture is evidence, not telemetry, so it is not auto-blocked — but R-consent gates its egress.

R-consent.1 — No artifact leaves the machine (to Observability/Kortex, L5) without an explicit operator consent action per session. Default consent_status = local-only.

R-consent.2 — The exporter bundles only artifacts whose redaction_status = applied and consent_status = export-approved. The exporter MUST refuse a bundle containing any artifact that is not redacted or not approved.
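
The export gate amounts to an all-or-nothing check over the bundle. The following is a minimal sketch assuming statuses are carried as the plain strings from R3.1; the `Artifact` type, `ExportRefused` exception, and `build_export_bundle` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Artifact:
    artifact_id: str
    redaction_status: str  # "none" | "pending" | "applied"
    consent_status: str    # "local-only" | "export-approved"


class ExportRefused(Exception):
    """Raised when a bundle contains an artifact that may not leave the machine."""


def build_export_bundle(artifacts: Iterable[Artifact]) -> List[Artifact]:
    """Return the bundle only if every artifact is redacted AND approved."""
    bundle = list(artifacts)
    blocked = [
        a.artifact_id
        for a in bundle
        if a.redaction_status != "applied"
        or a.consent_status != "export-approved"
    ]
    if blocked:
        # Refuse the whole bundle instead of silently dropping artifacts.
        raise ExportRefused(f"artifacts not exportable: {blocked}")
    return bundle
```

Refusing the whole bundle, instead of filtering out the offending artifacts, makes a missed redaction or consent step visible to the operator.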

R-consent.3 — Ambient daemon mode (kover-RFC-001 §7.3) escalates consent. System-wide passive collection is off by default and requires an explicit, scoped per-application allow-list — never a blanket "record everything". Only allow-listed programs are captured; egress still requires R-consent.1/.2; no silent recording, ever. Bound by identity-data-retention.kmd + multi-tenant-by-default.kmd.
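
The ambient-mode rule can be sketched as two small checks: one that rejects blanket entries in the allow-list, and one that decides whether a given program may be captured. The wildcard tokens, the use of program slugs as allow-list entries, and both function names are assumptions made for illustration.

```python
from typing import FrozenSet, Iterable

# Entries treated as "record everything" in this sketch (assumed tokens).
_BLANKET_ENTRIES = {"", "*", "all"}


def validate_allow_list(entries: Iterable[str]) -> FrozenSet[str]:
    """Return the allow-list as a frozenset, rejecting blanket entries."""
    cleaned = frozenset(e.strip() for e in entries)
    bad = sorted(cleaned & _BLANKET_ENTRIES)
    if bad:
        raise ValueError(f"blanket allow-list entries are not permitted: {bad}")
    return cleaned


def may_capture_ambient(program_slug: str,
                        allow_list: FrozenSet[str],
                        ambient_enabled: bool = False) -> bool:
    """Ambient capture is off by default and limited to allow-listed programs."""
    return ambient_enabled and program_slug in allow_list
```

Passing this check only permits local capture; egress is still subject to R-consent.1 and R-consent.2.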

Test cases

#	Check	Severity
T1	A capture session writes a kcap-produced artifact; no second capture lib is linked.	hard
T2	Artifact bytes land on the object plane; the kdb row holds only the record + `byte_ref` (no blob column).	hard
T3	A `media-ref` over the connector resolves to an `artifact_id`, not inline bytes.	hard
T4	Exporter refuses a bundle containing any artifact whose `redaction_status != applied`.	hard
T5	Exporter refuses egress when `consent_status = local-only`.	hard
T6	Raw capture past its TTL is GC'd; a pinned session survives.	soft
T7	Object keys follow `kover/<session>/<kind>/<id>.<ext>`.	soft

Non-goals

Telemetry log/metric/trace schemas — instrumentation-contract.kmd.
kdb-obj substrate — stack-RFC-006.
Kortex-side analysis of the bundle — services/ai/kortex / kortex-015.

Open questions

Video region-masking at capture-time vs post-process — post-process is simpler but stores an unredacted intermediate (must itself be TTL'd). Lean: capture-time masking for known-sensitive regions, post-process for the rest. Decide at KOVER-005.
Whether scenario + diff should also be a first-class Hub/Flow attachment (shareable A/B report). Defer.

References

engines/media/capture
meta/docs/stack/rfcs/stack-RFC-006-object-storage-plane-and-trust-tiering.kmd
meta/docs/stack/rfcs/stack-RFC-005-full-headless-testability.kmd
engines/sdk/koder_test_screencap
meta/docs/stack/policies/observability-first.kmd
meta/docs/stack/specs/privacy
meta/docs/stack/rfcs/kover-RFC-001-foundations.kmd
meta/docs/stack/specs/kover/protocol.kmd