MCP sampling approval

ai-ui specs/ai-ui/mcp-sampling-approval.kmd

UI for MCP server-initiated LLM completion requests delegated to the client (which has model + credentials + cost tracking). Two-step approval: pre-LLM (prompt) + post-LLM (response). Audit-logged. Closes MCP Section A of umbrella #099.

Specification body

Spec — MCP sampling approval

MCP normative source: https://modelcontextprotocol.io/specification/2025-06-18/client/sampling. Advanced pattern: to be activated when a Koder tool server uses sampling (services/ai/agents or services/ai/kode in the future).

Principles

  1. Two-step approval — pre-LLM (prompt visible + editable) + post-LLM (response visible + approve/regenerate/reject).
  2. Client chooses model — the server does NOT specify the model; the client decides via model-selector.kmd (#113).
  3. Audit-logged — prompt + response + decision persisted.
  4. Cost-aware — pre-step shows estimated cost (cross-link #112).
  5. Auto-approve opt-in — the user can enable auto-approve per server.

R1 — Pre-LLM step

Server requests sampling:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "sampling/createMessage",
  "params": {
    "messages": [...],
    "modelPreferences": {
      "intelligencePriority": 0.8,
      "speedPriority": 0.3
    },
    "systemPrompt": "...",
    "maxTokens": 1000
  }
}
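
Before rendering the dialog, the client probably wants to validate the fields shown in the request above. The following is a minimal sketch, not the Koder implementation: `SamplingRequest` and `parse_sampling_request` are hypothetical names, and the validation rules (non-empty messages, positive `maxTokens`, only numeric `*Priority` keys kept) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SamplingRequest:
    """Hypothetical client-side view of a sampling request."""
    messages: list
    max_tokens: int
    system_prompt: str = ""
    priorities: dict = field(default_factory=dict)


def parse_sampling_request(fields: dict) -> SamplingRequest:
    """Validate the fields the approval dialog needs; raise ValueError otherwise."""
    messages = fields.get("messages")
    if not isinstance(messages, list) or not messages:
        raise ValueError("sampling request needs a non-empty 'messages' list")

    max_tokens: Any = fields.get("maxTokens")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        raise ValueError("'maxTokens' must be a positive integer")

    prefs = fields.get("modelPreferences") or {}
    # Keep only numeric priorities; they are hints, so other keys are ignored here.
    priorities = {
        name: float(value)
        for name, value in prefs.items()
        if name.endswith("Priority") and isinstance(value, (int, float))
    }
    return SamplingRequest(
        messages=messages,
        max_tokens=max_tokens,
        system_prompt=fields.get("systemPrompt", ""),
        priorities=priorities,
    )
```

Rejecting a malformed request before any dialog appears keeps the user from approving something the client cannot run.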

Client renders dialog:

┌───────────────────────────────────────────────┐
│ Server requests LLM completion                │
│ From: <server_origin_chip>                    │
├───────────────────────────────────────────────┤
│ Prompt:                                       │
│ [editable text area with messages preview]   │
├───────────────────────────────────────────────┤
│ Model: [Auto (Kode Relay)] ▾                  │
│ Estimated cost: ~250 tokens, $0.005           │
├───────────────────────────────────────────────┤
│ [Reject] [Approve & run]                      │
└───────────────────────────────────────────────┘

The user can edit the prompt before approving; any edit is audit-logged.
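
One way to capture the outcome of the pre-step, including whether the prompt was edited, is sketched below. `PreStepOutcome` and `resolve_pre_step` are hypothetical names, and treating any textual difference as an edit is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PreStepOutcome:
    approved: bool
    prompt: Optional[str]   # text that will be sent to the LLM; None if rejected
    prompt_edited: bool     # True when the user changed the server's prompt
    original_prompt: str    # kept so the audit log can record the edit


def resolve_pre_step(original_prompt: str, submitted_prompt: Optional[str]) -> PreStepOutcome:
    """submitted_prompt is None when the user pressed Reject (assumed convention)."""
    if submitted_prompt is None:
        return PreStepOutcome(False, None, False, original_prompt)
    edited = submitted_prompt != original_prompt
    return PreStepOutcome(True, submitted_prompt, edited, original_prompt)
```

Keeping the original prompt next to the submitted one lets the audit event show exactly what the user changed.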

R2 — Post-LLM step

After LLM responds, client shows preview before returning to server:

┌───────────────────────────────────────────────┐
│ LLM response ready for server                 │
├───────────────────────────────────────────────┤
│ [response text]                               │
├───────────────────────────────────────────────┤
│ Tokens: 250 in / 180 out · Cost: $0.005      │
├───────────────────────────────────────────────┤
│ [Reject] [Regenerate] [Send to server]        │
└───────────────────────────────────────────────┘

Regenerate: re-invokes the LLM with the same prompt; the user may switch models before re-running.
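
Taken together, R1 and R2 describe a small state machine. The sketch below is one possible encoding; the `Step` names and action strings are illustrative assumptions, not identifiers from the spec.

```python
from enum import Enum


class Step(Enum):
    PRE = "pre"            # R1 dialog is showing
    RUNNING = "running"    # LLM call in flight
    POST = "post"          # R2 dialog is showing
    SENT = "sent"          # response returned to the server
    REJECTED = "rejected"  # server receives an error


# (current step, action) -> next step
TRANSITIONS = {
    (Step.PRE, "approve_run"): Step.RUNNING,
    (Step.PRE, "reject"): Step.REJECTED,
    (Step.RUNNING, "llm_done"): Step.POST,
    (Step.POST, "send"): Step.SENT,
    (Step.POST, "regenerate"): Step.RUNNING,  # same prompt, model may differ
    (Step.POST, "reject"): Step.REJECTED,
}


def advance(step: Step, action: str) -> Step:
    """Return the next step, or raise ValueError for a disallowed action."""
    try:
        return TRANSITIONS[(step, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} is not allowed in step {step.value!r}"
        ) from None
```

An explicit transition table makes it hard to return a response to the server without passing through the post-step.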

R3 — Auto-approve mode (opt-in)

User can enable auto-approve per (server_id, koder_user_id, workspace_id). When enabled:

  • Pre-step: skipped (prompt auto-approved).
  • Post-step: optionally skipped (depending on additional setting).

Auto-approve disabled by default (security). UI surface in mcp-server-state.kmd (#104) drawer settings.

Auto-revoke: auto-approve expires after 7 days by default (similar to mcp-permission-prompt.kmd R5 High risk).
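
The grant scope and expiry described in R3 could be modeled as below. This is a sketch under assumptions: `GrantKey` and `AutoApproveStore` are hypothetical names, storage is in memory, and a grant is treated as expired once exactly 7 days have elapsed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

AUTO_REVOKE_AFTER = timedelta(days=7)  # default from R3


@dataclass(frozen=True)
class GrantKey:
    server_id: str
    koder_user_id: str
    workspace_id: str


class AutoApproveStore:
    """In-memory store; auto-approve is off unless explicitly enabled."""

    def __init__(self) -> None:
        self._granted_at = {}

    def enable(self, key: GrantKey, now: datetime) -> None:
        self._granted_at[key] = now

    def disable(self, key: GrantKey) -> None:
        self._granted_at.pop(key, None)

    def is_active(self, key: GrantKey, now: datetime) -> bool:
        granted = self._granted_at.get(key)
        if granted is None:
            return False
        if now - granted >= AUTO_REVOKE_AFTER:
            # Expired: drop the grant so the user is prompted again.
            del self._granted_at[key]
            return False
        return True
```

Passing `now` explicitly keeps the expiry logic easy to test, for example checking a grant 8 days after it was enabled.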

R4 — Audit log

Each decision emits event to services/foundation/audit/:

event_type: "mcp.sampling.decision"
decision: "APPROVED" | "REJECTED" | "REGENERATED" | "AUTO_APPROVED"
koder_user_id, workspace_id, server_id, model, tokens_in, tokens_out, cost, timestamp

The event includes the prompt edit if the user modified the prompt before approval.
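
A possible shape for the event payload follows. The field names mirror the list above; the `prompt_edit` key, the string encoding of `cost`, and the ISO 8601 timestamp are assumptions, and the actual transport to services/foundation/audit/ is not shown.

```python
from datetime import datetime, timezone
from decimal import Decimal
from enum import Enum
from typing import Optional


class Decision(str, Enum):
    APPROVED = "APPROVED"
    REJECTED = "REJECTED"
    REGENERATED = "REGENERATED"
    AUTO_APPROVED = "AUTO_APPROVED"


def build_audit_event(
    decision: Decision,
    *,
    koder_user_id: str,
    workspace_id: str,
    server_id: str,
    model: str,
    tokens_in: int,
    tokens_out: int,
    cost: Decimal,
    timestamp: datetime,
    prompt_edit: Optional[str] = None,
) -> dict:
    """Build a JSON-serializable audit event for one sampling decision."""
    event = {
        "event_type": "mcp.sampling.decision",
        "decision": decision.value,
        "koder_user_id": koder_user_id,
        "workspace_id": workspace_id,
        "server_id": server_id,
        "model": model,
        "tokens_in": tokens_in,
        "tokens_out": tokens_out,
        "cost": str(cost),  # string avoids float rounding in the log
        "timestamp": timestamp.isoformat(),
    }
    if prompt_edit is not None:
        # Hypothetical key: present only when the user edited the prompt.
        event["prompt_edit"] = prompt_edit
    return event
```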

R5 — Model selection

model-selector.kmd (#113) integrates: user picks model in pre-step. Server modelPreferences are HINTS, not commands — client decides.

Hint resolution:

| Server hint | Client behavior |
|---|---|
| intelligencePriority: 0.8+ | Prefer high-tier model (Opus, GPT-5) |
| speedPriority: 0.8+ | Prefer fast model (Haiku, Mini, Flash) |
| costPriority: 0.8+ | Prefer cheap or self-hosted Koder LLM |
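
The hint table above could be resolved roughly as follows. The tier labels, the `"auto"` fallback, and the tie-break (the strongest hint wins, with ties going to intelligence, then speed, then cost) are assumptions; the spec only states that hints are advisory and that the client decides.

```python
from typing import Optional

THRESHOLD = 0.8

# Order matters: it is the assumed tie-break when two hints are equal.
HINT_TO_TIER = {
    "intelligencePriority": "high-tier",
    "speedPriority": "fast",
    "costPriority": "cheap",
}


def suggest_tier(model_preferences: dict, user_choice: Optional[str] = None) -> str:
    """Return a model tier suggestion; an explicit user choice always wins."""
    if user_choice is not None:
        return user_choice
    # max() returns the first key with the largest value, giving the tie-break above.
    strongest = max(HINT_TO_TIER, key=lambda name: model_preferences.get(name, 0.0))
    if model_preferences.get(strongest, 0.0) < THRESHOLD:
        return "auto"  # no hint is strong enough; keep the default selection
    return HINT_TO_TIER[strongest]
```

With the request from R1 (`intelligencePriority` 0.8, `speedPriority` 0.3), this sketch suggests the high-tier option unless the user picks a model in the pre-step.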

R6 — Surface bindings

| Surface | API |
|---|---|
| Flutter | KoderMCPSamplingDialog in koder_kit/lib/src/ai/mcp_sampling_dialog.dart |
| Web | <koder-mcp-sampling-dialog> |
| Compose/SwiftUI | future |
| CLI / TUI | Inline prompts with Y/N/r (regenerate) |
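
For the CLI / TUI surface, the Y/N/r answer might be parsed as in this sketch. `parse_cli_answer` is a hypothetical helper; case-insensitive matching and offering `r` only in the post-step are assumptions.

```python
from typing import Optional


def parse_cli_answer(raw: str, *, allow_regenerate: bool) -> Optional[str]:
    """Map an inline answer to a decision; None means the caller should re-prompt."""
    answer = raw.strip().lower()
    if answer == "y":
        return "approve"
    if answer == "n":
        return "reject"
    if answer == "r" and allow_regenerate:
        # Regenerate only makes sense once a response exists (post-step).
        return "regenerate"
    return None
```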

R7 — Accessibility

  • Dialog: role="dialog" aria-modal="true".
  • Prompt edit area: <textarea> with aria-label.
  • Cost estimate: linked to the dialog via aria-describedby.
  • Focus management; ESC cancels (= reject).

R8 — i18n

| Key | en-US | pt-BR |
|---|---|---|
| mcp.sampling.title | "Server requests LLM completion" | "Servidor solicita resposta de IA" |
| mcp.sampling.prompt_label | "Prompt (editable)" | "Prompt (editável)" |
| mcp.sampling.action.approve_run | "Approve & run" | "Aprovar e executar" |
| mcp.sampling.action.regenerate | "Regenerate" | "Gerar de novo" |
| mcp.sampling.action.send | "Send to server" | "Enviar ao servidor" |
| mcp.sampling.action.reject | "Reject" | "Recusar" |
| mcp.sampling.cost.estimate | "Estimated cost: ~{tokens} tokens, {cost}" | "Custo estimado: ~{tokens} tokens, {cost}" |
| mcp.sampling.auto.enabled | "Auto-approve enabled for this server" | "Aprovação automática ativa pra este servidor" |
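
The `{tokens}` and `{cost}` placeholders suggest simple named substitution. A minimal lookup sketch follows, using two keys from the table; the `translate` helper and the fallback to en-US for unknown locales or missing keys are assumptions.

```python
MESSAGES = {
    "en-US": {
        "mcp.sampling.cost.estimate": "Estimated cost: ~{tokens} tokens, {cost}",
        "mcp.sampling.action.reject": "Reject",
    },
    "pt-BR": {
        "mcp.sampling.cost.estimate": "Custo estimado: ~{tokens} tokens, {cost}",
        "mcp.sampling.action.reject": "Recusar",
    },
}

FALLBACK_LOCALE = "en-US"  # assumed fallback, not stated in the spec


def translate(key: str, locale: str, **params: object) -> str:
    """Look up a message and fill its named placeholders."""
    table = MESSAGES.get(locale, MESSAGES[FALLBACK_LOCALE])
    template = table.get(key)
    if template is None:
        # Raises KeyError if the key is unknown in the fallback locale too.
        template = MESSAGES[FALLBACK_LOCALE][key]
    return template.format(**params)
```

For example, `translate("mcp.sampling.cost.estimate", "pt-BR", tokens=250, cost="$0.005")` yields the pre-step cost line in Portuguese.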

T-suite

  • T1 Pre-step render: receive sampling request → dialog shows prompt + model picker + cost.
  • T2 Prompt edit: modify prompt → submit includes edited text + audit log entry.
  • T3 Approve & run: → LLM invoked; post-step appears with response.
  • T4 Post-step approve: send to server → sampling/createMessage response.
  • T5 Regenerate: re-invoke LLM same prompt → new response.
  • T6 Reject: emits rejection; server receives error.
  • T7 Model switch in pre-step: change from Auto to Claude Opus → LLM called with selected model.
  • T8 Auto-approve enabled: server requests → pre-step skipped; audit event marks "AUTO_APPROVED".
  • T9 Auto-revoke: 8 days later → auto-approve expired; user re-prompted.
  • T10 Cost transparency: pre-step shows accurate estimate; post-step shows actual.

References