MCP sampling approval

ai-ui specs/ai-ui/mcp-sampling-approval.kmd

UI for MCP server-initiated LLM completion requests delegated to the client (which has model + credentials + cost tracking). Two-step approval: pre-LLM (prompt) + post-LLM (response). Audit-logged. Closes MCP Section A of umbrella #099.

When this spec applies

Primary triggers

Approve server-initiated LLM call via client

All triggers

Build MCP client supporting servers that need LLM access
Implement sampling-aware MCP consumer

Spec — MCP sampling approval

MCP normative source: https://modelcontextprotocol.io/specification/2025-06-18/client/sampling. Pattern advanced — ativável quando algum tool server Koder usar sampling (services/ai/agents ou services/ai/kode no futuro).

Princípios

Two-step approval — pre-LLM (prompt visible + editable) + post-LLM (response visible + approve/regenerate/reject).
Client chooses model — server NÃO especifica model; client decide via model-selector.kmd (#113).
Audit-logged — prompt + response + decision persisted.
Cost-aware — pre-step shows estimated cost (cross-link #112).
Auto-approve opt-in — user pode habilitar auto-approve per server.

R1 — Pre-LLM step

Server requests sampling:

{
  "type": "sampling/createMessage",
  "messages": [...],
  "modelPreferences": {
    "intelligencePriority": 0.8,
    "speedPriority": 0.3
  },
  "systemPrompt": "...",
  "maxTokens": 1000
}

Client renders dialog:

┌───────────────────────────────────────────────┐
│ Server requests LLM completion                │
│ From: <server_origin_chip>                    │
├───────────────────────────────────────────────┤
│ Prompt:                                       │
│ [editable text area with messages preview]   │
├───────────────────────────────────────────────┤
│ Model: [Auto (Kode Relay)] ▾                  │
│ Estimated cost: ~250 tokens, $0.005           │
├───────────────────────────────────────────────┤
│ [Reject] [Approve & run]                      │
└───────────────────────────────────────────────┘

User can edit prompt before approval (audit-logged).

R2 — Post-LLM step

After LLM responds, client shows preview before returning to server:

┌───────────────────────────────────────────────┐
│ LLM response ready for server                 │
├───────────────────────────────────────────────┤
│ [response text]                               │
├───────────────────────────────────────────────┤
│ Tokens: 250 in / 180 out · Cost: $0.005      │
├───────────────────────────────────────────────┤
│ [Reject] [Regenerate] [Send to server]        │
└───────────────────────────────────────────────┘

Regenerate: re-invoke LLM with same prompt (model can switch).

R3 — Auto-approve mode (opt-in)

User can enable auto-approve per (server_id, koder_user_id, workspace_id). When enabled:

Pre-step: skipped (prompt auto-approved).
Post-step: optionally skipped (depending on additional setting).

Auto-approve disabled by default (security). UI surface in mcp-server-state.kmd (#104) drawer settings.

Auto-revoke: 7 days default (similar to mcp-permission-prompt.kmd R5 High risk).

R4 — Audit log

Each decision emits event to services/foundation/audit/:

event_type: "mcp.sampling.decision"
decision: "APPROVED" | "REJECTED" | "REGENERATED" | "AUTO_APPROVED"
koder_user_id, workspace_id, server_id, model, tokens_in, tokens_out, cost, timestamp

Includes prompt edit if user modified before approval.

R5 — Model selection

model-selector.kmd (#113) integrates: user picks model in pre-step. Server modelPreferences are HINTS, not commands — client decides.

Hint resolution:

Server hint	Client behavior
`intelligencePriority: 0.8+`	Prefer high-tier model (Opus, GPT-5)
`speedPriority: 0.8+`	Prefer fast model (Haiku, Mini, Flash)
`costPriority: 0.8+`	Prefer cheap or self-hosted Koder LLM

R6 — Surface bindings

Surface	API
Flutter	`KoderMCPSamplingDialog` em `koder_kit/lib/src/ai/mcp_sampling_dialog.dart`
Web	`<koder-mcp-sampling-dialog>`
Compose/SwiftUI	futuro
CLI / TUI	Prompts inline com Y/N/r (regenerate)

R7 — Acessibilidade

Dialog: role="dialog" aria-modal="true".
Prompt edit area: <textarea> with aria-label.
Cost estimate: aria-describedby dialog.
Focus management; ESC cancela (= reject).

R8 — i18n

Key	en-US	pt-BR
`mcp.sampling.title`	"Server requests LLM completion"	"Servidor solicita resposta de IA"
`mcp.sampling.prompt_label`	"Prompt (editable)"	"Prompt (editável)"
`mcp.sampling.action.approve_run`	"Approve & run"	"Aprovar e executar"
`mcp.sampling.action.regenerate`	"Regenerate"	"Gerar de novo"
`mcp.sampling.action.send`	"Send to server"	"Enviar ao servidor"
`mcp.sampling.action.reject`	"Reject"	"Recusar"
`mcp.sampling.cost.estimate`	"Estimated cost: ~{tokens} tokens, {cost}"	"Custo estimado: ~{tokens} tokens, {cost}"
`mcp.sampling.auto.enabled`	"Auto-approve enabled for this server"	"Aprovação automática ativa pra este servidor"

T-suite

T1 Pre-step render: receive sampling request → dialog shows prompt + model picker + cost.
T2 Prompt edit: modify prompt → submit includes edited text + audit log entry.
T3 Approve & run: → LLM invoked; post-step appears with response.
T4 Post-step approve: send to server → sampling/createMessage response.
T5 Regenerate: re-invoke LLM same prompt → new response.
T6 Reject: emits rejection; server receives error.
T7 Model switch in pre-step: change from Auto to Claude Opus → LLM called with selected model.
T8 Auto-approve enabled: server requests → pre-step skipped; audit event marks "AUTO_APPROVED".
T9 Auto-revoke: 8 days later → auto-approve expired; user re-prompted.
T10 Cost transparency: pre-step shows accurate estimate; post-step shows actual.

Cross-link

Companion: mcp-permission-prompt.kmd, model-selector.kmd, cost-display.kmd, mcp-server-state.kmd
Backend: services/ai/mcp/, services/ai/gateway/
Audit: services/foundation/audit/
Refs: https://modelcontextprotocol.io/specification/2025-06-18/client/sampling

References

meta/docs/stack/specs/ai-ui/mcp-permission-prompt.kmd
meta/docs/stack/specs/ai-ui/model-selector.kmd
meta/docs/stack/specs/ai-ui/cost-display.kmd
meta/docs/stack/policies/security.kmd