AI cost display

ai-ui specs/ai-ui/cost-display.kmd

Per-message badge + session header total + breakdown drawer for AI cost tracking. Token counts (in/out), monetary cost (configurable currency or Koder credits), model attribution, threshold alerts. Backend: services/ai/billing/.

When this spec applies

Primary triggers

Show the user the cost of an AI interaction

All triggers

Display AI cost per message or session
Implement a cost-aware product (paid tier, internal usage tracking)

Spec — AI cost display

Backend: services/ai/billing/, services/ai/gateway/. Pricing source: gateway response metadata, falling back to registries/ai-model-recommendations.md (see R6).

Principles

Per-message + session aggregate — both visible.
Configurable currency — BRL / USD / Koder credits.
Discrete by default — small chip; expand for breakdown.
Threshold alerts — configurable soft + hard limits.

R1 — Per-message badge

Compact chip on assistant bubble footer:

🪙 250 in · 180 out · $0.005

Slots: in tokens, out tokens, monetary cost (or credits).

Configurable display: hide cost (only show tokens) or hide tokens (only show cost).

Position: bubble footer, before AI disclaimer chip (cross-link chat-message-bubble.kmd R2).
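The slot and visibility rules above can be sketched as a small formatter. This is an illustrative sketch only: `format_badge` and its parameters are hypothetical names, and the three-decimal cost precision is an assumption taken from the `$0.005` example rather than something the spec states.

```python
from typing import Optional


def format_badge(tokens_in: int, tokens_out: int, cost: float,
                 symbol: str = "$", show_tokens: bool = True,
                 show_cost: bool = True) -> Optional[str]:
    """Build the chip text for one assistant message."""
    parts = []
    if show_tokens:
        parts += [f"{tokens_in:,} in", f"{tokens_out:,} out"]
    if show_cost:
        # Three decimals is an assumption based on the "$0.005" example.
        parts.append(f"{symbol}{cost:.3f}")
    if not parts:
        # Assumption: with both slots hidden, no chip is rendered at all.
        return None
    return "🪙 " + " · ".join(parts)
```

Calling `format_badge(250, 180, 0.005)` yields the chip text shown in the example; passing `show_cost=False` or `show_tokens=False` drops the corresponding slots.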

R2 — Session header total

Chat header strip:

Session: 12 messages · 4,250 tokens · $0.18

Updates live as messages complete. Click → opens breakdown drawer (R3).
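As a rough sketch of how the header string could be derived, the function below sums per-message usage each time a message completes. `MessageUsage` and `session_header` are hypothetical names, and the two-decimal rounding of the total is an assumption based on the `$0.18` example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MessageUsage:
    tokens_in: int
    tokens_out: int
    cost_usd: float


def session_header(messages: list[MessageUsage]) -> str:
    """Recompute the header strip from all completed messages."""
    tokens = sum(m.tokens_in + m.tokens_out for m in messages)
    cost = sum(m.cost_usd for m in messages)
    # Two decimals for the session total is an assumption from the example.
    return f"Session: {len(messages)} messages · {tokens:,} tokens · ${cost:.2f}"
```

Recomputing from the full list keeps the header consistent with the drawer; an incremental running total would work equally well if sessions grow large.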

R3 — Breakdown drawer

┌────────────────────────────────────────────┐
│ Session cost                               │
├────────────────────────────────────────────┤
│ Total: $0.18 (4,250 tokens)               │
│                                            │
│ By message:                                │
│ #1  Claude Opus 4.7  250t  $0.005          │
│ #2  Claude Opus 4.7  430t  $0.009          │
│ ...                                        │
│                                            │
│ By model:                                  │
│ Claude Opus 4.7: 3,200t · $0.130          │
│ Gemini Pro: 1,050t · $0.050               │
│                                            │
│ By tool call:                              │
│ search:    200t · $0.004                   │
│ fetch_url: 380t · $0.008                   │
└────────────────────────────────────────────┘
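The three drawer sections can be produced by grouping the same usage records along different keys. The sketch below is one possible shape, not the billing service's actual model: `UsageRecord` and `build_breakdown` are hypothetical, and it assumes tool-call usage is recorded as its own row that also counts toward its message, its model, and the session total.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageRecord:
    message_no: int
    model: str
    tokens: int
    cost_usd: float
    tool: Optional[str] = None  # set only when the usage came from a tool call


def build_breakdown(records):
    """Group usage by message, by model, and by tool call."""
    def bucket():
        return {"tokens": 0, "cost_usd": 0.0}

    total = bucket()
    by_message = defaultdict(bucket)
    by_model = defaultdict(bucket)
    by_tool = defaultdict(bucket)
    for r in records:
        targets = [total, by_message[r.message_no], by_model[r.model]]
        if r.tool is not None:
            targets.append(by_tool[r.tool])
        for t in targets:
            t["tokens"] += r.tokens
            t["cost_usd"] += r.cost_usd
    return {
        "total": total,
        "by_message": dict(by_message),
        "by_model": dict(by_model),
        "by_tool": dict(by_tool),
    }
```

Under this assumption the "By model" rows always sum to the session total, while "By tool call" is a partial view covering only tool-attributed usage.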

R4 — Threshold alerts

User-configurable per workspace:

Limit	Behavior
Soft alert (e.g., $1)	Toast warning; session continues
Hard limit (e.g., $5)	Block new messages; require explicit "Continue anyway"

Disabled by default (opt-in via Settings).
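The soft/hard behavior can be sketched as a pure function over the running session cost. The names `LimitState` and `evaluate_limits` are hypothetical. Treating a threshold as crossed when the cost is greater than or equal to it, and representing a disabled limit as `None`, are both assumptions.

```python
from enum import Enum
from typing import Optional


class LimitState(Enum):
    OK = "ok"
    SOFT_ALERT = "soft_alert"   # show toast; session continues
    HARD_BLOCK = "hard_block"   # block new messages until the user overrides


def evaluate_limits(session_cost: float,
                    soft_alert: Optional[float] = None,
                    hard_limit: Optional[float] = None,
                    continue_anyway: bool = False) -> LimitState:
    """Map the current session cost to the alert state for the UI."""
    # None means the limit is disabled, which is the default.
    if hard_limit is not None and session_cost >= hard_limit and not continue_anyway:
        return LimitState.HARD_BLOCK
    if soft_alert is not None and session_cost >= soft_alert:
        return LimitState.SOFT_ALERT
    return LimitState.OK
```

A caller would typically show the toast only on the transition from `OK` to `SOFT_ALERT`, so the warning is not repeated on every message.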

R5 — Currency / unit configuration

Per (koder_user_id, workspace_id):

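[cost_display]
currency = "BRL"               # BRL | USD | EUR | credits
show_tokens = true             # show the in/out token slots (R1)
show_cost = true               # show the cost slot (R1)
soft_alert_currency = 5.00     # R4 soft alert; unit assumed to follow `currency`
hard_limit_currency = 25.00    # R4 hard limit; unit assumed to follow `currency`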

Conversion: the gateway maintains exchange rates (cached daily). Credits are Koder's internal unit, with a configurable baseline of 1 credit = $0.001.
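A minimal sketch of the conversion and display step, using `Decimal` to avoid float rounding on small amounts. The function names, the `usd_rates` mapping (units of target currency per 1 USD), the symbol table, and the rounding choices are all assumptions for illustration; only the 1 credit = $0.001 baseline comes from the spec.

```python
from decimal import Decimal

USD_PER_CREDIT = Decimal("0.001")  # baseline from the spec; configurable
SYMBOLS = {"USD": "$", "BRL": "R$", "EUR": "€"}  # assumed display symbols


def convert_cost(cost_usd: Decimal, currency: str, usd_rates: dict) -> Decimal:
    """Convert a USD cost into the configured display unit."""
    if currency == "credits":
        return cost_usd / USD_PER_CREDIT
    if currency == "USD":
        return cost_usd
    # usd_rates is a hypothetical snapshot of the gateway's daily rates.
    return cost_usd * usd_rates[currency]


def format_cost(amount: Decimal, currency: str) -> str:
    """Render an already-converted amount for display."""
    if currency == "credits":
        # No currency symbol in credits mode; whole credits is an assumption.
        return f"{amount.quantize(Decimal('1'))} credits"
    return f"{SYMBOLS[currency]}{amount.quantize(Decimal('0.001'))}"
```

With the baseline rate, a `$0.005` request converts to 5 credits, which matches the `cost_usd` and `cost_credits` pair in the gateway example under R6.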

R6 — Pricing source

Per-request response from gateway includes:

{
  "usage": {
    "tokens_in": 250,
    "tokens_out": 180,
    "cost_usd": 0.005,
    "cost_credits": 5
  },
  "model": "claude-opus-4-7"
}

Fallback: if the gateway response omits the cost, the client looks up pricing in the ai-model-recommendations.md registry.
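The gateway-first, registry-fallback rule might look like the sketch below. `resolve_cost_usd` is a hypothetical helper, and `price_table` assumes the registry has already been parsed into a mapping of model id to (USD per 1,000 input tokens, USD per 1,000 output tokens). The registry's real format is not specified here.

```python
from typing import Optional


def resolve_cost_usd(response: dict, price_table: dict) -> Optional[float]:
    """Prefer the gateway-reported cost; otherwise estimate from the registry."""
    usage = response.get("usage") or {}
    if usage.get("cost_usd") is not None:
        return usage["cost_usd"]
    prices = price_table.get(response.get("model"))
    if prices is None or "tokens_in" not in usage or "tokens_out" not in usage:
        return None  # nothing to show; the UI can fall back to tokens only
    usd_per_1k_in, usd_per_1k_out = prices
    return (usage["tokens_in"] * usd_per_1k_in
            + usage["tokens_out"] * usd_per_1k_out) / 1000
```

Returning `None` rather than zero when no price is known keeps the badge from displaying a misleading `$0.000`.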

R7 — Multi-tenant

Cost is attributed per (koder_user_id, workspace_id). A workspace admin can view the aggregate; per-user privacy is preserved in shared workspaces (the admin sees totals, not a per-user breakdown, unless explicitly granted permission).
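One way to picture the attribution and visibility rule is an in-memory ledger keyed by (user, workspace). `CostLedger` and its methods are hypothetical and stand in for whatever services/ai/billing/ actually exposes. That non-admin members see only their own total is an assumption.

```python
from collections import defaultdict


class CostLedger:
    """Toy ledger keyed by (koder_user_id, workspace_id)."""

    def __init__(self):
        self._totals = defaultdict(float)

    def record(self, user_id: str, workspace_id: str, cost_usd: float) -> None:
        self._totals[(user_id, workspace_id)] += cost_usd

    def workspace_view(self, workspace_id: str, viewer_id: str,
                       is_admin: bool = False,
                       per_user_permission: bool = False) -> dict:
        # Only rows for the requested workspace are ever considered.
        rows = {u: c for (u, w), c in self._totals.items() if w == workspace_id}
        if not is_admin:
            # Assumption: members see only their own cost.
            return {"total": rows.get(viewer_id, 0.0)}
        view = {"total": sum(rows.values())}
        if per_user_permission:
            view["per_user"] = rows
        return view
```

Filtering by workspace before any aggregation also gives the behavior T8 checks: switching workspace never mixes in cost data from the previous one.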

R8 — Surface bindings

Surface	API
Flutter	`KoderCostBadge` + `KoderCostBreakdownDrawer` in `koder_kit/lib/src/ai/cost_badge.dart`
Web	`<koder-cost-badge>` + `<koder-cost-breakdown>`
Compose/SwiftUI	Future
CLI / TUI	Inline `[$0.005]` per message; `koder cost session` shows breakdown

R9 — Accessibility

Badge: role="status" aria-label="Cost: 250 tokens in, 180 tokens out, $0.005".
Drawer: role="dialog".
Threshold alert: role="alert" (announces on trigger).

R10 — i18n

Key	en-US	pt-BR
`ai.cost.tokens_in`	"{n} in"	"{n} entrada"
`ai.cost.tokens_out`	"{n} out"	"{n} saída"
`ai.cost.session_total`	"Session: {messages} messages · {tokens} tokens · {cost}"	"Sessão: {messages} mensagens · {tokens} tokens · {cost}"
`ai.cost.alert.soft`	"Approaching cost limit ({current} of {limit})"	"Aproximando do limite ({current} de {limit})"
`ai.cost.alert.hard`	"Cost limit reached"	"Limite de custo atingido"
`ai.cost.alert.continue_anyway`	"Continue anyway"	"Continuar mesmo assim"
`ai.cost.unit.credits`	"credits"	"créditos"
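The keys above use `{name}` placeholders, so a lookup with an en-US fallback can be sketched in a few lines. The `MESSAGES` table holds only a subset of the keys, and the `translate` helper and its fallback rule are assumptions rather than the project's actual i18n API.

```python
MESSAGES = {
    "en-US": {
        "ai.cost.tokens_in": "{n} in",
        "ai.cost.alert.soft": "Approaching cost limit ({current} of {limit})",
    },
    "pt-BR": {
        "ai.cost.tokens_in": "{n} entrada",
        "ai.cost.alert.soft": "Aproximando do limite ({current} de {limit})",
    },
}


def translate(key: str, locale: str, **params) -> str:
    """Look up a message and fill its placeholders, falling back to en-US."""
    template = MESSAGES.get(locale, {}).get(key)
    if template is None:
        template = MESSAGES["en-US"][key]
    return template.format(**params)
```

Amounts such as `{current}` and `{limit}` are passed in already formatted, so currency and credits rendering stays in one place.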

T-suite

T1 Per-message badge: bubble has usage data → badge renders correctly.
T2 Session total: 3 messages → header shows sum.
T3 Breakdown drawer: by message + by model + by tool call sections populated.
T4 Soft alert: cross threshold → toast appears; session continues.
T5 Hard limit: cross limit → new messages blocked; "Continue anyway" re-enables sending.
T6 Currency switch: change BRL → USD → all displays update.
T7 Credits mode: switch to credits → no currency symbol; "credits" suffix.
T8 Multi-tenant: workspace switch → cost data for new workspace only.
T9 A11y: alerts announced via aria-live.

Cross-link

Companion: chat-message-bubble.kmd, model-selector.kmd, mcp-tool-invocation.kmd (tool call cost attribution)
Backend: services/ai/billing/, services/ai/gateway/
Registry: registries/ai-model-recommendations.md
Policies: multi-tenant-by-default.kmd
Refs: Langfuse cost tracking, MLflow token usage, Foundry granular metrics

References

meta/docs/stack/specs/ai-ui/chat-message-bubble.kmd
meta/docs/stack/specs/ai-ui/model-selector.kmd
meta/docs/stack/registries/ai-model-recommendations.md