
Design: LLM Call Logging — Track All LLM Interactions Per Synthesis

Date: 2026-03-24
Scope: Log every LLM call during synthesis generation with full prompt/response, viewable per synthesis


Context

When synthesis quality is poor, there's no way to see what prompts were sent to the LLM or what it returned. Users need visibility into every LLM call to debug prompt effectiveness, model behavior, and pipeline issues.

Approach

A new llm_call_log table stores every LLM call with its full prompt, response, timing, and model info. Rows are linked to syntheses via job_id. A dedicated log viewer page is accessible from the synthesis list.

New Table: llm_call_log

CREATE TABLE llm_call_log (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id         UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    job_id          UUID NOT NULL,
    call_type       TEXT NOT NULL,
    model           TEXT NOT NULL,
    system_prompt   TEXT NOT NULL DEFAULT '',
    user_prompt     TEXT NOT NULL DEFAULT '',
    response_body   TEXT NOT NULL DEFAULT '',
    duration_ms     INTEGER NOT NULL DEFAULT 0,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_llm_call_log_job_id ON llm_call_log(job_id);
CREATE INDEX idx_llm_call_log_user_id ON llm_call_log(user_id, created_at);

call_type values: search, classification_phase1, classification_phase2, rewrite, link_extraction, article_extraction, key_test
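
Since call_type is stored as free-form TEXT, one way to keep the seven values consistent across call sites is a small enum that maps to the exact strings above. This is a minimal sketch, not part of the spec: the CallType name and its methods are hypothetical, and only the seven string values come from this document.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical enum with one variant per call_type value listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallType {
    Search,
    ClassificationPhase1,
    ClassificationPhase2,
    Rewrite,
    LinkExtraction,
    ArticleExtraction,
    KeyTest,
}

impl CallType {
    /// Every variant, useful for validation and for round-trip tests.
    pub const ALL: [CallType; 7] = [
        CallType::Search,
        CallType::ClassificationPhase1,
        CallType::ClassificationPhase2,
        CallType::Rewrite,
        CallType::LinkExtraction,
        CallType::ArticleExtraction,
        CallType::KeyTest,
    ];

    /// The exact string stored in the call_type column.
    pub fn as_str(self) -> &'static str {
        match self {
            CallType::Search => "search",
            CallType::ClassificationPhase1 => "classification_phase1",
            CallType::ClassificationPhase2 => "classification_phase2",
            CallType::Rewrite => "rewrite",
            CallType::LinkExtraction => "link_extraction",
            CallType::ArticleExtraction => "article_extraction",
            CallType::KeyTest => "key_test",
        }
    }
}

impl fmt::Display for CallType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for CallType {
    type Err = String;

    /// Parses a stored column value back into a variant.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        CallType::ALL
            .iter()
            .copied()
            .find(|c| c.as_str() == s)
            .ok_or_else(|| format!("unknown call_type: {}", s))
    }
}
```

A call site could then pass CallType::Rewrite.as_str() wherever the helper expects a &str, so a typo in a call_type string becomes a compile error rather than a silently mislabeled row.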

Pipeline Integration

A helper function log_llm_call inserts a row after each LLM call:

async fn log_llm_call(
    pool: &PgPool,
    user_id: Uuid,
    job_id: Uuid,
    call_type: &str,
    model: &str,
    system_prompt: &str,
    user_prompt: &str,
    response: &serde_json::Value,
    duration_ms: u64,
)

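Timing is measured by calling std::time::Instant::now() before each provider call and elapsed().as_millis() after it. Note that as_millis() returns a u128, so the value has to be converted to fit the u64 parameter and the INTEGER column.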
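
The timing pattern can be sketched with the standard library alone. The helper names timed and to_db_duration are hypothetical, and the closure is synchronous for brevity; in the real pipeline the same Instant::now() / elapsed() pair would presumably sit around an awaited provider call. Saturating on overflow is an assumption of this sketch, not something the spec prescribes.

```rust
use std::convert::TryFrom;
use std::time::Instant;

/// Runs `call` and returns its result together with the elapsed
/// wall-clock time in whole milliseconds.
fn timed<T>(call: impl FnOnce() -> T) -> (T, u64) {
    let started = Instant::now();
    let result = call();
    // as_millis() returns u128; saturate rather than panic or wrap.
    let elapsed_ms = u64::try_from(started.elapsed().as_millis()).unwrap_or(u64::MAX);
    (result, elapsed_ms)
}

/// Clamps the measured duration to the range of a Postgres INTEGER (i32),
/// which is the type of the duration_ms column.
fn to_db_duration(duration_ms: u64) -> i32 {
    i32::try_from(duration_ms).unwrap_or(i32::MAX)
}
```

An i32 holds roughly 24.8 days of milliseconds, so the clamp should never trigger for a real LLM call; it only guards the conversion.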

Instrumentation points (7 LLM call sites):

  1. Search pass — call_type: "search" (Phase 2 web search)
  2. Classification Phase 1 — call_type: "classification_phase1"
  3. Classification Phase 2 — call_type: "classification_phase2"
  4. Rewrite pass — call_type: "rewrite"
  5. Link extraction (per source, when LLM enabled) — call_type: "link_extraction"
  6. Article extraction (per article, when LLM enabled) — call_type: "article_extraction"
  7. Key test — call_type: "key_test" (API key test endpoint, optional)

Cleanup

During the existing generation startup cleanup (alongside article_history::cleanup_old), old LLM log entries are truncated. For each entry older than article_history_days days:

  • Replace system_prompt, user_prompt, and response_body with their first 500 characters followed by \n[truncated]
  • Keep metadata (call_type, model, duration_ms, timestamps) intact

This avoids unbounded storage growth while preserving summary info for old runs.
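
The truncation rule above can be illustrated in plain Rust. This is a sketch under assumptions: truncate_field is a hypothetical name, values of 500 characters or fewer are left untouched, and the real truncate_old may well do the work in SQL instead (for example with PostgreSQL's left() function). The sketch counts characters rather than bytes so that multi-byte text, such as the French prompts in this project, is never cut in the middle of a character.

```rust
/// Number of characters kept from each text column (from the spec).
const KEEP_CHARS: usize = 500;
/// Marker appended to a truncated value (from the spec).
const MARKER: &str = "\n[truncated]";

/// Returns `text` unchanged when it has at most KEEP_CHARS characters,
/// otherwise its first KEEP_CHARS characters followed by MARKER.
fn truncate_field(text: &str) -> String {
    // char_indices().nth(KEEP_CHARS) yields the byte offset of the first
    // character past the limit, or None when the text is short enough.
    match text.char_indices().nth(KEEP_CHARS) {
        None => text.to_string(),
        Some((byte_idx, _)) => format!("{}{}", &text[..byte_idx], MARKER),
    }
}
```

Applying the function to an already truncated value returns the same value, because its first 500 characters are unchanged and the marker is appended again. Repeated cleanup runs therefore do not shrink old entries any further.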

API Endpoint

GET /api/v1/llm-logs/:job_id

Returns all log entries for a generation job, ordered by created_at. The endpoint is authenticated and scoped to the user: the handler verifies that the job_id belongs to a synthesis owned by the requesting user.

Response:

[
  {
    "id": "uuid",
    "call_type": "search",
    "model": "gpt-4o-mini",
    "system_prompt": "Tu es un assistant...",
    "user_prompt": "Aujourd'hui nous sommes...",
    "response_body": "{\"category_0\": [...]}",
    "duration_ms": 12500,
    "created_at": "2026-03-24T..."
  }
]
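
The scoping and ordering rules for this endpoint can be sketched in memory, without a database. Everything here is illustrative: the LogEntry and Synthesis structs, the logs_for_job function, and the use of integers in place of UUIDs and timestamps are assumptions. Returning None for a job the user does not own is also an assumption; how the handler maps that case to an HTTP status is not specified in this document.

```rust
/// Simplified stand-in for a llm_call_log row (integers replace UUIDs
/// and TIMESTAMPTZ purely for illustration).
#[derive(Debug, Clone, PartialEq)]
struct LogEntry {
    job_id: u32,
    call_type: String,
    created_at: u64,
}

/// Simplified stand-in for a synthesis row.
struct Synthesis {
    owner_id: u32,
    /// Old syntheses have no job_id.
    job_id: Option<u32>,
}

/// Returns the job's log entries ordered by created_at, or None when the
/// user owns no synthesis with that job_id.
fn logs_for_job(
    user_id: u32,
    job_id: u32,
    syntheses: &[Synthesis],
    logs: &[LogEntry],
) -> Option<Vec<LogEntry>> {
    // Ownership check first: the job must belong to one of the user's syntheses.
    let owns_job = syntheses
        .iter()
        .any(|s| s.owner_id == user_id && s.job_id == Some(job_id));
    if !owns_job {
        return None;
    }
    // Then select the job's entries and sort them chronologically.
    let mut entries: Vec<LogEntry> = logs
        .iter()
        .filter(|e| e.job_id == job_id)
        .cloned()
        .collect();
    entries.sort_by_key(|e| e.created_at);
    Some(entries)
}
```

In the actual backend the same two steps would most likely be a join or an ownership query followed by a SELECT with ORDER BY created_at, which the idx_llm_call_log_job_id index supports.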

Frontend

LLM Logs page (/llm-logs/:job_id)

  • Shows all LLM calls for a generation run in chronological order
  • Each call displayed as a card:
    • Header: call_type badge (colored), model name, duration (e.g., "12.5s")
    • Three expandable sections: System Prompt, User Prompt, Response
    • Text areas are scrollable, monospace font
    • Response pretty-printed as JSON when parseable

Home page — log button

On each synthesis row in the list, add a small icon button (next to the delete button) that navigates to /llm-logs/:job_id. The job_id comes from the synthesis data. The button is hidden for older syntheses that have no job_id.

Files to Modify

Backend:

  • Create: migration 20260324000017_create_llm_call_log.sql
  • Create: backend/src/db/llm_call_log.rs — insert, list_by_job_id, truncate_old
  • Modify: backend/src/db/mod.rs — register module
  • Create: backend/src/handlers/llm_logs.rs — handler
  • Modify: backend/src/handlers/mod.rs — register
  • Modify: backend/src/router.rs — add route
  • Modify: backend/src/services/synthesis.rs — add log_llm_call helper, wrap each LLM call with timing
  • Modify: CLAUDE.md — migration count to 17

Frontend:

  • Create: frontend/src/pages/LlmLogs.tsx — log viewer page
  • Create: frontend/src/api/llmLogs.ts — API client
  • Modify: frontend/src/App.tsx — add route
  • Modify: frontend/src/pages/Home.tsx — add log button on each synthesis row
  • Modify: frontend/src/i18n/fr.ts — labels
  • Modify: frontend/src/types.ts — LlmCallLogEntry type

Tests:

  • Modify: e2e/tests/generation-live.spec.ts — verify LLM logs endpoint returns data

What Does NOT Change

  • LLM provider trait/implementations — logging happens at the call site, not inside providers
  • Pipeline logic — no changes to filtering, classification, or rewrite behavior
  • Article history — independent feature, both use job_id
  • Existing synthesis display — unchanged (only Home page gets the log button)
  • Settings — no new settings (reuses article_history_days for retention)