You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
ai_synth/docs/superpowers/specs/2026-03-24-simplify-llm-tra...

80 lines
3.1 KiB
Markdown

# Design: Simplify LLM Provider Trait — Single `call_llm` Method
**Date**: 2026-03-24
**Scope**: Collapse `LlmProvider` trait from 3 methods to 1, unify all provider implementations
---
## Context
The current `LlmProvider` trait has 3 methods: `generate_search_pass` (with web search), `generate_rewrite_pass` (without web search), and `supports_web_search`. This split was designed for an adaptive pipeline that used provider-native web search tools. The pipeline no longer uses web search tools — all providers now receive the same prompt and return structured JSON. The distinction between search and rewrite passes is artificial and adds unnecessary complexity.
## New Trait
```rust
#[async_trait]
pub trait LlmProvider: Send + Sync {
fn provider_id(&self) -> &str;
async fn call_llm(
&self,
model: &str,
system_prompt: &str,
user_prompt: &str,
response_schema: &Value,
) -> Result<Value, AppError>;
}
```
`ProviderCapabilities` struct removed. `supports_web_search()` removed.
## Provider Implementations
### OpenAI
- Uses the Responses API (`POST /v1/responses`) exclusively
- No `web_search_preview` tool — never enabled
- Structured output via `text.format.json_schema`
- Drops the Chat Completions API path entirely (`call_chat_completions_api` removed)
- `call_responses_api` becomes the `call_llm` implementation (without `include_web_search` parameter)
- `extract_responses_api_content` stays (parses Responses API output format)
- `extract_chat_completions_content` removed
### Gemini
- Uses `generateContent` API as before
- No `googleSearch` tool — never enabled
- Structured output via `generationConfig.responseSchema`
- `build_request_body` simplified: `include_search` parameter removed
- Single `call_llm` implementation
### Anthropic
- Uses Messages API as before
- No web search tool
- `call_llm` is the single implementation
## Caller Impact
All 6+ call sites change from `generate_search_pass(...)` or `generate_rewrite_pass(...)` to `call_llm(...)`. Same arguments, just renamed. No logic change.
## Files to Modify
- **Rewrite:** `backend/src/services/llm/mod.rs` — new trait, remove `ProviderCapabilities`
- **Rewrite:** `backend/src/services/llm/openai.rs` — single `call_llm` via Responses API, remove Chat Completions path
- **Rewrite:** `backend/src/services/llm/gemini.rs` — single `call_llm`, remove web search
- **Rewrite:** `backend/src/services/llm/anthropic.rs` — single `call_llm`
- **Modify:** `backend/src/services/llm/factory.rs` — update tests (remove `supports_web_search` assertions)
- **Modify:** `backend/src/services/synthesis.rs` — rename all call sites to `call_llm`
- **Modify:** `backend/src/services/source_scraper.rs` — rename call site
- **Modify:** `backend/src/handlers/api_keys.rs` — rename call site (key test)
## What Does NOT Change
- `Arc<dyn LlmProvider>` — factory still returns `Arc`
- `schema.rs` — all schema builders stay
- `prompts.rs` — all prompt builders stay
- Frontend — no changes
- Database — no changes
- LLM call logging — `log_llm_call` works the same (prompt + response + timing)