diff --git a/docs/superpowers/specs/2026-03-25-brave-search-design.md b/docs/superpowers/specs/2026-03-25-brave-search-design.md
new file mode 100644
index 0000000..9db9da5
--- /dev/null
+++ b/docs/superpowers/specs/2026-03-25-brave-search-design.md
@@ -0,0 +1,106 @@
+# Design: Brave Search API Integration for Phase 2
+
+**Date**: 2026-03-25
+**Scope**: Add Brave Search as an alternative to LLM web search in Phase 2
+
+---
+
+## Context
+
+Phase 2 currently uses LLM web grounding (Gemini/OpenAI) to find articles for unfilled categories. The results are often imprecise. Brave Search API provides a high-quality web index that can produce more relevant results.
+
+When enabled, Brave Search replaces the LLM web search. The returned URLs are then scraped and classified/summarized by the LLM, the same as Phase 1's per-article flow.
+
+Note: The app is French-only (i18n-ready but French for now), so search queries and language are hardcoded to French.
+
+---
+
+## 1. Settings & API Key
+
+### New setting
+
+`use_brave_search: bool` (default `false`) in the `settings` table. When enabled, Phase 2 uses Brave Search API instead of LLM web search.
+
+### API key storage
+
+Stored in `user_api_keys` with `provider_name = "brave_search"`. Reuses existing encrypted storage and CRUD endpoints.
+
+### Frontend: Brave Search section in Settings
+
+A dedicated section in `Settings.tsx` (separate from the LLM `ApiKeyManager` component) with:
+- A Brave API key input (using the existing `user_api_keys` CRUD API with `provider_name = "brave_search"`)
+- The `use_brave_search` toggle, disabled (grayed out) until a Brave API key is configured
+- Key status display (configured/not configured, prefix)
+
+The Brave key is NOT rendered via `ApiKeyManager` (which only renders admin-configured LLM providers). It gets its own standalone section.
+
+### Auto-disable on key deletion
+
+Frontend-side: after a successful `DELETE /api/v1/user/api-keys/brave_search`, the frontend also sends a `PUT /api/v1/settings` with `use_brave_search: false` if it was previously on.
+
+### Backend validation
+
+At generation time, if `use_brave_search` is true but no Brave key is found, return an error (same behavior as a missing LLM provider key).
+
+### Test endpoint
+
+The existing `POST /api/v1/user/api-keys/:provider/test` handler must handle `brave_search` gracefully. Instead of creating an LLM provider (which would fail), it should call the Brave Search API with a simple test query and verify a 200 response.
+
+---
+
+## 2. Brave Search Service
+
+### New file: `backend/src/services/brave_search.rs`
+
+A standalone module that calls the Brave Search API.
+
+**Function:** `search(http_client, api_key, query, count, freshness) -> Result<Vec<BraveResult>>`
+
+**`BraveResult` struct:** `{ title: String, url: String, description: String }`
+
+**API call:**
+- `GET https://api.search.brave.com/res/v1/web/search`
+- Header: `X-Subscription-Token: {api_key}`
+- Query params: `q` (theme + "actualites"), `count` (20), `freshness` (mapped from `max_age_days`), `search_lang` ("fr")
+- Parses `web.results` array from JSON response
+- Returns up to 20 results
+
+**Freshness mapping from `max_age_days`:**
+- `<= 1` → `"pd"` (past day)
+- `<= 7` → `"pw"` (past week)
+- `<= 30` → `"pm"` (past month)
+- `> 30` → `"py"` (past year)
+
+**Error handling:** On Brave API failure (network error, non-200 status, malformed response), return an `AppError::Internal` with a descriptive message. The generation fails with a clear error; there is no silent fallback to LLM search.
+
+---
+
+## 3. Phase 2 Pipeline Change
+
+When `use_brave_search` is true:
+
+1. **Decrypt Brave API key** from `user_api_keys` where `provider_name = "brave_search"`
+2. **Call Brave Search**: query `"{theme} actualites"`, count 20, freshness based on `max_age_days`
+3. **Filter**: same as current Phase 2 (homepage URL filter, cross-phase dedup via `seen_urls`, article history dedup, source diversity limit). Source type for tracing: `"brave_search"`.
+4. **Scrape + LLM classify/summarize**: reuse the same batched loop as Phase 1 (respects `batch_size` setting, parallel scrape via `JoinSet`, parallel LLM classify, `filled_counts` tracking, category overflow to "Autre", `max_total` cap). The LLM rate limiter applies to classify calls (not to the Brave API call itself).
+5. **Merge** results into `article_scraped` and update `filled_counts`
+
+When `use_brave_search` is false: the existing LLM web search flow is unchanged.
+
+**Code reuse strategy:** The Phase 1 batched scrape+classify loop (lines ~400-550 of `synthesis.rs`) should be extracted into a shared helper function that both Phase 1 and the Brave Phase 2 path can call, rather than duplicating the logic.
+
+---
+
+## 4. Files to Modify
+
+- **Create:** `backend/migrations/20260325000022_add_brave_search_setting.sql`
+- **Create:** `backend/src/services/brave_search.rs`: Brave Search API client + `BraveResult` struct + test function
+- **Modify:** `backend/src/services/mod.rs`: add `pub mod brave_search`
+- **Modify:** `backend/src/models/settings.rs`: add `use_brave_search` to `UserSettings`, `SettingsResponse`, `UpdateSettingsRequest`, `Default`, validation
+- **Modify:** `backend/src/db/settings.rs`: add `use_brave_search` to `SettingsRow`, queries, binds
+- **Modify:** `backend/src/services/synthesis.rs`: extract shared scrape+classify helper; Phase 2 branch: if `use_brave_search`, decrypt Brave key, call Brave, filter, run shared helper; else existing LLM search
+- **Modify:** `backend/src/handlers/api_keys.rs`: handle `brave_search` in the test endpoint (call Brave API instead of LLM provider)
+- **Modify:** `frontend/src/types.ts`: add `use_brave_search: boolean` to `UserSettings` and `DEFAULT_SETTINGS`
+- **Modify:** `frontend/src/pages/Settings.tsx`: add standalone Brave Search section with key input + toggle
+- **Modify:** `frontend/src/i18n/fr.ts`: labels for toggle, key section, and Brave-specific strings
+- **Modify:** `CLAUDE.md`: migration count
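
The freshness mapping and query parameters described in section 2 could be sketched roughly as below. This is only an illustration under assumptions, not the code from this change: the function names `freshness_for` and `brave_query_params` are hypothetical, `max_age_days` is assumed to be a `u32` (the spec does not state its type), and the snippet uses only the standard library, so the HTTP call and JSON parsing are left out.

```rust
/// Maps `max_age_days` to a Brave `freshness` code using the thresholds
/// listed in section 2. Hypothetical helper; the `u32` type is an assumption.
fn freshness_for(max_age_days: u32) -> &'static str {
    match max_age_days {
        0..=1 => "pd",  // past day
        2..=7 => "pw",  // past week
        8..=30 => "pm", // past month
        _ => "py",      // past year
    }
}

/// Builds the query parameters from the "API call" list in section 2.
/// Hypothetical helper; the real `search` function takes `query`, `count`
/// and `freshness` as separate arguments.
fn brave_query_params(theme: &str, max_age_days: u32) -> Vec<(&'static str, String)> {
    vec![
        // The query is the theme followed by the hardcoded French keyword.
        ("q", format!("{theme} actualites")),
        ("count", "20".to_string()),
        ("freshness", freshness_for(max_age_days).to_string()),
        ("search_lang", "fr".to_string()),
    ]
}
```

For example, `freshness_for(7)` returns `"pw"` and `freshness_for(31)` returns `"py"`. Returning the parameters as key/value pairs keeps the mapping testable without a network call; a real client would pass them to whatever HTTP library the backend already uses.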