docs: add spec for Brave Search API integration in Phase 2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
master
oabrivard 3 months ago
parent 41109b3d93
commit 83c0828392

@@ -0,0 +1,106 @@
# Design: Brave Search API Integration for Phase 2
**Date**: 2026-03-25
**Scope**: Add Brave Search as an alternative to LLM web search in Phase 2
---
## Context
Phase 2 currently uses LLM web grounding (Gemini/OpenAI) to find articles for unfilled categories. The results are often imprecise. Brave Search API provides a high-quality web index that can produce more relevant results.
When enabled, Brave Search replaces the LLM web search. The returned URLs are then scraped and classified/summarized by the LLM — same as Phase 1's per-article flow.
Note: The app is French-only (i18n-ready but French for now), so search queries and language are hardcoded to French.
---
## 1. Settings & API Key
### New setting
`use_brave_search: bool` (default `false`) in the `settings` table. When enabled, Phase 2 uses Brave Search API instead of LLM web search.
### API key storage
Stored in `user_api_keys` with `provider_name = "brave_search"`. Reuses existing encrypted storage and CRUD endpoints.
### Frontend: Brave Search section in Settings
A dedicated section in `Settings.tsx` (separate from the LLM `ApiKeyManager` component) with:
- A Brave API key input (using the existing `user_api_keys` CRUD API with `provider_name = "brave_search"`)
- The `use_brave_search` toggle, disabled (grayed out) until a Brave API key is configured
- Key status display (configured/not configured, prefix)
The Brave key is NOT rendered via `ApiKeyManager` (which only renders admin-configured LLM providers). It gets its own standalone section.
### Auto-disable on key deletion
Handled on the frontend: after a successful `DELETE /api/v1/user/api-keys/brave_search`, the frontend also sends `PUT /api/v1/settings` with `use_brave_search: false` if the toggle was previously on.
### Backend validation
At generation time, if `use_brave_search` is true but no Brave key is found, return an error (same behavior as a missing LLM provider key).
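The check above could be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `SearchBackend`, `resolve_search_backend`, and the plain `String` error are hypothetical stand-ins (the real code would presumably return an `AppError`).

```rust
/// Which search path Phase 2 should take (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub enum SearchBackend {
    /// Use the Brave Search API with the decrypted key.
    Brave { api_key: String },
    /// Keep the existing LLM web search flow.
    LlmWebSearch,
}

/// Decides the Phase 2 search path from the setting and the optional stored key.
/// Returns an error rather than silently falling back when the key is missing.
pub fn resolve_search_backend(
    use_brave_search: bool,
    brave_key: Option<String>,
) -> Result<SearchBackend, String> {
    match (use_brave_search, brave_key) {
        (false, _) => Ok(SearchBackend::LlmWebSearch),
        (true, Some(key)) if !key.is_empty() => Ok(SearchBackend::Brave { api_key: key }),
        (true, _) => {
            Err("Brave Search is enabled but no Brave API key is configured".to_string())
        }
    }
}
```

Treating an empty key as missing is an extra assumption of this sketch; the spec only covers the case where no key is found.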
### Test endpoint
The existing `POST /api/v1/user/api-keys/:provider/test` handler must handle `brave_search` gracefully. Instead of creating an LLM provider (which would fail), it should call the Brave Search API with a simple test query and verify a 200 response.
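One way to express the branching described above is a small dispatch on the provider name. The names `KeyTestKind`, `key_test_kind`, and `brave_key_is_valid` are hypothetical; the sketch assumes the handler already has the `:provider` path segment and, for Brave, the HTTP status of the test query.

```rust
/// How the key-test handler should verify a stored key (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum KeyTestKind {
    /// Send a simple test query to the Brave Search API.
    BraveSearch,
    /// Fall through to the existing LLM provider test path.
    LlmProvider(String),
}

/// Picks the test strategy from the `:provider` path segment.
pub fn key_test_kind(provider: &str) -> KeyTestKind {
    match provider {
        "brave_search" => KeyTestKind::BraveSearch,
        other => KeyTestKind::LlmProvider(other.to_string()),
    }
}

/// Interprets the status of the Brave test query: only 200 counts as success.
pub fn brave_key_is_valid(status: u16) -> bool {
    status == 200
}
```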
---
## 2. Brave Search Service
### New file: `backend/src/services/brave_search.rs`
A standalone module that calls the Brave Search API.
**Function:** `search(http_client, api_key, query, count, freshness) -> Result<Vec<BraveResult>>`
**`BraveResult` struct:** `{ title: String, url: String, description: String }`
**API call:**
- `GET https://api.search.brave.com/res/v1/web/search`
- Header: `X-Subscription-Token: {api_key}`
- Query params: `q` (theme + "actualites"), `count` (20), `freshness` (mapped from `max_age_days`), `search_lang` ("fr")
- Parses `web.results` array from JSON response
- Returns up to 20 results
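The query parameters listed above could be assembled as shown below. This is a sketch only: `build_query_params` is a hypothetical helper, and the returned key/value pairs are in a shape that an HTTP client's query builder (for example reqwest's `.query(...)`, assuming the backend uses reqwest) can serialize.

```rust
/// Builds the query parameters for the Brave web search call.
/// `theme` is the category theme; `freshness` is one of "pd", "pw", "pm", "py".
pub fn build_query_params(
    theme: &str,
    count: u8,
    freshness: &str,
) -> Vec<(&'static str, String)> {
    vec![
        // The spec appends the French keyword "actualites" to the theme.
        ("q", format!("{theme} actualites")),
        // The spec asks for at most 20 results, so larger values are clamped.
        ("count", count.min(20).to_string()),
        ("freshness", freshness.to_string()),
        // The app is French-only for now, so the language is hardcoded.
        ("search_lang", "fr".to_string()),
    ]
}
```

Keeping parameter construction in a pure function makes it possible to unit-test the query without calling the network.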
**Freshness mapping from `max_age_days`:**
- `<= 1` → `"pd"` (past day)
- `<= 7` → `"pw"` (past week)
- `<= 30` → `"pm"` (past month)
- `> 30` → `"py"` (past year)
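The mapping above translates directly into a small pure function. The name `freshness_for` and the `u32` type for `max_age_days` are assumptions of this sketch; the spec does not state the setting's integer type.

```rust
/// Maps `max_age_days` to a Brave `freshness` code, following the list above.
pub fn freshness_for(max_age_days: u32) -> &'static str {
    match max_age_days {
        0..=1 => "pd",  // past day
        2..=7 => "pw",  // past week
        8..=30 => "pm", // past month
        _ => "py",      // past year
    }
}
```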
**Error handling:** On Brave API failure (network error, non-200 status, malformed response), return an `AppError::Internal` with a descriptive message. The generation fails with a clear error — no silent fallback to LLM search.
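As an illustration of the error handling above, the three failure modes can each be turned into a descriptive internal error. The `AppError` enum here is a stand-in that shows only the variant named in the spec, and `BraveFailure` is a hypothetical intermediate type, not something the codebase is known to define.

```rust
/// Stand-in for the project's error type; only the variant named in the spec is shown.
#[derive(Debug, PartialEq)]
pub enum AppError {
    Internal(String),
}

/// The three failure modes listed in the spec (hypothetical enum).
#[derive(Debug)]
pub enum BraveFailure {
    Network(String),
    Status(u16),
    MalformedResponse(String),
}

impl From<BraveFailure> for AppError {
    /// Every failure becomes `AppError::Internal` with a descriptive message,
    /// so the generation fails visibly instead of falling back to LLM search.
    fn from(failure: BraveFailure) -> Self {
        let message = match failure {
            BraveFailure::Network(detail) => {
                format!("Brave Search request failed: {detail}")
            }
            BraveFailure::Status(code) => {
                format!("Brave Search API returned HTTP {code}")
            }
            BraveFailure::MalformedResponse(detail) => {
                format!("Brave Search response could not be parsed: {detail}")
            }
        };
        AppError::Internal(message)
    }
}
```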
---
## 3. Phase 2 Pipeline Change
When `use_brave_search` is true:
1. **Decrypt Brave API key** from `user_api_keys` where `provider_name = "brave_search"`
2. **Call Brave Search** — query: `"{theme} actualites"`, count: 20, freshness based on `max_age_days`
3. **Filter** — same as current Phase 2: homepage URL filter, cross-phase dedup (`seen_urls`), article history dedup, source diversity limit. Source type for tracing: `"brave_search"`.
4. **Scrape + LLM classify/summarize** — reuse the same batched loop as Phase 1 (respects `batch_size` setting, parallel scrape via `JoinSet`, parallel LLM classify, `filled_counts` tracking, category overflow to "Autre", `max_total` cap). The LLM rate limiter applies to classify calls (not to the Brave API call itself).
5. **Merge** results into `article_scraped` and update `filled_counts`
When `use_brave_search` is false: the existing LLM web search flow is unchanged.
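The filter in step 3 might look roughly like the sketch below. It is a simplified illustration using only the standard library: `filter_candidates` and `host_and_path` are hypothetical names, the URL handling is deliberately naive (a real implementation would likely use a URL parser), and the `max_per_source` parameter stands in for the existing source diversity limit, whose actual form is not given in this spec.

```rust
use std::collections::{HashMap, HashSet};

/// Splits an http(s) URL into (host, path). Naive on purpose: no query or port handling.
fn host_and_path(url: &str) -> Option<(&str, &str)> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    match rest.find('/') {
        Some(i) => Some((&rest[..i], &rest[i..])),
        None => Some((rest, "/")),
    }
}

/// Keeps URLs that are not homepages, not already seen in this run,
/// not in the article history, and within the per-source limit.
pub fn filter_candidates(
    urls: Vec<String>,
    seen_urls: &mut HashSet<String>,
    history: &HashSet<String>,
    max_per_source: usize,
) -> Vec<String> {
    let mut per_host: HashMap<String, usize> = HashMap::new();
    let mut kept = Vec::new();
    for url in urls {
        let (host, path) = match host_and_path(&url) {
            Some(parts) => parts,
            None => continue, // not an http(s) URL
        };
        if path == "/" {
            continue; // homepage URL filter
        }
        if seen_urls.contains(&url) || history.contains(&url) {
            continue; // cross-phase dedup and article history dedup
        }
        let count = per_host.entry(host.to_string()).or_insert(0);
        if *count >= max_per_source {
            continue; // source diversity limit
        }
        *count += 1;
        seen_urls.insert(url.clone());
        kept.push(url);
    }
    kept
}
```

The surviving URLs would then be handed to the shared scrape and classify loop with the source type `"brave_search"`.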
**Code reuse strategy:** The Phase 1 batched scrape+classify loop (lines ~400-550 of `synthesis.rs`) should be extracted into a shared helper function that both Phase 1 and the Brave Phase 2 path can call, rather than duplicating the logic.
---
## 4. Files to Modify
- **Create:** `backend/migrations/20260325000022_add_brave_search_setting.sql`
- **Create:** `backend/src/services/brave_search.rs` — Brave Search API client + `BraveResult` struct + test function
- **Modify:** `backend/src/services/mod.rs` — add `pub mod brave_search`
- **Modify:** `backend/src/models/settings.rs` — add `use_brave_search` to `UserSettings`, `SettingsResponse`, `UpdateSettingsRequest`, `Default`, validation
- **Modify:** `backend/src/db/settings.rs` — add `use_brave_search` to `SettingsRow`, queries, binds
- **Modify:** `backend/src/services/synthesis.rs` — extract shared scrape+classify helper; Phase 2 branch: if `use_brave_search`, decrypt Brave key, call Brave, filter, run shared helper; else existing LLM search
- **Modify:** `backend/src/handlers/api_keys.rs` — handle `brave_search` in the test endpoint (call Brave API instead of LLM provider)
- **Modify:** `frontend/src/types.ts` — add `use_brave_search: boolean` to `UserSettings` and `DEFAULT_SETTINGS`
- **Modify:** `frontend/src/pages/Settings.tsx` — add standalone Brave Search section with key input + toggle
- **Modify:** `frontend/src/i18n/fr.ts` — labels for toggle, key section, and Brave-specific strings
- **Modify:** `CLAUDE.md` — migration count