docs: add spec for model split — scraping vs websearch
parent 37bc849f92
commit 03d4cfb773
@@ -0,0 +1,80 @@
# Design: Split Model Selection — Scraping vs Web Search

**Date**: 2026-03-25
**Scope**: Rename `ai_model_writing` to `ai_model_websearch`; split admin provider models into scraping and websearch lists

---

## Context

Currently, both model dropdowns (research and writing) show the same model list. The pipeline, however, uses different models for different purposes: cheap, fast models for scraping and classification (Phase 1), and more capable models for web search (Phase 2). The model lists should reflect these different roles.

## Changes

### 1. Rename setting column

```sql
ALTER TABLE settings RENAME COLUMN ai_model_writing TO ai_model_websearch;
```

Backend: rename `ai_model_writing` → `ai_model_websearch` everywhere.

### 2. Split `admin_providers` models JSONB

Rename `models` → `models_scraping`. Add `models_websearch`.

Migration updates existing provider data:

**OpenAI:**
```json
{
  "models_scraping": [
    {"model_id": "gpt-4o-mini", "display_name": "GPT-4o Mini", "is_default": true},
    {"model_id": "gpt-4.1-mini", "display_name": "GPT-4.1 Mini", "is_default": false},
    {"model_id": "gpt-4.1-nano", "display_name": "GPT-4.1 Nano", "is_default": false}
  ],
  "models_websearch": [
    {"model_id": "gpt-4o", "display_name": "GPT-4o", "is_default": true},
    {"model_id": "gpt-4o-mini", "display_name": "GPT-4o Mini", "is_default": false},
    {"model_id": "gpt-4.1", "display_name": "GPT-4.1", "is_default": false},
    {"model_id": "gpt-4.1-mini", "display_name": "GPT-4.1 Mini", "is_default": false}
  ]
}
```

**Gemini:** both arrays get the same models (Gemini 2.5 Pro + Flash).

**Anthropic:** both arrays get the same models (Claude Sonnet 4 + Haiku 3.5).

### 3. Pipeline model usage

- `ai_model` → used for Phase 1 (source scraping, per-article classify/summarize, link extraction)
- `ai_model_websearch` → used for Phase 2 (web search fallback LLM call)
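
The mapping above can be sketched as a small lookup. This is only a model of the logic, written in TypeScript for brevity; the actual selection would live in the Rust backend (`backend/src/services/synthesis.rs`). The `PipelinePhase` type, the `ModelSettings` interface, and the `modelForPhase` helper are hypothetical names, not identifiers taken from the codebase.

```typescript
// Hypothetical names throughout; only the `ai_model` and `ai_model_websearch`
// setting names come from the spec.
type PipelinePhase = "phase1_scraping" | "phase2_websearch";

interface ModelSettings {
  ai_model: string; // Phase 1: source scraping, classify/summarize, link extraction
  ai_model_websearch: string; // Phase 2: web search fallback LLM call
}

// Return the configured model for the given pipeline phase.
function modelForPhase(settings: ModelSettings, phase: PipelinePhase): string {
  switch (phase) {
    case "phase1_scraping":
      return settings.ai_model;
    case "phase2_websearch":
      return settings.ai_model_websearch;
  }
}

// Example values taken from the OpenAI defaults listed in section 2.
const exampleSettings: ModelSettings = {
  ai_model: "gpt-4o-mini",
  ai_model_websearch: "gpt-4o",
};
```

Keeping the choice in one function means a phase can never silently pick up the other phase's model, which is the mistake the rename is meant to make harder.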

### 4. Config endpoint

`GET /api/v1/config/providers` returns both model arrays (`models_scraping` and `models_websearch`) for each provider, so the frontend can populate each dropdown from the matching list.
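
One possible shape for a single provider entry in that response is sketched below. The spec does not define the full payload, so treat this as an assumption: the `model_id`, `display_name`, and `is_default` fields and the two array keys come from the JSON in section 2, while the `name` field, the `ModelRole` type, and the `modelsForRole` helper are hypothetical.

```typescript
// Fields of ProviderModel mirror the JSON entries shown in section 2.
interface ProviderModel {
  model_id: string;
  display_name: string;
  is_default: boolean;
}

// ASSUMPTION: `name` is a hypothetical provider identifier; the real
// response may identify providers differently or carry more fields.
interface ProviderConfig {
  name: string;
  models_scraping: ProviderModel[];
  models_websearch: ProviderModel[];
}

type ModelRole = "scraping" | "websearch";

// Pick the array that matches the role a dropdown is responsible for.
function modelsForRole(provider: ProviderConfig, role: ModelRole): ProviderModel[] {
  return role === "scraping" ? provider.models_scraping : provider.models_websearch;
}

// Sample entry built from the OpenAI lists in section 2.
const openaiProvider: ProviderConfig = {
  name: "openai",
  models_scraping: [
    { model_id: "gpt-4o-mini", display_name: "GPT-4o Mini", is_default: true },
    { model_id: "gpt-4.1-mini", display_name: "GPT-4.1 Mini", is_default: false },
    { model_id: "gpt-4.1-nano", display_name: "GPT-4.1 Nano", is_default: false },
  ],
  models_websearch: [
    { model_id: "gpt-4o", display_name: "GPT-4o", is_default: true },
    { model_id: "gpt-4o-mini", display_name: "GPT-4o Mini", is_default: false },
    { model_id: "gpt-4.1", display_name: "GPT-4.1", is_default: false },
    { model_id: "gpt-4.1-mini", display_name: "GPT-4.1 Mini", is_default: false },
  ],
};
```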

### 5. Frontend

- Rename the "Modele d'ecriture" ("Writing model") dropdown label → "Modele de recherche web" ("Web search model")
- `ai_model` dropdown populated from `models_scraping`
- `ai_model_websearch` dropdown populated from `models_websearch`
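
A minimal sketch of how each dropdown might be fed from its own list follows. The `ModelEntry` fields come from the JSON in section 2; `DropdownOption`, `toOptions`, and `resolveSelection` are hypothetical helpers. The fallback rule (keep the saved value if its list still offers it, otherwise use the list's default) is an assumption that the spec does not state. It matters because a previously saved `ai_model` such as `gpt-4o` would no longer appear in `models_scraping` after the split.

```typescript
// Mirrors one entry of `models_scraping` / `models_websearch`.
interface ModelEntry {
  model_id: string;
  display_name: string;
  is_default: boolean;
}

// Hypothetical option shape for a <select>-style dropdown.
interface DropdownOption {
  value: string;
  label: string;
}

// Convert a provider's model list into dropdown options, preserving order.
function toOptions(models: ModelEntry[]): DropdownOption[] {
  return models.map((m) => ({ value: m.model_id, label: m.display_name }));
}

// ASSUMPTION: keep the saved model when the list still contains it;
// otherwise fall back to the entry flagged is_default, then the first entry.
function resolveSelection(models: ModelEntry[], saved: string | undefined): string | undefined {
  if (saved !== undefined && models.some((m) => m.model_id === saved)) {
    return saved;
  }
  const fallback = models.find((m) => m.is_default) ?? models[0];
  return fallback?.model_id;
}

// Sample data: the OpenAI scraping list from section 2.
const scrapingModels: ModelEntry[] = [
  { model_id: "gpt-4o-mini", display_name: "GPT-4o Mini", is_default: true },
  { model_id: "gpt-4.1-mini", display_name: "GPT-4.1 Mini", is_default: false },
  { model_id: "gpt-4.1-nano", display_name: "GPT-4.1 Nano", is_default: false },
];
```

With this rule, a saved value that is missing from the new list resolves to that list's default instead of leaving the dropdown empty; whether the UI should instead flag the stale value is a separate decision.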

## Files to Modify

- **Create:** migration — rename column + update JSONB (rename `models` → `models_scraping`, add `models_websearch`)
- **Modify:** `backend/src/models/settings.rs` — rename `ai_model_writing` → `ai_model_websearch`
- **Modify:** `backend/src/db/settings.rs` — rename in queries
- **Modify:** `backend/src/models/provider.rs` — add `models_websearch` to `AdminProvider` / `ProviderModel` structs
- **Modify:** `backend/src/db/providers.rs` — update queries for new JSONB structure
- **Modify:** `backend/src/handlers/config.rs` — return both model arrays
- **Modify:** `backend/src/handlers/admin.rs` — handle both model arrays in CRUD
- **Modify:** `backend/src/services/synthesis.rs` — rename `model_writing` → `model_websearch`, use correct model per phase
- **Modify:** `backend/src/services/prompts.rs` — update test fixture
- **Modify:** `CLAUDE.md` — migration count
- **Modify:** `frontend/src/types.ts` — rename field + add models_websearch to provider config type
- **Modify:** `frontend/src/pages/Settings.tsx` — rename dropdown, split model sources
- **Modify:** `frontend/src/i18n/fr.ts` — update label
- **Modify:** `e2e/tests/generation-live.spec.ts` — update settings payload
- **Modify:** `backend/tests/api_syntheses_test.rs` — update settings payload