docs: add spec for model split — scraping vs websearch
parent 37bc849f92
commit 03d4cfb773
@@ -0,0 +1,80 @@
# Design: Split Model Selection — Scraping vs Web Search
**Date**: 2026-03-25
**Scope**: Rename `ai_model_writing` to `ai_model_websearch`; split admin provider models into scraping and websearch lists

---
## Context

Currently, both model dropdowns (research and writing) are populated from the same model list. The pipeline, however, uses models for two different purposes: cheap, fast models for scraping and classification (Phase 1), and more capable models for web search (Phase 2). Each dropdown should offer only the models suited to its role.
## Changes

### 1. Rename setting column

```sql
ALTER TABLE settings RENAME COLUMN ai_model_writing TO ai_model_websearch;
```

Backend: rename `ai_model_writing` → `ai_model_websearch` everywhere.
### 2. Split admin_providers models JSONB

Rename `models` → `models_scraping`. Add `models_websearch`.

Migration updates existing provider data:

**OpenAI:**
```json
{
  "models_scraping": [
    {"model_id": "gpt-4o-mini", "display_name": "GPT-4o Mini", "is_default": true},
    {"model_id": "gpt-4.1-mini", "display_name": "GPT-4.1 Mini", "is_default": false},
    {"model_id": "gpt-4.1-nano", "display_name": "GPT-4.1 Nano", "is_default": false}
  ],
  "models_websearch": [
    {"model_id": "gpt-4o", "display_name": "GPT-4o", "is_default": true},
    {"model_id": "gpt-4o-mini", "display_name": "GPT-4o Mini", "is_default": false},
    {"model_id": "gpt-4.1", "display_name": "GPT-4.1", "is_default": false},
    {"model_id": "gpt-4.1-mini", "display_name": "GPT-4.1 Mini", "is_default": false}
  ]
}
```

**Gemini:** both arrays get the same models (Gemini 2.5 Pro + Flash).

**Anthropic:** both arrays get the same models (Claude Sonnet 4 + Haiku 3.5).
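A minimal sketch of the resulting data shape, written in TypeScript for illustration. The `ProviderModels` type and the `defaultModel` helper are hypothetical names, and the "exactly one default per list" rule is an assumption inferred from the OpenAI example above, not something this spec states.

```typescript
// One entry in either list, mirroring the JSON above.
interface ProviderModel {
  model_id: string;
  display_name: string;
  is_default: boolean;
}

// Hypothetical container type: one list per pipeline role.
interface ProviderModels {
  models_scraping: ProviderModel[];
  models_websearch: ProviderModel[];
}

// Assumption: each list carries exactly one default entry.
// Returns that entry, or throws if the list has zero or several defaults.
function defaultModel(list: ProviderModel[]): ProviderModel {
  const defaults = list.filter((m) => m.is_default);
  const first = defaults[0];
  if (defaults.length !== 1 || first === undefined) {
    throw new Error(`expected exactly one default model, found ${defaults.length}`);
  }
  return first;
}

// The OpenAI data from the migration, as listed in this spec.
const openai: ProviderModels = {
  models_scraping: [
    { model_id: "gpt-4o-mini", display_name: "GPT-4o Mini", is_default: true },
    { model_id: "gpt-4.1-mini", display_name: "GPT-4.1 Mini", is_default: false },
    { model_id: "gpt-4.1-nano", display_name: "GPT-4.1 Nano", is_default: false },
  ],
  models_websearch: [
    { model_id: "gpt-4o", display_name: "GPT-4o", is_default: true },
    { model_id: "gpt-4o-mini", display_name: "GPT-4o Mini", is_default: false },
    { model_id: "gpt-4.1", display_name: "GPT-4.1", is_default: false },
    { model_id: "gpt-4.1-mini", display_name: "GPT-4.1 Mini", is_default: false },
  ],
};
```

A check like `defaultModel` could run in a migration test to catch a list that ends up with no default after the split.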
### 3. Pipeline model usage

- `ai_model` → used for Phase 1 (source scraping, per-article classification and summarization, link extraction)
- `ai_model_websearch` → used for Phase 2 (web search fallback LLM call)
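The mapping above can be sketched as a single lookup. The real logic lives in the Rust backend (`backend/src/services/synthesis.rs`); this TypeScript version is only an illustration, and `Phase`, `PipelineSettings`, and `modelForPhase` are hypothetical names.

```typescript
// Hypothetical phase labels: Phase 1 is scraping, Phase 2 is web search.
type Phase = "scraping" | "websearch";

// Only the two settings fields named in this spec.
interface PipelineSettings {
  ai_model: string;
  ai_model_websearch: string;
}

// Picks the configured model for a pipeline phase.
function modelForPhase(settings: PipelineSettings, phase: Phase): string {
  switch (phase) {
    case "scraping":
      // Source scraping, per-article classification/summarization, link extraction.
      return settings.ai_model;
    case "websearch":
      // Web search fallback LLM call.
      return settings.ai_model_websearch;
  }
}
```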
### 4. Config endpoint

`GET /api/v1/config/providers` returns both model arrays per provider so the frontend can populate the correct dropdown.
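A possible client-side shape for one provider in that response, with a runtime guard that rejects payloads still using the old single `models` key. This is a sketch under assumptions: the `name` field and the `ProviderConfig` / `isProviderConfig` names are hypothetical, and the spec does not define the rest of the response envelope.

```typescript
interface ProviderModel {
  model_id: string;
  display_name: string;
  is_default: boolean;
}

interface ProviderConfig {
  name: string; // hypothetical identifier field, not defined in this spec
  models_scraping: ProviderModel[];
  models_websearch: ProviderModel[];
}

// Checks that a value has the three fields of a model entry.
function isProviderModel(v: unknown): v is ProviderModel {
  if (typeof v !== "object" || v === null) return false;
  const m = v as Record<string, unknown>;
  return (
    typeof m["model_id"] === "string" &&
    typeof m["display_name"] === "string" &&
    typeof m["is_default"] === "boolean"
  );
}

// Checks that a provider carries both model arrays.
function isProviderConfig(v: unknown): v is ProviderConfig {
  if (typeof v !== "object" || v === null) return false;
  const p = v as Record<string, unknown>;
  const scraping = p["models_scraping"];
  const websearch = p["models_websearch"];
  return (
    typeof p["name"] === "string" &&
    Array.isArray(scraping) &&
    scraping.every(isProviderModel) &&
    Array.isArray(websearch) &&
    websearch.every(isProviderModel)
  );
}
```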
### 5. Frontend

- Rename the "Modele d'ecriture" ("Writing model") dropdown label → "Modele de recherche web" ("Web search model")
- Populate the `ai_model` dropdown from `models_scraping`
- Populate the `ai_model_websearch` dropdown from `models_websearch`
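The dropdown wiring above could look roughly like this. It is a framework-free sketch, not the actual `Settings.tsx` code; `MODEL_SOURCE`, `optionsFor`, and the `{ value, label }` option shape are assumptions.

```typescript
interface ProviderModel {
  model_id: string;
  display_name: string;
  is_default: boolean;
}

interface ProviderConfig {
  models_scraping: ProviderModel[];
  models_websearch: ProviderModel[];
}

type SettingField = "ai_model" | "ai_model_websearch";

// Which provider list feeds which settings dropdown.
const MODEL_SOURCE: Record<SettingField, keyof ProviderConfig> = {
  ai_model: "models_scraping",
  ai_model_websearch: "models_websearch",
};

// Builds the option list for one dropdown from the matching provider list.
function optionsFor(
  provider: ProviderConfig,
  field: SettingField,
): { value: string; label: string }[] {
  return provider[MODEL_SOURCE[field]].map((m) => ({
    value: m.model_id,
    label: m.display_name,
  }));
}

// Trimmed sample data taken from the OpenAI lists in this spec.
const sample: ProviderConfig = {
  models_scraping: [
    { model_id: "gpt-4o-mini", display_name: "GPT-4o Mini", is_default: true },
    { model_id: "gpt-4.1-nano", display_name: "GPT-4.1 Nano", is_default: false },
  ],
  models_websearch: [
    { model_id: "gpt-4o", display_name: "GPT-4o", is_default: true },
    { model_id: "gpt-4.1", display_name: "GPT-4.1", is_default: false },
  ],
};
```

Keeping the field-to-list mapping in one table means a future third role only needs one new entry rather than another branch in the component.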
## Files to Modify

- **Create:** migration that renames the column and updates the JSONB (rename `models` → `models_scraping`, add `models_websearch`)
- **Modify:** `backend/src/models/settings.rs`: rename `ai_model_writing` → `ai_model_websearch`
- **Modify:** `backend/src/db/settings.rs`: rename the column in queries
- **Modify:** `backend/src/models/provider.rs`: add `models_websearch` to the `AdminProvider` / `ProviderModel` structs
- **Modify:** `backend/src/db/providers.rs`: update queries for the new JSONB structure
- **Modify:** `backend/src/handlers/config.rs`: return both model arrays
- **Modify:** `backend/src/handlers/admin.rs`: handle both model arrays in CRUD
- **Modify:** `backend/src/services/synthesis.rs`: rename `model_writing` → `model_websearch`, use the correct model per phase
- **Modify:** `backend/src/services/prompts.rs`: update the test fixture
- **Modify:** `CLAUDE.md`: update the migration count
- **Modify:** `frontend/src/types.ts`: rename the field and add `models_websearch` to the provider config type
- **Modify:** `frontend/src/pages/Settings.tsx`: rename the dropdown, split the model sources
- **Modify:** `frontend/src/i18n/fr.ts`: update the label
- **Modify:** `e2e/tests/generation-live.spec.ts`: update the settings payload
- **Modify:** `backend/tests/api_syntheses_test.rs`: update the settings payload