You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
118 lines
5.1 KiB
Markdown
118 lines
5.1 KiB
Markdown
# Design: Structural Refactoring (Audit Sub-project B)
|
|
|
|
**Date**: 2026-03-26
|
|
**Scope**: 5 structural refactoring items from the code audit
|
|
**Source**: `docs/audit/00-consolidated-summary.md`
|
|
|
|
---
|
|
|
|
## 1. Decompose `synthesis.rs`
|
|
|
|
### Problem
|
|
`run_generation_inner` is ~650 lines with the scrape+classify batch loop duplicated 3 times (Phase 1, Phase 2 Brave, Phase 2 LLM) and filtering logic duplicated across paths.
|
|
|
|
### Refactoring
|
|
Extract private helper functions within `synthesis.rs`:
|
|
|
|
- **`process_article_batch()`** — takes a batch of URLs, scrapes in parallel (JoinSet), classifies in parallel (JoinSet), updates `filled_counts` / `article_scraped` / `source_counts`. Used by Phase 1 and Phase 2 Brave paths.
|
|
- **`filter_candidate_url()`** — homepage check, cross-phase dedup, history dedup, source diversity check. Returns whether the URL passes. Used by Phase 2 LLM and Phase 2 Brave paths.
|
|
- **`assign_category()`** — maps LLM classification response to category key, handles overflow to "Autre". Used in every classify result handler.
|
|
|
|
No new files — these are private helpers within the pipeline context.
|
|
|
|
### Expected result
|
|
`run_generation_inner` shrinks from ~650 to ~300 lines. No behavioral change.
|
|
|
|
### Files
|
|
- **Modify:** `backend/src/services/synthesis.rs`
|
|
|
|
---
|
|
|
|
## 2. Eliminate `SettingsResponse` struct
|
|
|
|
### Problem
|
|
Every new setting requires synchronized changes to 4 Rust structs (`UserSettings`, `SettingsResponse`, `UpdateSettingsRequest`, `SettingsRow`). `SettingsResponse` is identical to `UserSettings` minus `user_id` and `updated_at`.
|
|
|
|
### Refactoring
|
|
- Add `#[serde(skip_serializing)]` to `user_id` and `updated_at` fields on `UserSettings`
|
|
- Delete `SettingsResponse` struct and its `From<UserSettings>` impl
|
|
- Update the settings GET handler to return `Json(settings)` directly
|
|
|
|
### Expected result
|
|
Future settings changes need 3 structs instead of 4.
|
|
|
|
### Files
|
|
- **Modify:** `backend/src/models/settings.rs` — remove `SettingsResponse`, add skip attributes
|
|
- **Modify:** `backend/src/handlers/settings.rs` — return `UserSettings` directly
|
|
|
|
---
|
|
|
|
## 3. Decompose `Settings.tsx`
|
|
|
|
### Problem
|
|
1,025-line component mixing 7+ concerns.
|
|
|
|
### Refactoring
|
|
Extract sub-components that receive `settings()` accessor + `setSettings()` setter as props:
|
|
|
|
- **`SettingsBraveSearch`** — Brave key input + toggle + auto-disable on delete
|
|
- **`SettingsRateLimit`** — rate limit fields + reset button
|
|
- **`SettingsAdvanced`** — `use_llm_for_source_links`, `article_history_days`, `batch_size`, search behavior
|
|
|
|
The general settings (theme, categories, max_age_days, max_items_per_category, max_articles_per_source) and provider/model selection remain in the main `Settings.tsx` — they are the core of the page and don't benefit from extraction. The goal is to extract the self-contained sections, not to empty the parent.
|
|
|
|
The main `Settings.tsx` remains the container: loads data, holds the settings signal, renders sub-components, owns the save button.
|
|
|
|
### Files
|
|
- **Create:** `frontend/src/components/settings/SettingsBraveSearch.tsx`
|
|
- **Create:** `frontend/src/components/settings/SettingsRateLimit.tsx`
|
|
- **Create:** `frontend/src/components/settings/SettingsAdvanced.tsx`
|
|
- **Modify:** `frontend/src/pages/Settings.tsx` — import and render sub-components
|
|
|
|
---
|
|
|
|
## 4. Extract shared LLM error mapping
|
|
|
|
### Problem
|
|
`map_openai_error`, `map_gemini_error`, `map_anthropic_error` are near-identical functions mapping HTTP status codes to `AppError`.
|
|
|
|
### Refactoring
|
|
- Add `pub fn map_provider_http_error(status: u16, body: &serde_json::Value, provider_name: &str) -> AppError` in `backend/src/services/llm/mod.rs`
|
|
- The shared function handles the common status codes: 401 → `BadRequest("Invalid API key")`, 429 → `RateLimited(...)`, 500+ → `Internal(...)`, and a default fallback
|
|
- Provider-specific status codes (e.g. Anthropic's 529 for overloaded) are handled as pre-processing before calling the shared mapper, or the shared function includes them (529 → `RateLimited` is reasonable for all providers)
|
|
- Each provider's `map_*_error` function is replaced with a call to the shared function, with any provider-specific JSON error extraction done inline before the call
|
|
|
|
### Files
|
|
- **Modify:** `backend/src/services/llm/mod.rs` — add shared function
|
|
- **Modify:** `backend/src/services/llm/openai.rs` — use shared mapper
|
|
- **Modify:** `backend/src/services/llm/gemini.rs` — use shared mapper
|
|
- **Modify:** `backend/src/services/llm/anthropic.rs` — use shared mapper
|
|
|
|
---
|
|
|
|
## 5. Replace `trace_article` 11-parameter function with struct
|
|
|
|
### Problem
|
|
`trace_article` has 11 positional parameters, most `Option`. Call sites are hard to read.
|
|
|
|
### Refactoring
|
|
Create an `ArticleTrace` struct:
|
|
```rust
|
|
struct ArticleTrace<'a> {
|
|
url: &'a str,
|
|
title: &'a str,
|
|
source_type: &'a str,
|
|
source_url: Option<&'a str>,
|
|
category: Option<&'a str>,
|
|
synthesis_id: Option<Uuid>,
|
|
status: &'a str,
|
|
scraped_ok: bool,
|
|
}
|
|
```
|
|
|
|
Change `trace_article` signature to `(pool, user_id, job_id, trace: &ArticleTrace)`.
|
|
Update all call sites to use struct literal syntax.
|
|
|
|
### Files
|
|
- **Modify:** `backend/src/services/synthesis.rs`
|