# Structural Refactoring — Implementation Plan > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. **Goal:** 5 structural refactoring items from the code audit: decompose synthesis.rs, eliminate SettingsResponse, decompose Settings.tsx, extract shared LLM error mapping, replace trace_article with struct. **Architecture:** Pure refactoring — no behavioral changes. Each task is independently committable. Tasks 1 and 5 both modify `synthesis.rs` so Task 5 should run after Task 1. **Tech Stack:** Rust (Axum, sqlx), SolidJS, TypeScript **Spec:** `docs/superpowers/specs/2026-03-26-structural-refactoring-design.md` --- ### Task 1: Extract shared helpers from `synthesis.rs` **Files:** - Modify: `backend/src/services/synthesis.rs` This is the highest-impact refactoring. The `run_generation_inner` function (~650 lines) has the scrape+classify batch loop duplicated in Phase 1 (lines ~390-540) and Phase 2 Brave (lines ~610-740), plus filtering logic duplicated in Phase 2 Brave and Phase 2 LLM. - [ ] **Step 1: Read the full file and understand the structure** Read `backend/src/services/synthesis.rs` entirely. Identify: - Phase 1 batch loop (starts around "1b. Scrape, classify, summarize in batches") - Phase 2 Brave batch loop (inside `if settings.use_brave_search`) - Phase 2 LLM filtering (inside the `else` branch) - The category assignment logic (appears after each classify result) - The `trace_article` calls (will be refactored in Task 5, leave as-is for now) - [ ] **Step 2: Extract `assign_category` helper** This logic is duplicated identically in Phase 1 (~lines 497-530) and Phase 2 Brave (~lines 700-740). Extract it into a private helper: ```rust /// Assign an article to a category based on LLM classification response. /// /// Returns `(category_key, category_name)` — e.g. `("category_0", "AI News")`. /// Handles overflow to "Autre" when the target category is full. /// Returns `None` if both the target category and "Autre" are full (article should be skipped). fn assign_category( llm_response: &serde_json::Value, page_title: &str, user_categories: &[String], classification_categories: &[String], filled_counts: &HashMap, max_items_per_category: usize, ) -> Option<(String, String, String, String)> { // Returns (cat_key, cat_name, llm_title, llm_summary) let llm_title = llm_response.get("title").and_then(|t| t.as_str()).unwrap_or(page_title).to_string(); let llm_summary = llm_response.get("summary").and_then(|s| s.as_str()).unwrap_or("").to_string(); let mut llm_category = llm_response.get("category").and_then(|c| c.as_str()).unwrap_or("Autre").to_string(); if !classification_categories.iter().any(|c| c.to_lowercase() == llm_category.to_lowercase()) { llm_category = "Autre".to_string(); } let cat_key = if llm_category.to_lowercase() == "autre" { "category_autre".to_string() } else { user_categories.iter().position(|c| c.to_lowercase() == llm_category.to_lowercase()) .map(|i| format!("category_{}", i)) .unwrap_or_else(|| "category_autre".to_string()) }; let cat_filled = filled_counts.get(&llm_category).copied().unwrap_or(0); if cat_filled >= max_items_per_category && llm_category.to_lowercase() != "autre" { let autre_filled = filled_counts.get("Autre").copied().unwrap_or(0); if autre_filled >= max_items_per_category { return None; // Skip article } Some(("category_autre".to_string(), "Autre".to_string(), llm_title, llm_summary)) } else { Some((cat_key, llm_category, llm_title, llm_summary)) } } ``` Replace both Phase 1 and Phase 2 Brave category assignment blocks with calls to this helper. - [ ] **Step 3: Extract `filter_phase2_url` helper** The filtering logic (homepage, cross-phase dedup, history dedup, source diversity) is duplicated between Phase 2 Brave (~lines 564-595) and Phase 2 LLM (~lines 580-616). Extract: ```rust /// Check if a Phase 2 URL passes all filters. /// Returns the filter reason if rejected, None if accepted. async fn filter_phase2_url( pool: &sqlx::PgPool, user_id: Uuid, url: &str, seen_urls: &std::collections::HashSet, source_counts: &HashMap, article_history_days: i32, max_articles_per_source: usize, ) -> Option<&'static str> { // Homepage filter if let Ok(parsed_url) = url::Url::parse(url) { let path = parsed_url.path(); if path.is_empty() || path == "/" { return Some("filtered_homepage"); } } // Cross-phase dedup if seen_urls.contains(&url.to_lowercase()) { return Some("filtered_cross_phase_dedup"); } // History dedup if article_history_days > 0 { let hash = hash_article_url(url); let exists = db::article_history::check_urls_exist(pool, user_id, std::slice::from_ref(&hash)).await.unwrap_or_default(); if exists.contains(&hash) { return Some("filtered_history"); } } // Source diversity if let Some(domain) = extract_domain(url) { let count = source_counts.get(&domain).copied().unwrap_or(0); if count >= max_articles_per_source { return Some("filtered_diversity"); } } None // Accepted } ``` Replace both Phase 2 Brave and Phase 2 LLM inline filtering with calls to this helper. Each call site still handles `trace_article` with its own `source_type`. - [ ] **Step 4: Build and test** Run: `cd backend && cargo build && cargo test --lib` Expected: All 369 tests pass, no behavioral change - [ ] **Step 5: Commit** ```bash git add backend/src/services/synthesis.rs git commit -m "refactor: extract assign_category and filter_phase2_url helpers from synthesis pipeline" ``` --- ### Task 2: Eliminate `SettingsResponse` struct **Files:** - Modify: `backend/src/models/settings.rs` - Modify: `backend/src/handlers/settings.rs` - [ ] **Step 1: Add `#[serde(skip_serializing)]` to `UserSettings`** In `backend/src/models/settings.rs`, add `#[serde(skip_serializing)]` to the `user_id` and `updated_at` fields of `UserSettings`: ```rust #[derive(Debug, Clone, Serialize)] pub struct UserSettings { #[serde(skip_serializing)] pub user_id: Uuid, pub theme: String, // ... all other fields unchanged ... #[serde(skip_serializing)] pub updated_at: DateTime, } ``` - [ ] **Step 2: Delete `SettingsResponse` and its `From` impl** Delete the entire `SettingsResponse` struct (lines ~29-46) and the `impl From for SettingsResponse` block (lines ~48-66). - [ ] **Step 3: Update handlers** In `backend/src/handlers/settings.rs`: - Remove `use crate::models::settings::SettingsResponse` (or the path it's imported from — check the actual import) - In `get_settings`: change `Ok(Json(SettingsResponse::from(settings)))` to `Ok(Json(settings))` - In `update_settings`: change `Ok(Json(SettingsResponse::from(settings)))` to `Ok(Json(settings))` - [ ] **Step 4: Check for other usages** Run: `cd backend && grep -r "SettingsResponse" src/` — should return no results. - [ ] **Step 5: Build and test** Run: `cd backend && cargo build && cargo test --lib` Expected: All pass - [ ] **Step 6: Commit** ```bash git add backend/src/models/settings.rs backend/src/handlers/settings.rs git commit -m "refactor: eliminate SettingsResponse struct, serialize UserSettings directly" ``` --- ### Task 3: Decompose `Settings.tsx` **Files:** - Create: `frontend/src/components/settings/SettingsBraveSearch.tsx` - Create: `frontend/src/components/settings/SettingsRateLimit.tsx` - Create: `frontend/src/components/settings/SettingsAdvanced.tsx` - Modify: `frontend/src/pages/Settings.tsx` - [ ] **Step 1: Read Settings.tsx and identify section boundaries** Read `frontend/src/pages/Settings.tsx` entirely. Identify: - Brave Search section (~lines 572-670) — key management + toggle - Rate Limit section (~lines 908-980) — two number inputs + effective rate display + reset - Advanced extraction section (~lines 546-570) — checkbox + history days + batch size + search behavior - [ ] **Step 2: Create `SettingsBraveSearch.tsx`** Create `frontend/src/components/settings/SettingsBraveSearch.tsx`. Extract the Brave Search section into a component that receives: ```tsx interface SettingsBraveSearchProps { settings: () => UserSettings; setSettings: SetStoreFunction; // or whatever the setter type is onKeyChanged?: () => void; // to refetch api keys in parent } ``` Move the Brave-specific signals (`braveKeyInput`, `braveSaving`, `braveTesting`), the `braveKey()` derived accessor, and the handler functions (`handleBraveKeySave`, `handleBraveKeyTest`, `handleBraveKeyDelete`) into this component. The component loads its own API keys via `apiKeysApi.list()`. - [ ] **Step 3: Create `SettingsRateLimit.tsx`** Create `frontend/src/components/settings/SettingsRateLimit.tsx`. Extract the rate limit section. Props: ```tsx interface SettingsRateLimitProps { settings: () => UserSettings; setSettings: SetStoreFunction; } ``` - [ ] **Step 4: Create `SettingsAdvanced.tsx`** Create `frontend/src/components/settings/SettingsAdvanced.tsx`. Extract the advanced extraction section (checkbox, history days, batch size, search behavior textarea). Props same pattern. - [ ] **Step 5: Update Settings.tsx to use sub-components** Replace the inline sections with component imports: ```tsx import SettingsBraveSearch from '~/components/settings/SettingsBraveSearch'; import SettingsRateLimit from '~/components/settings/SettingsRateLimit'; import SettingsAdvanced from '~/components/settings/SettingsAdvanced'; ``` The parent keeps: general settings (theme, categories, max_age/items/articles), provider/model selection, API key manager, and the save button. - [ ] **Step 6: TypeScript check** Run: `cd frontend && npx tsc --noEmit` Expected: No errors - [ ] **Step 7: Commit** ```bash git add frontend/src/components/settings/ frontend/src/pages/Settings.tsx git commit -m "refactor: decompose Settings.tsx into sub-components" ``` --- ### Task 4: Extract shared LLM error mapping **Files:** - Modify: `backend/src/services/llm/mod.rs` - Modify: `backend/src/services/llm/openai.rs` - Modify: `backend/src/services/llm/gemini.rs` - Modify: `backend/src/services/llm/anthropic.rs` - [ ] **Step 1: Read all three error mapping functions** Read the `map_*_error` functions in all three provider files. Note the differences: - OpenAI: extracts `error.message` + `error.type`, handles 400/401/403/404/429 - Gemini: extracts `error.message` + `error.status`, merges 401+403, handles 400/401|403/404/429 - Anthropic: extracts `error.message` + `error.type`, handles 400/401/403/404/429/529 - [ ] **Step 2: Add shared mapper in `mod.rs`** In `backend/src/services/llm/mod.rs`, add: ```rust /// Shared HTTP error mapping for LLM provider responses. /// /// Maps common HTTP status codes to `AppError` variants. /// Provider-specific logging should happen before calling this. pub fn map_provider_http_error(status: u16, provider_name: &str) -> AppError { match status { 400 => AppError::BadRequest("Invalid request to LLM provider".into()), 401 => AppError::BadRequest("Invalid or unauthorized API key".into()), 403 => AppError::BadRequest("Access denied by LLM provider".into()), 404 => AppError::BadRequest("Model not found or not available".into()), 429 | 529 => AppError::RateLimited( "LLM provider rate limit exceeded. Please try again later.".into(), ), _ => AppError::Internal(anyhow::anyhow!( "{} returned HTTP {}", provider_name, status )), } } ``` Note: 529 (Anthropic overloaded) is included in the shared mapper as it's semantically equivalent to 429 for any provider. - [ ] **Step 3: Replace each provider's error mapper** In each provider file, replace the `map_*_error` function with a thinner version that logs provider-specific details, then delegates to the shared mapper: **OpenAI:** ```rust fn map_openai_error(status: u16, body: &Value) -> AppError { let error_message = body.get("error").and_then(|e| e.get("message")).and_then(|m| m.as_str()).unwrap_or("Unknown error"); let error_type = body.get("error").and_then(|e| e.get("type")).and_then(|t| t.as_str()).unwrap_or(""); tracing::error!("OpenAI API error (HTTP {}): {} (type: {})", status, error_message, error_type); super::map_provider_http_error(status, "OpenAI") } ``` **Gemini:** ```rust fn map_gemini_error(status: u16, body: &Value) -> AppError { let error_message = body.get("error").and_then(|e| e.get("message")).and_then(|m| m.as_str()).unwrap_or("Unknown error"); let error_status = body.get("error").and_then(|e| e.get("status")).and_then(|s| s.as_str()).unwrap_or(""); tracing::error!("Gemini API error (HTTP {}): {} (status: {})", status, error_message, error_status); super::map_provider_http_error(status, "Gemini") } ``` **Anthropic:** ```rust fn map_anthropic_error(status: u16, body: &Value) -> AppError { let error_message = body.get("error").and_then(|e| e.get("message")).and_then(|m| m.as_str()).unwrap_or("Unknown error"); let error_type = body.get("error").and_then(|e| e.get("type")).and_then(|t| t.as_str()).unwrap_or(""); tracing::error!("Anthropic API error (HTTP {}): {} (type: {})", status, error_message, error_type); super::map_provider_http_error(status, "Anthropic") } ``` - [ ] **Step 4: Build and test** Run: `cd backend && cargo build && cargo test --lib` Expected: All pass - [ ] **Step 5: Commit** ```bash git add backend/src/services/llm/mod.rs backend/src/services/llm/openai.rs backend/src/services/llm/gemini.rs backend/src/services/llm/anthropic.rs git commit -m "refactor: extract shared LLM error mapping to reduce duplication" ``` --- ### Task 5: Replace `trace_article` parameters with `ArticleTrace` struct **Files:** - Modify: `backend/src/services/synthesis.rs` This task should run AFTER Task 1, since both modify `synthesis.rs`. - [ ] **Step 1: Define `ArticleTrace` struct** Add near the top of `synthesis.rs` (in the helper functions section): ```rust /// Structured parameters for article history tracing. struct ArticleTrace<'a> { url: &'a str, title: &'a str, source_type: &'a str, source_url: Option<&'a str>, category: Option<&'a str>, synthesis_id: Option, status: &'a str, scraped_ok: bool, } ``` - [ ] **Step 2: Update `trace_article` signature** Change from 11 parameters to 4: ```rust async fn trace_article( pool: &sqlx::PgPool, user_id: Uuid, job_id: Uuid, trace: &ArticleTrace<'_>, ) { let entry = db::article_history::ArticleHistoryEntry { user_id, url: trace.url.to_string(), url_hash: hash_article_url(trace.url), title: trace.title.to_string(), source_type: trace.source_type.to_string(), source_url: trace.source_url.map(|s| s.to_string()), category: trace.category.map(|s| s.to_string()), synthesis_id: trace.synthesis_id, status: trace.status.to_string(), scraped_ok: trace.scraped_ok, job_id, }; db::article_history::insert_entry(pool, &entry).await.ok(); } ``` - [ ] **Step 3: Update all call sites** Find every `trace_article(` call in the file. Each one changes from positional args to struct literal. Example: ```rust // Before: trace_article(&state.pool, user_id, job_id, &url, "", "personalized_source", Some(&source_url), None, None, "filtered_diversity", false).await; // After: trace_article(&state.pool, user_id, job_id, &ArticleTrace { url: &url, title: "", source_type: "personalized_source", source_url: Some(&source_url), category: None, synthesis_id: None, status: "filtered_diversity", scraped_ok: false, }).await; ``` There are approximately 15-20 call sites. Update all of them. Use `grep -n "trace_article(" backend/src/services/synthesis.rs` to find them all. - [ ] **Step 4: Build and test** Run: `cd backend && cargo build && cargo test --lib` Expected: All pass - [ ] **Step 5: Commit** ```bash git add backend/src/services/synthesis.rs git commit -m "refactor: replace trace_article 11 parameters with ArticleTrace struct" ```