docs: add comprehensive code audit reports from 5-agent team

Architect, Frontend Expert, Rust Expert, Tech Lead, and Devil's Advocate analyzed the codebase independently. Key findings: UTF-8 panic bug, XSS risk, synthesis.rs duplication (3x batch loop), Settings.tsx complexity. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago · 811c5c87d1
parent e74a1850bf
commit 811c5c87d1
6 changed files with 1673 additions and 0 deletions
--- a/docs/audit/00-consolidated-summary.md
+++ b/docs/audit/00-consolidated-summary.md
@ -0,0 +1,127 @@
+# Code Audit — Consolidated Summary
+
+**Date**: 2026-03-26
+**Auditors**: Software Architect, Frontend Expert, Rust Expert, Tech Lead, Devil's Advocate
+
+## Overall Assessment
+
+The codebase is well-structured with strong fundamentals: clean layer separation, zero `unsafe` code, zero `unwrap()` in production paths, comprehensive security measures (AES-256-GCM encryption, SSRF prevention, anti-enumeration timing), and good test coverage (363+ unit tests, 9 integration test files). The code is a learning project that has evolved significantly through rapid iteration.
+
+The main issues cluster around **complexity in the synthesis pipeline** and **accumulated duplication** from feature layering.
+
+---
+
+## P0 — Fix Immediately
+
+These are bugs that can crash or corrupt the application.
+
+| # | Issue | Source | Files |
+|---|-------|--------|-------|
+| 1 | **UTF-8 panic**: `&msg[..200]` panics on multi-byte chars | Rust Expert, Devil's Advocate | `synthesis.rs:1313` |
+| 2 | **Race condition in job creation**: double-click creates duplicate jobs (DashMap check + insert not atomic) | Devil's Advocate | `synthesis.rs` (job management) |
+| 3 | **XSS via `innerHTML`**: renders user-controlled `theme` unsafely | Frontend Expert | `GenerateSynthesis.tsx` |
+
+---
+
+## P1 — Fix Within 30 Days
+
+| # | Issue | Source | Files |
+|---|-------|--------|-------|
+| 4 | **Expired sessions never cleaned up**: `sessions::delete_expired()` exists but is never called | Devil's Advocate | startup / scheduled task |
+| 5 | **No pipeline timeout**: hung LLM call blocks generation slot indefinitely | Devil's Advocate | `synthesis.rs` |
+| 6 | **SSRF bypass via IPv4-mapped IPv6**: `::ffff:127.0.0.1` passes all current checks | Rust Expert, Devil's Advocate | `scraper.rs` |
+| 7 | **Missing SSRF check on initial source page fetch** | Rust Expert | `source_scraper.rs` |
+| 8 | **Spawned generation tasks swallow panics**: SSE clients hang forever | Rust Expert | `generation handler` |
+| 9 | **Missing `onCleanup` for `setTimeout`**: timers fire after component unmount | Frontend Expert | `Home.tsx`, `ArticleHistory.tsx`, `SynthesisDetail.tsx` |
+
+---
+
+## Structural Refactoring — High Priority
+
+These are the main maintainability issues, confirmed by 3+ auditors.
+
+### 1. Decompose `synthesis.rs` (1,686 lines)
+
+**Confirmed by**: Architect, Tech Lead, Rust Expert, Devil's Advocate
+
+The `run_generation_inner` function is ~650 lines with the scrape+classify batch loop duplicated 3 times (Phase 1, Phase 2 Brave, Phase 2 LLM). Article filtering (homepage, dedup, history, diversity) is also duplicated across paths.
+
+**Refactoring plan**:
+- Extract `process_article_batch()` — shared scrape+classify loop (~150 lines saved)
+- Extract `filter_candidate_url()` — shared filtering logic
+- Extract `assign_category()` — shared category assignment + overflow logic
+- Consider splitting into `synthesis/pipeline.rs`, `synthesis/phases.rs`, `synthesis/helpers.rs`
+
+**Effort**: Large | **Impact**: Critical
+
+### 2. Eliminate `SettingsResponse` struct
+
+**Confirmed by**: Tech Lead
+
+`SettingsResponse` is identical to `UserSettings` minus `user_id` and `updated_at`. Every new setting requires changes to 4 Rust structs + SQL queries + frontend types. Serialize `UserSettings` directly with `#[serde(skip)]` on internal fields.
+
+**Effort**: Small | **Impact**: High (reduces per-setting change cost)
+
+### 3. Decompose `Settings.tsx` (1,025 lines)
+
+**Confirmed by**: Frontend Expert, Tech Lead
+
+Split into sub-components: `SettingsGeneral`, `SettingsProvider`, `SettingsAdvanced`, `SettingsBraveSearch`, `SettingsRateLimit`.
+
+**Effort**: Medium | **Impact**: High
+
+### 4. Extract shared LLM error mapping
+
+**Confirmed by**: Tech Lead
+
+`map_openai_error`, `map_gemini_error`, `map_anthropic_error` are near-identical. Extract to a shared `map_http_error` function.
+
+**Effort**: Small | **Impact**: Medium
+
+### 5. Replace `trace_article` 11-parameter function with struct
+
+**Confirmed by**: Tech Lead, Rust Expert
+
+```rust
+struct ArticleTrace { url, title, source_type, source_url, category, synthesis_id, status, scraped_ok }
+```
+
+**Effort**: Small | **Impact**: Medium
+
+---
+
+## Medium Priority Improvements
+
+| # | Issue | Source |
+|---|-------|--------|
+| 10 | N+1 individual INSERTs for article history — batch with `unnest()` | Rust Expert |
+| 11 | 59 `.clone()` calls in pipeline — use `Arc<str>` for shared immutables | Rust Expert |
+| 12 | CSS selectors re-parsed every call — use `LazyLock` | Rust Expert |
+| 13 | Inconsistent data fetching in frontend (`createResource` vs `onMount` + signals) | Frontend Expert |
+| 14 | Reusable `Button` component exists but unused — inline Tailwind duplicated | Frontend Expert |
+| 15 | Default settings diverge between frontend and backend | Tech Lead |
+| 16 | `AppConfig` carries secrets in cloneable `String` — use `Arc`-wrapped | Rust Expert, Devil's Advocate |
+| 17 | In-memory rate limits reset on restart | Devil's Advocate |
+| 18 | `SESSION_SECRET` env var validated but never used | Devil's Advocate |
+
+---
+
+## Positive Highlights
+
+All auditors noted these strengths:
+- **Zero `unsafe` code**, zero `unwrap()` in production
+- **Strong security posture**: AES-256-GCM with zeroized keys, SSRF prevention (14 tests), anti-enumeration timing, parameterized SQL
+- **Clean error handling**: `AppError` enum with proper HTTP mapping, internal errors never leaked
+- **Well-designed `LlmProvider` trait**: clean abstraction over 3 providers
+- **Comprehensive test suite**: good coverage on scraper, LLM providers, rate limiter, encryption
+- **Excellent module documentation**: every module has `//!` doc comments
+
+---
+
+## Detailed Reports
+
+- [Architect Report](architect-report.md) — SOLID principles, design patterns, architecture
+- [Frontend Report](frontend-report.md) — SolidJS patterns, component architecture
+- [Rust Expert Report](rust-expert-report.md) — Idiomatic Rust, async, safety
+- [Tech Lead Report](tech-lead-report.md) — Complexity, duplication, refactoring plan
+- [Devil's Advocate Report](devils-advocate-report.md) — Hidden risks, security, failure modes
--- a/docs/audit/architect-report.md
+++ b/docs/audit/architect-report.md
@ -0,0 +1,237 @@
+# Architecture & SOLID Audit Report
+
+## Executive Summary
+
+AI Weekly Synth demonstrates strong foundational architecture with clean layer separation (handlers, services, DB), a well-designed trait abstraction for LLM providers, and robust error handling. However, the synthesis pipeline (`services/synthesis.rs`) has accumulated significant complexity that violates Single Responsibility and makes the code difficult to test, extend, and maintain. The codebase would benefit most from decomposing the pipeline into smaller, composable stages and introducing trait-based dependency injection for external services.
+
+## SOLID Analysis
+
+### Single Responsibility
+
+**Mostly followed in the outer layers; severely violated in the synthesis pipeline.**
+
+**Good examples:**
+- `errors.rs` has a single job: map application errors to HTTP responses. Clean, focused, well-tested.
+- `middleware/auth.rs` does exactly one thing: extract authenticated users from session cookies. The `AdminUser` wrapper cleanly composes on top.
+- `services/encryption.rs` is a focused module for AES-256-GCM operations with no extraneous concerns.
+- `services/auth.rs` cleanly coordinates token creation, email sending, and session management without overstepping.
+- Each file in `db/` handles queries for exactly one table (users, sessions, syntheses, etc.).
+
+**Critical violation: `services/synthesis.rs` (~1250 lines)**
+
+This single file is responsible for:
+1. Job lifecycle management (create, subscribe, cleanup via `JobStore`)
+2. Progress event definition and emission
+3. Provider and API key resolution
+4. Rate limiting coordination
+5. Source rotation strategy
+6. Parallel scraping orchestration (Phase 1)
+7. LLM classification and summarization orchestration (Phase 1)
+8. Brave Search integration orchestration (Phase 2)
+9. LLM web search fallback orchestration (Phase 2)
+10. URL normalization, hashing, and deduplication
+11. Article history filtering
+12. Source diversity enforcement
+13. Section assembly and persistence
+14. Article provenance tracing
+15. Error sanitization
+
+The `run_generation_inner` function alone spans ~640 lines. The nearly identical scrape-then-classify batch processing loop is duplicated between Phase 1 (lines 406-551) and the Brave Search path (lines 627-758), with only minor differences in tuple shapes. This duplication is a strong signal that a shared abstraction is missing.
+
+**Other SRP observations:**
+- `handlers/auth.rs` mixes HTTP concern (cookie construction in `build_session_cookie`) with business logic (token verification in `verify_token`). The cookie construction could live in a utility or the auth service.
+- `AppState` (in `app_state.rs`) holds both infrastructure (pool, config) and domain state (job_store, rate_limiters). This is acceptable at current scale but should be watched as the application grows.
+
+### Open/Closed
+
+**Good for LLM providers; poor for the synthesis pipeline.**
+
+**Good examples:**
+- The `LlmProvider` trait plus `factory::create_provider` follow OCP well. Adding a new provider (e.g., Mistral) requires implementing the trait and adding one match arm in the factory -- no existing code needs modification.
+- `AppError` uses enum variants with `thiserror` derives. Adding a new error category is additive.
+- The `FromRequestParts` extractors (`AuthUser`, `AdminUser`) compose cleanly -- adding new authorization levels means adding new extractors.
+
+**Violations:**
+- The synthesis pipeline has hardcoded knowledge of all search strategies (LLM search, Brave Search, personalized sources). Adding a new search backend (e.g., Serper, SearXNG) requires modifying the 640-line orchestration function. There is no `SearchBackend` trait or strategy pattern.
+- `factory::create_provider` uses a match statement that must be modified for each new provider. A registration-based factory (e.g., `HashMap<&str, fn(String, Client) -> Arc<dyn LlmProvider>>`) would eliminate this.
+- The `ProviderRateLimiter` embeds SQL queries directly (line 173 of `rate_limiter.rs`) rather than going through the `db` module, bypassing the architectural boundary. The comment says "to avoid circular dependency" -- this signals a structural issue.
+
+### Liskov Substitution
+
+**Correctly applied where traits are used.**
+
+- All three `LlmProvider` implementations (`GeminiProvider`, `OpenAiProvider`, `AnthropicProvider`) correctly implement the trait contract. Callers use `Arc<dyn LlmProvider>` throughout and never downcast.
+- The `AnthropicProvider` implementation is noteworthy: it adapts to Anthropic's lack of native JSON schema enforcement by embedding the schema in the system prompt. This is a legitimate behavioral difference that is hidden behind the trait -- exactly what LSP intends. The contract is "return structured JSON matching the schema," not "use native schema enforcement."
+- `FromRequestParts` implementations for `AuthUser` and `AdminUser` correctly maintain the contract expectations.
+
+**No LSP violations detected.** The trait abstractions are lean and appropriately designed.
+
+### Interface Segregation
+
+**Traits are lean and focused -- no violations.**
+
+- `LlmProvider` has only two methods: `provider_id()` and `call_llm()`. This is appropriately minimal. If web search support needed to be optional (e.g., only some providers support grounded search), a separate `WebSearchProvider` trait would be warranted, but currently the pipeline handles this externally.
+- The codebase uses free functions in modules (e.g., `db::users::find_by_email`) rather than trait objects for the database layer. This is idiomatic Rust and avoids over-engineering trait hierarchies.
+
+**Observation:** The absence of traits for services like scraping, email, and encryption is acceptable given the project's scale, but limits testability (see Dependency Inversion section).
+
+### Dependency Inversion
+
+**Partially applied -- the weakest SOLID principle in this codebase.**
+
+**Good:**
+- Handlers depend on `AppState` (an abstraction container), not directly on database implementations.
+- The synthesis pipeline depends on `Arc<dyn LlmProvider>`, not on concrete provider types.
+
+**Significant gaps:**
+- **No abstraction over the database layer.** All services and the synthesis pipeline call `db::*` functions directly (e.g., `db::settings::get_or_create_default`, `db::article_history::check_urls_exist`). This means the synthesis pipeline cannot be unit-tested without a live Postgres instance.
+- **No abstraction over the scraper.** The synthesis pipeline calls `scraper::scrape_url` and `source_scraper::extract_article_links` as concrete functions. This prevents testing pipeline logic with deterministic, fake scraped content.
+- **No abstraction over the email service.** `services/email.rs` is called directly. Testing email-sending behavior requires either a live Resend API key or the hardcoded test bypass.
+- **`AppState` is a concrete struct, not a trait.** All handlers are bound to `State<AppState>`. While this is standard Axum practice, it means handler logic cannot be tested without constructing a full `AppState` with a real database pool.
+
+The integration tests in `tests/common/mod.rs` confirm this: `TestApp` must create a real Postgres database for every test, run migrations, and tear down afterward. The test helper works well, but the inability to unit-test service logic in isolation is a direct consequence of missing abstractions.
+
+## Design Patterns
+
+### Well-Used Patterns
+
+1. **Strategy Pattern (LLM Providers):** The `LlmProvider` trait + `factory::create_provider` is a textbook Strategy implementation. Providers encapsulate provider-specific API differences (request format, response parsing, error mapping) behind a uniform interface. The factory cleanly maps configuration to concrete strategy.
+
+2. **Observer Pattern (Progress Streaming):** The `watch` channel pattern for SSE progress streaming is elegant. The `JobStore` manages pub/sub lifecycle, and the `WatchStream` adapter in the handler converts it to an SSE stream. This design correctly handles reconnection (subscribers immediately get the latest state).
+
+3. **Extractor Pattern (Authentication):** Axum's `FromRequestParts` is used idiomatically for `AuthUser` and `AdminUser`. This is effectively a Decorator pattern -- handlers declare their auth requirements via type parameters, and the framework handles the cross-cutting concern transparently.
+
+4. **Builder Pattern (Configuration):** `AppConfig::from_env()` + `validate()` is a clean two-phase construction pattern. Validation is separated from loading, allowing the application to fail fast with clear error messages.
+
+5. **Value Object Pattern:** `MasterKey` is a well-designed value object. It enforces invariants on construction (`from_hex`), prevents accidental cloning (no `Clone` derive), and zeroizes memory on drop. This is security-conscious Rust.
+
+6. **Repository Pattern (DB layer):** Each `db::*` module acts as a repository, encapsulating SQL queries behind typed function signatures. The separation between the `UserRow` (DB representation) and `User` (domain model) in `db/users.rs` is a good Data Mapper implementation.
+
+### Missing Patterns
+
+1. **Pipeline/Chain of Responsibility:** The synthesis pipeline is a sequential series of stages (load settings -> resolve provider -> scrape sources -> classify articles -> web search fallback -> assemble -> save). This is a textbook pipeline pattern that would benefit from explicit stage modeling:
+   ```
+   trait PipelineStage {
+       async fn execute(&self, ctx: &mut PipelineContext) -> Result<(), AppError>;
+   }
+   ```
+   Each stage would be independently testable and composable.
+
+2. **Unit of Work / Transaction:** The synthesis pipeline performs many database writes (article history inserts, LLM call logs, final synthesis save) without a unifying transaction. If the synthesis save fails after article history has been written, the data is inconsistent. A transactional boundary around the save phase would prevent orphaned records.
+
+3. **Circuit Breaker:** The rate limiter waits up to 60 seconds for rate limit windows to pass, but there is no circuit breaker for provider failures. If a provider returns 5xx errors repeatedly, the pipeline will keep retrying individual articles until the batch is exhausted. A circuit breaker that short-circuits after N consecutive failures would save time and API quota.
+
+4. **Result Collector / Accumulator:** The batch processing loops manually manage `scraped_articles`, `filled_counts`, `source_counts`, and `seen_urls` as mutable state threaded through the loop. An accumulator pattern would encapsulate this state and its invariants (e.g., "never exceed max_items_per_category") into a dedicated struct with methods like `try_add_article()`.
+
+### Anti-Patterns
+
+1. **God Function:** `run_generation_inner` (~640 lines) is the most impactful anti-pattern. It manages too many concerns, has deeply nested control flow, and is untestable in isolation. The function's length makes it difficult to reason about edge cases and encourages bugs during modification.
+
+2. **Duplicated Logic (DRY Violation):** The scrape-then-classify batch loop appears in near-identical form in three places:
+   - Phase 1 personalized sources (lines ~406-551)
+   - Phase 2 Brave Search path (lines ~627-758)
+   - The response parsing differs slightly, but the core pattern (spawn scrape tasks -> collect results -> spawn classify tasks -> collect and categorize) is the same.
+
+   A shared `process_article_batch()` function accepting a list of `(url, source_identifier)` would eliminate ~150 lines of duplication.
+
+3. **Primitive Obsession:** Categories are tracked throughout the pipeline using raw strings and index-based keys (`"category_0"`, `"category_autre"`). A `CategoryIndex` newtype or enum would prevent bugs like off-by-one errors and make the intent clearer.
+
+4. **Implicit Protocol (Stringly Typed):** The `status` field in article history uses raw strings (`"filtered_history"`, `"filtered_diversity"`, `"used"`, `"filtered_homepage"`, etc.). An enum would provide compile-time guarantees and prevent typos.
+
+5. **Configuration Object Overload:** `UserSettings` has 16 fields that are all passed around as a single struct. The synthesis pipeline destructures individual fields from it in many places (e.g., `settings.max_items_per_category`, `settings.batch_size`, `settings.use_brave_search`). This makes it hard to understand which settings each stage actually depends on.
+
+## Architecture Assessment
+
+### Layer Separation
+
+**Clean three-layer architecture with appropriate boundaries.**
+
+```
+handlers/ (HTTP) -> services/ (Business Logic) -> db/ (Data Access)
+```
+
+- **Handlers** correctly limit themselves to request parsing, validation, calling services/DB, and response formatting. No handler contains business logic.
+- **Services** implement the business rules. The auth service coordinates between DB, email, and token utilities. The encryption service is self-contained.
+- **DB modules** are pure data access with no business logic. SQL queries are parameterized and safe.
+
+**Minor boundary violations:**
+- `ProviderRateLimiter::reload_from_db` in `services/rate_limiter.rs` contains raw SQL queries instead of calling through `db::rate_limits`. The comment acknowledges this is to "avoid circular dependency," suggesting the module boundaries need adjustment.
+- `resolve_model` in `services/synthesis.rs` also contains raw SQL (`sqlx::query_scalar` with inline SQL) instead of going through `db::providers`. This bypasses the DB layer abstraction.
+
+### Coupling
+
+**Low coupling between modules; high coupling within the synthesis pipeline.**
+
+- Handler modules are independent of each other. Each handler file imports only from `app_state`, `errors`, `middleware`, relevant `models`, and relevant `services`/`db` modules.
+- The LLM provider implementations are fully decoupled -- they share no state and have no knowledge of each other.
+- The `db` modules have zero inter-module dependencies (no `db::users` calling `db::sessions`, etc.).
+
+**High coupling in the synthesis pipeline:**
+- `services/synthesis.rs` imports from 9 different modules: `app_state`, `db` (5 submodules), `errors`, `models` (2 submodules), `services` (5 submodules). It is the most connected module in the codebase and acts as a "gravity well" that pulls in all dependencies.
+- The `AppState` struct couples all concerns: config, pool, HTTP client, auth rate limiter, provider rate limiter, user rate limiters, and job store. Any change to any of these affects the shared state type.
+
+### Error Handling
+
+**Excellent error handling strategy -- one of the strongest aspects of the codebase.**
+
+- **Single error type:** `AppError` is the only error type in the application. All handlers return `Result<impl IntoResponse, AppError>`. This is clean and consistent.
+- **Automatic conversions:** `From<sqlx::Error>` and `From<anyhow::Error>` provide ergonomic `?` operator usage while ensuring errors are properly categorized.
+- **Security-conscious:** Internal errors log the full details server-side but return only "An internal error occurred" to the client. The `sanitize_error_message` function in the synthesis pipeline ensures API keys are never exposed in SSE error events.
+- **No `unwrap()` in production paths:** Consistent with the project's stated policy. The few `unwrap()` calls are in test code or behind `expect()` with clear messages (e.g., `"valid minutes value"`).
+- **Error variants are semantically meaningful:** `NotFound`, `Unauthorized`, `Forbidden`, `BadRequest`, `Validation`, `Internal`, `RateLimited` -- each maps to a specific HTTP status code with appropriate behavior.
+
+**Minor observation:** The `Validation` variant accepts a `String`, but a structured error type (e.g., with field name + message) would be more useful for frontends that want to highlight specific form fields.
+
+### State Management
+
+**In-memory state is well-designed with appropriate concurrency primitives.**
+
+- `DashMap` is used for lock-free concurrent access in `RateLimiter`, `ProviderRateLimiter`, `JobStore`, and `user_rate_limiters`. This avoids `Mutex` contention under load.
+- `Arc<watch::Sender>` for progress events is the right choice -- it provides "latest value" semantics so reconnecting SSE clients immediately get caught up.
+- The `JobStore` has TTL-based cleanup (1 hour) and post-completion cleanup (5 minutes), preventing memory leaks from abandoned jobs.
+
+**Potential concern:** The `RateLimiter` in `rate_limiter.rs` stores timestamps in a `Vec<Instant>` (for the per-key limiter) rather than a `VecDeque`. Old timestamps are evicted via `retain()` which is O(n), whereas `VecDeque::pop_front()` is O(1). The `ProviderRateLimiter` correctly uses `VecDeque`. This inconsistency is minor but worth unifying.
+
+**State growth risk:** The `user_rate_limiters: DashMap<Uuid, UserRateLimitEntry>` in `AppState` grows unboundedly as users trigger generation jobs. Entries are only removed when a user's settings change to have no rate limits. A periodic cleanup (e.g., remove entries for users who haven't generated in 24 hours) would prevent slow memory growth.
+
+## Prioritized Recommendations
+
+1. **Decompose `services/synthesis.rs` into a multi-stage pipeline.** Extract the monolithic `run_generation_inner` into discrete, testable stages: `SettingsStage`, `SourceScrapingStage`, `ClassificationStage`, `WebSearchStage`, `AssemblyStage`. Each stage should operate on a shared `PipelineContext` struct. This addresses the God Function anti-pattern, the DRY violation in batch processing, and enables unit testing of individual stages.
+   - Files: `backend/src/services/synthesis.rs` (1250+ lines)
+   - Impact: Testability, maintainability, extensibility
+
+2. **Extract a shared `process_article_batch()` function** as an immediate, lower-risk improvement. The near-identical scrape-then-classify loop in Phase 1 and the Brave Search path can be unified into a single function that accepts `Vec<(String, Option<String>)>` (URL + optional source URL) and returns classified `NewsItem`s. This can be done without the full pipeline refactor.
+   - Files: `backend/src/services/synthesis.rs` (lines ~406-551 and ~627-758)
+   - Impact: ~150 lines of duplication removed, fewer bugs from divergent copies
+
+3. **Introduce trait-based abstractions for external services** to enable unit testing. Define traits for `ArticleScraper`, `SearchService`, and `ArticleStore` (history). The synthesis pipeline would depend on these traits, allowing tests to inject fakes. Start with the DB layer trait since it blocks the most test coverage.
+   - Files: `backend/src/services/synthesis.rs`, `backend/src/services/scraper.rs`, `backend/src/db/article_history.rs`
+   - Impact: Enables unit testing of synthesis pipeline without Postgres
+
+4. **Replace stringly-typed status values with enums.** The article history `status` field uses raw strings (`"filtered_history"`, `"filtered_diversity"`, `"used"`, etc.). Define an `ArticleStatus` enum with `Display`/`FromStr` implementations. Similarly, replace the `"category_0"` / `"category_autre"` string keys with a `CategoryKey` newtype.
+   - Files: `backend/src/services/synthesis.rs`, `backend/src/db/article_history.rs`
+   - Impact: Type safety, compile-time correctness guarantees
+
+5. **Move raw SQL out of service modules.** `ProviderRateLimiter::reload_from_db` and `resolve_model` in `synthesis.rs` contain inline SQL queries that bypass the `db/` layer. Move these queries to `db/rate_limits.rs` and `db/providers.rs` respectively. If circular dependencies are the issue, restructure the module hierarchy (e.g., make the rate limiter accept configuration data rather than a pool).
+   - Files: `backend/src/services/rate_limiter.rs` (lines 171-199), `backend/src/services/synthesis.rs` (lines 1191-1220)
+   - Impact: Architectural consistency, single source of truth for SQL
+
+6. **Introduce an article accumulator struct** to encapsulate the mutable tracking state in the synthesis pipeline. Currently, `article_scraped`, `source_counts`, `url_source`, `filled_counts`, and `seen_urls` are managed as independent `HashMap`/`HashSet` variables threaded through loops. A `SynthesisAccumulator` struct with methods like `try_add_article(url, category, source) -> bool` would encapsulate invariants (max per category, max per source, deduplication) and simplify the pipeline code.
+   - Files: `backend/src/services/synthesis.rs`
+   - Impact: Reduced cognitive complexity, fewer state-management bugs
+
+7. **Add a circuit breaker for LLM provider failures.** If a provider returns errors for N consecutive calls, stop sending requests and fail fast. The current behavior retries each article independently, wasting time and API quota when the provider is down.
+   - Files: `backend/src/services/synthesis.rs`
+   - Impact: Resilience, user experience during outages
+
+8. **Unify `Vec<Instant>` to `VecDeque<Instant>` in `RateLimiter`.** The per-key `RateLimiter` uses `Vec::retain()` (O(n)) while `ProviderRateLimiter` uses `VecDeque::pop_front()` (O(1)). Switch the per-key limiter to `VecDeque` for consistency and performance under high request volumes.
+   - Files: `backend/src/services/rate_limiter.rs` (lines 37-77)
+   - Impact: Minor performance improvement, code consistency
+
+9. **Add periodic cleanup for `user_rate_limiters` in `AppState`.** The `DashMap<Uuid, UserRateLimitEntry>` grows unboundedly. Add a background task (similar to `JobStore` TTL cleanup) that removes entries for users inactive for more than 24 hours.
+   - Files: `backend/src/app_state.rs`, `backend/src/main.rs`
+   - Impact: Prevents slow memory growth in long-running deployments
+
+10. **Consider making `UpdateSettingsRequest::validate()` return `Result<(), AppError>` instead of `Result<(), String>`.** Currently, handlers must map the string error: `body.validate().map_err(AppError::Validation)`. Having validate return the correct error type directly would be more ergonomic and consistent with the rest of the codebase.
+    - Files: `backend/src/models/settings.rs`
+    - Impact: Minor ergonomic improvement, consistency
--- a/docs/audit/devils-advocate-report.md
+++ b/docs/audit/devils-advocate-report.md
@ -0,0 +1,309 @@
+# Devil's Advocate Audit Report
+
+**Date**: 2026-03-26
+**Scope**: Full codebase audit of AI Weekly Synth (Rust/Axum + SolidJS)
+**Approach**: Adversarial review -- challenging assumptions, finding hidden risks, questioning decisions
+
+---
+
+## Executive Summary
+
+AI Weekly Synth is a well-structured project with thoughtful security measures. However, it suffers from several uncomfortable truths:
+
+1. **The master encryption key is reconstructed from hex on every single API key operation** -- it is never cached. This means the raw key string sits in `AppConfig` (a `Clone`-able struct) for the entire process lifetime, defeating the purpose of `zeroize`.
+2. **There is no background cleanup for expired sessions or stale jobs.** The sessions table and article_history table will grow unbounded until someone notices.
+3. **The generation pipeline has no cancellation or timeout mechanism.** A hung LLM call can block a user's generation slot indefinitely since the per-user "one active job" check has no upper time bound beyond the 1-hour TTL.
+4. **Rate limiting is entirely in-memory and per-process.** A restart resets all rate limit state, and horizontal scaling (multiple instances) would have no shared rate limit enforcement.
+5. **The SSRF protection has a documented TOCTOU vulnerability** (DNS rebinding) that is acknowledged but not mitigated.
+
+---
+
+## Architectural Concerns
+
+### Why Rust for This?
+
+The core business logic -- calling LLM APIs, scraping HTML, shuffling JSON -- is I/O-bound, not CPU-bound. Rust's primary advantage (zero-cost abstractions, memory safety without GC) adds relatively little value here compared to Python/FastAPI or TypeScript/Express, while significantly increasing development velocity cost.
+
+**Counter-argument the team would give**: "It's a learning project." Fair. But the CLAUDE.md itself says "idiomatic Rust (learning project)" -- which means the architecture was chosen for developer education, not because it's the right tool. This is honest but means the codebase carries cognitive overhead that a maintainer would not expect.
+
+**Concrete risk**: Contribution and maintenance will be limited to Rust-proficient developers, which is a small pool. If the original author steps away, someone inheriting this codebase will face a steep ramp-up compared to an equivalent Python project.
+
+### The Two-Phase Pipeline Is Overengineered
+
+The synthesis pipeline (`synthesis.rs`, ~1300 lines) implements:
+- Phase 1: Source scraping with parallel link extraction, history dedup, batch classification
+- Phase 2: Web search fallback (LLM or Brave) with its own scraping + classification loop
+
+This is the same scrape-classify-summarize logic duplicated twice (lines 429-543 and lines 639-757 are nearly identical patterns). The Brave Search path and the LLM Search path also duplicate filtering logic (homepage filter, cross-phase dedup, history dedup, source diversity -- all repeated in three places).
+
+**Concrete risk**: A bug fix in one path (e.g., the dedup logic) must be manually replicated in all three paths. The recent commit history confirms this is already happening: `fix: filter empty scraped articles + restore URLs after rewrite` touched multiple places.
+
+### Single-Process In-Memory State
+
+`JobStore`, `RateLimiter`, `ProviderRateLimiter`, and `user_rate_limiters` are all in-memory `DashMap` structures. This means:
+- **No horizontal scaling**: Two instances would each allow one concurrent job per user.
+- **State loss on restart**: All active generation jobs are silently lost.
+- **No persistence of rate limit state**: A process restart resets all rate limit windows.
+
+For a "single-tenant self-hosted" app this is acceptable today, but the architecture does not admit a path to multi-instance deployment without significant rework.
+
+---
+
+## Hidden Technical Debt
+
+### 1. MasterKey Reconstruction on Every Operation
+
+Every time an API key is encrypted or decrypted, the code calls `MasterKey::from_hex(&state.config.master_encryption_key)`. This means:
+- A new `Vec<u8>` is allocated, filled with key material, used, then zeroized on drop.
+- But `state.config.master_encryption_key` (the hex string) lives in `AppConfig`, which derives `Clone`. Every clone of `AppState` carries a copy of the raw hex key string in memory. The `zeroize` on `MasterKey::drop` protects one copy while leaving the canonical source in plain memory.
+
+**File**: `backend/src/config.rs` (line 16), `backend/src/services/encryption.rs`
+
+**Risk scenario**: A memory dump or core dump of the process would reveal `MASTER_ENCRYPTION_KEY` in cleartext, because `AppConfig::master_encryption_key` is a `String` that is never zeroized. The `MasterKey` wrapper's `zeroize` is security theater in this context.
+
+### 2. No Session Cleanup Scheduler
+
+The `sessions::delete_expired` function exists (line 99 of `db/sessions.rs`) but is **never called** by any background task or scheduler. Expired sessions accumulate forever. The auth middleware checks expiration on each request (and deletes that one session), but sessions that simply expire without being used again are never cleaned.
+
+**File**: `backend/src/db/sessions.rs` (line 98-105)
+
+**Risk scenario**: After a year of operation, the sessions table could have tens of thousands of expired rows. Each auth check does an index lookup, so performance is not the concern -- the concern is data hygiene and compliance (old session metadata like IP addresses and user agents sitting in the database indefinitely).
+
+### 3. article_history and llm_call_log Unbounded Growth
+
+The `article_history` table stores a row for **every article encountered** during generation -- not just used articles, but also filtered, dropped, and deduplicated ones. With 5 categories, 15 links per source, and 10+ sources, a single generation run can produce 100+ rows. The `cleanup_old` function (called at the start of each generation) only deletes entries where `synthesis_id IS NULL`, meaning "used" entries are retained forever.
+
+The `llm_call_log` table stores full prompts and responses (system_prompt, user_prompt, response_body as TEXT). A single generation run with 20 LLM calls storing full prompts and responses could easily be 500KB+ per run.
+
+**Files**: `backend/migrations/20260324000015_add_article_history.sql`, `backend/migrations/20260324000017_create_llm_call_log.sql`
+
+**Risk scenario**: A user generating weekly syntheses for a year produces ~52 runs x 100 history rows = 5,200+ article_history rows and ~52 x 20 = 1,040 llm_call_log entries (with full prompt text). For the LLM logs, this could be 25MB+ per user per year. There is no size-based cleanup or archival strategy.
+
+### 4. LLM API Changes Will Break Silently
+
+The LLM provider implementations (`gemini.rs`, `openai.rs`, `anthropic.rs`) parse JSON responses from vendor APIs. When these APIs change their response format (which happens regularly), the failure mode is a parse error that surfaces as a generic "Generation failed" error to the user.
+
+There is no version pinning on the API endpoints (e.g., Gemini uses `v1beta`, OpenAI uses `v1`), and no structured error handling that distinguishes "API changed" from "API key invalid" from "model deprecated."
+
+**Risk scenario**: Google deprecates a Gemini model identifier. The user's settings still reference it. Generation fails with an opaque error. The user has no way to diagnose the problem from the error message.
+
+---
+
+## Security Assessment
+
+### Encryption: Sound Implementation, Flawed Key Lifecycle
+
+The AES-256-GCM implementation itself is textbook correct:
+- Random 12-byte nonces via OS randomness
+- Proper use of the `aes_gcm` crate
+- `zeroize` on `MasterKey` drop
+- No nonce reuse (each encryption generates a fresh nonce)
+
+**But the key lifecycle is the vulnerability**:
+- `MASTER_ENCRYPTION_KEY` is stored as a plain `String` in `AppConfig`
+- `AppConfig` derives `Clone`, so the key string is duplicated across every `AppState` clone
+- The key is passed through environment variables, which may be visible in `/proc/<pid>/environ` on Linux
+- There is no key rotation mechanism. If the key is compromised, every user's API keys must be re-encrypted with a new key -- and there is no tooling for this.
+
+**File**: `backend/src/config.rs` (line 16), `backend/src/services/encryption.rs`
+
+### SSRF: Good but Incomplete
+
+The SSRF protection is well-designed:
+- DNS resolution check before connection
+- Private IP detection for both IPv4 and IPv6
+- Redirect policy with per-hop IP validation
+- Scheme validation (http/https only)
+
+**Gaps**:
+1. **DNS Rebinding (documented)**: The comment on line 62 of `scraper.rs` acknowledges TOCTOU between DNS check and actual TCP connection. An attacker controlling DNS could return a public IP for the check, then rebind to `127.0.0.1` for the actual connection.
+2. **IPv4-mapped IPv6 not checked**: Addresses like `::ffff:127.0.0.1` or `::ffff:10.0.0.1` are not explicitly blocked. Depending on the network stack, these could bypass the IPv4 private checks.
+3. **Cloud metadata endpoints**: `169.254.169.254` is blocked (link-local), but cloud providers often expose metadata at other addresses (e.g., `fd00:ec2::254` on AWS). The IPv6 ULA range (`fc00::/7`) is blocked, which covers some of these, but not all.
+
+**File**: `backend/src/services/scraper.rs` (lines 281-316)
+
+### Session Security: Solid Foundation, Missing Features
+
+Strengths:
+- Session tokens are 32 bytes of OS randomness (256 bits of entropy)
+- Only SHA-256 hashes stored in DB (database leak doesn't compromise sessions)
+- HttpOnly, SameSite=Lax cookies
+- CSRF via `X-Requested-With` header check
+
+**Gaps**:
+1. **No session sliding expiration**: The `last_active_at` column exists but is never updated. Sessions expire at a fixed time regardless of activity. This is a design choice, but it means active users get logged out after 30 days.
+2. **No concurrent session limits**: A user can have unlimited active sessions. A stolen session token remains valid for 30 days with no mechanism to detect anomalous session usage.
+3. **No session revocation UI**: Users cannot see or revoke their active sessions.
+4. **Magic link tokens in GET parameters**: The `GET /api/v1/auth/verify?token=...` endpoint puts the magic link token in the URL query string. This token ends up in browser history, server access logs, and potentially Referer headers. The POST endpoint exists as an alternative, but the email link uses GET.
+
+**File**: `backend/src/middleware/auth.rs`, `backend/src/handlers/auth.rs` (line 244)
+
+### Error Message Sanitization: Allowlist Would Be Safer
+
+The `sanitize_error_message` function (line 1291 of `synthesis.rs`) uses a blocklist approach -- checking for known sensitive patterns. This is inherently fragile:
+- What if an error contains `connection refused to internal-host.corp.net:5432`? It would not match any blocklist pattern and would be passed through.
+- The truncation at 200 chars still allows significant information leakage.
+
+A safer approach would be an allowlist of known-safe error patterns, defaulting to a generic message for anything unrecognized.
+
+---
+
+## Operational Risks
+
+### What Will Wake You at 3am
+
+#### 1. Database Fills Up (No Monitoring or Alerts)
+
+The Docker Compose configuration does not set any PostgreSQL `max_connections`, `shared_buffers`, or storage limits. The `postgres_data` volume grows without bound. The `shm_size: 128mb` is set, but there are no alerts on disk usage.
+
+**Risk scenario**: The `llm_call_log` table (storing full LLM prompts and responses as TEXT) grows past available disk. PostgreSQL starts rejecting writes. All generation jobs fail. The error appears as a generic "Internal error" to users.
+
+**File**: `docker-compose.yml`
+
+#### 2. Hung Generation Jobs Block Users
+
+If an LLM API call hangs (the `reqwest` timeout is 120 seconds, but the overall pipeline has no timeout), the user's generation slot is occupied. The `has_active_job` check prevents starting a new one. The only escape is the 1-hour job TTL, but during that hour the user is stuck.
+
+There is no admin endpoint to cancel a stuck job, and no watchdog to kill jobs that exceed a reasonable duration.
+
+**File**: `backend/src/services/synthesis.rs` (line 88, `JOB_TTL`)
+
+#### 3. Rate Limit Bypass on Restart
+
+All rate limiting state is in-memory. A server restart (or deployment) resets all rate limit counters. An attacker who discovers this can time their abuse to coincide with deployments.
+
+The `ProviderRateLimiter::reload_from_db` also resets all timestamps (line 182-191), meaning an admin updating rate limits for any provider resets the sliding window for that provider.
+
+**File**: `backend/src/services/rate_limiter.rs` (lines 182-191)
+
+#### 4. No Database Backups Configured
+
+The Docker Compose configuration provides no backup mechanism for the PostgreSQL volume. There is no `pg_dump` cron job, no WAL archiving, no replication.
+
+**Risk scenario**: The Docker volume is lost due to host failure. All user data, syntheses, encrypted API keys, and settings are permanently gone.
+
+---
+
+## UX Failure Modes
+
+### 1. Generation Takes 10+ Minutes
+
+With 10 sources, 15 links each, batch size 5, and LLM classification per article, the pipeline can make 30+ LLM calls (with rate limiting waits between batches). At 5-10 seconds per LLM call plus rate limit waits, a generation can easily take 5-15 minutes.
+
+The SSE progress stream provides percentage updates, but the percentages are approximate (hardcoded ranges: 25-65% for Phase 1 processing, 70-90% for Phase 2). A user seeing "Articles 3-5/87..." at 27% for several minutes may assume it is stuck.
+
+There is no estimated time remaining, no cancel button functionality, and if the browser tab is closed and reopened, the SSE reconnection has only 3 retries with exponential backoff (1s, 2s, 4s) before giving up permanently.
+
+### 2. All Sources Down
+
+If all user sources return errors during Phase 1, the pipeline silently moves to Phase 2 (web search fallback). If the user has no web search configured (no Brave key, LLM search disabled), the pipeline reaches the "no valid articles found" error at line 848. The error message ("Aucun article valide trouve. Verifiez vos sources et categories.") does not distinguish between "your sources are all down" and "we found articles but they were all filtered out."
+
+### 3. Error Messages Are in French Only
+
+All user-facing error messages in the backend are hardcoded in French. The CLAUDE.md mentions "i18n-ready" with French translations in the frontend, but the backend error messages bypass i18n entirely. A non-French-speaking user would see opaque French error messages.
+
+**Examples** from `synthesis.rs`: "Limite de requetes atteinte", "Aucun article valide trouve", "Erreur d'authentification avec le fournisseur IA."
+
+---
+
+## Edge Cases & Race Conditions
+
+### 1. Race Condition in Job Creation
+
+The `create_job` method (line 107 of `synthesis.rs`) iterates all entries in the `DashMap` to check if the user already has an active job, then inserts a new entry. This is not atomic. Two concurrent requests could both pass the check and create two jobs for the same user. The `trigger_generate` handler has a separate `has_active_job` check before `create_job`, making it doubly racy -- two requests could both pass `has_active_job`, then both call `create_job`.
+
+**Risk scenario**: User double-clicks the "Generate" button rapidly. Two generation jobs start simultaneously, consuming double the LLM API quota and potentially producing duplicate syntheses.
+
+**File**: `backend/src/handlers/generation.rs` (lines 43-62)
+
+### 2. Settings Race with Generation
+
+A user can update their settings while a generation is in progress. The generation pipeline reads settings once at the start (line 251). If the user changes their provider or model mid-generation, the change does not affect the running job (which is correct), but the user might be confused about which settings were used.
+
+More problematically, if the user deletes their API key while a generation is in progress, the generation will fail at the next LLM call with a decryption or not-found error.
+
+### 3. URL Normalization Edge Cases
+
+The `normalize_article_url` function (line 1069) strips tracking parameters and trailing slashes but does not handle:
+- **URL encoding differences**: `https://example.com/caf%C3%A9` vs `https://example.com/cafe` would be treated as different articles.
+- **www vs non-www**: `https://www.example.com/article` vs `https://example.com/article` are treated as different URLs.
+- **Port normalization**: `https://example.com:443/article` vs `https://example.com/article`.
+- **Case sensitivity in paths**: While the final `.to_lowercase()` handles this, some servers treat paths as case-sensitive.
+
+**File**: `backend/src/services/synthesis.rs` (lines 1069-1107)
+
+### 4. `sanitize_error_message` Truncation is Byte-Unsafe
+
+Line 1313: `&msg[..200]` slices at byte position 200. If the message contains multi-byte UTF-8 characters (common in French: accented characters), this will panic at runtime if byte 200 falls in the middle of a multi-byte character.
+
+**File**: `backend/src/services/synthesis.rs` (line 1313)
+
+**Risk scenario**: An LLM returns an error message containing accented characters. The error sanitization panics, crashing the generation task. The job is left in a "progress" state forever (until TTL cleanup).
+
+---
+
+## The Uncomfortable Questions
+
+### 1. What Happens When a User's API Key Is Compromised?
+
+If a user's LLM API key is compromised (e.g., database breach, even though keys are encrypted), there is no mechanism to:
+- Notify the user
+- Force re-encryption with a new master key
+- Audit which keys were decrypted and when
+- Rotate the master encryption key without downtime
+
+### 2. Is the Per-Article LLM Classification Sustainable?
+
+Each article in the pipeline requires an individual LLM call for classification and summarization. With 20 categories x 4 items = 80 target articles, and typical scrape success rates around 50%, the pipeline may attempt 160+ LLM calls per generation. At $0.01-0.10 per call, a single generation could cost $1.60-$16.00. Is the user aware of this cost?
+
+### 3. Why Is `SESSION_SECRET` Required but Never Used?
+
+The `SESSION_SECRET` environment variable is validated at startup (must be 64+ characters) but is never referenced in the actual session creation or verification code. Sessions use cryptographically random tokens with SHA-256 hashing -- the session secret is not used as an HMAC key or for any other purpose. It appears to be a vestigial requirement.
+
+**File**: `backend/src/config.rs` (line 15), search for `session_secret` usage outside config shows no references in auth or session code.
+
+### 4. What If the Article History Uniqueness Assumption Is Wrong?
+
+The migration `20260324000016_enrich_article_history.sql` drops the unique index on `(user_id, url_hash)` and replaces it with a non-unique index. This means the same URL can appear multiple times in the history table (e.g., once as "filtered_history", once as "used"). The `check_urls_exist` function returns a `HashSet` of matching hashes, so duplicates don't cause functional issues, but the table grows faster than expected and the ON CONFLICT DO NOTHING in `insert_urls` (which still exists) will never trigger since there is no unique constraint.
+
+### 5. Is the "Single-Tenant" Assumption Actually Enforced?
+
+The CLAUDE.md says "Single-tenant self-hosted (one instance per deployment)" but nothing in the code enforces this. Multiple users can register, each with their own settings and API keys. The rate limiting is global per-provider (not per-user by default). If two users generate simultaneously using the same LLM provider, they share the same rate limit bucket -- one user's generation can starve the other.
+
+---
+
+## Prioritized Risk Mitigation
+
+### P0 -- Fix Before Next Release
+
+1. **UTF-8 truncation panic in `sanitize_error_message`** (line 1313 of `synthesis.rs`): Replace `&msg[..200]` with a char-boundary-safe truncation. This is a runtime crash waiting to happen.
+
+2. **Race condition in job creation**: Make `has_active_job` + `create_job` atomic by using a single `DashMap` operation or a mutex. Otherwise double-generation is possible on double-click.
+
+### P1 -- Fix Within 30 Days
+
+3. **Add expired session cleanup**: Either a background tokio task that runs `sessions::delete_expired` periodically (e.g., every hour), or a database-level scheduled function. The function exists but is never called.
+
+4. **Add generation pipeline timeout**: Wrap the `run_generation_inner` call with `tokio::time::timeout(Duration::from_secs(600), ...)` to ensure no job runs indefinitely.
+
+5. **Add database backup strategy**: At minimum, add a `pg_dump` cron job or document a backup procedure. The current setup has zero data redundancy.
+
+### P2 -- Fix Within 90 Days
+
+6. **Stop storing master_encryption_key as a Clone-able String**: Either store it in an `Arc<MasterKey>` in `AppState` (created once at startup) instead of reconstructing it per-operation, or at minimum stop deriving `Clone` on `AppConfig` and use `Arc<AppConfig>` instead.
+
+7. **Add database growth monitoring**: Track table sizes for `article_history`, `llm_call_log`, and `syntheses`. Add cleanup policies or admin-facing growth dashboards.
+
+8. **Implement admin job management**: Add an admin endpoint to list active jobs and cancel stuck ones.
+
+9. **Address IPv4-mapped IPv6 in SSRF checks**: Add explicit checks for `::ffff:` prefixed private IPv4 addresses.
+
+### P3 -- Address Eventually
+
+10. **Deduplicate pipeline logic**: Extract the scrape-classify-summarize-filter pattern into a shared function called by both Phase 1, Phase 2 Brave, and Phase 2 LLM paths.
+
+11. **Add master key rotation tooling**: A CLI command that decrypts all keys with the old master key and re-encrypts with a new one.
+
+12. **Evaluate whether `SESSION_SECRET` is actually needed**: If it serves no purpose, remove it to reduce configuration burden. If it was intended for HMAC-signed sessions, either implement that or document why the current approach is preferred.
+
+13. **Add user-facing cost estimation**: Before starting generation, estimate the number of LLM calls and display an approximate cost based on the selected provider's pricing.
--- a/docs/audit/frontend-report.md
+++ b/docs/audit/frontend-report.md
@ -0,0 +1,279 @@
+# Frontend Audit Report (SolidJS)
+
+**Date:** 2026-03-26
+**Scope:** `frontend/src/` -- SolidJS + Tailwind CSS v4 application
+**Files reviewed:** 30+ files across pages, components, API modules, utils, i18n, and types
+
+---
+
+## Executive Summary
+
+The frontend is well-structured for a learning project of this scope. It demonstrates solid understanding of SolidJS fundamentals: lazy loading, proper `Show`/`For` usage, context-based auth, typed i18n, and a clean API client layer. The codebase is consistent in style and convention.
+
+That said, several structural issues deserve attention. The Settings page has grown to 1026 lines and mixes concerns that should be decomposed. Some `setTimeout` calls lack `onCleanup` guards, creating potential memory leaks. A handful of SolidJS-specific anti-patterns (destructured reactive props, `children: any`) reduce type safety and could cause subtle bugs. The i18n system, while type-safe at the key level, has an XSS surface via `innerHTML` usage in one location. None of these are critical in a single-tenant self-hosted deployment, but they represent the most impactful areas for improvement.
+
+**Overall quality: Good.** The top priorities are decomposing the Settings page, adding missing `onCleanup` guards, and fixing the `innerHTML` usage.
+
+---
+
+## SolidJS Patterns
+
+### Reactive Primitives
+
+**Signals and derived state -- well done.** The AuthContext correctly models derived signals (`isAuthenticated`, `isAdmin`) as plain functions over `user()`, avoiding redundant signals. The `createResource` usage in ApiKeyManager and Settings for provider/key data is appropriate.
+
+**`createEffect` usage -- mostly correct, one concern.** Effects in GenerateSynthesis for auto-redirect and error handling are clean. However, the effect in Settings.tsx (line 81-93) that checks provider availability reads both `providers()` and `settings()`, which means it will re-fire whenever either signal changes -- including when the user changes an unrelated setting field. This is harmless (idempotent) but wasteful. Using `on()` to track only `providers` would be more precise:
+
+```ts
+// Current: re-fires on any settings change
+createEffect(() => {
+  const providerList = providers();
+  const currentSettings = settings();  // unintended dependency
+  ...
+});
+
+// Better: only fires when providers load
+createEffect(on(() => providers(), (providerList) => { ... }));
+```
+
+**`createResource` vs `onMount` inconsistency.** Some data fetching uses `createResource` (ApiKeyManager, Settings providers), while most pages use `onMount` + manual signal (Home, Sources, LlmLogs, GenerateSynthesis). This is not wrong, but `createResource` would provide free loading/error states and refetch capabilities. The inconsistency suggests no team-wide convention was established.
+
+### Component Composition
+
+**`Show` and `For` -- correctly used throughout.** The callback form of `Show` (e.g., `<Show when={user()}>{(currentUser) => ...}`) is used in Navbar and MobileMenu, which is the correct SolidJS pattern for narrowing nullable values. `For` is used for all list rendering, never `.map()`.
+
+**Step status rendering in GenerateSynthesis -- could use `Switch/Match`.** Lines 335-357 use three consecutive `<Show>` blocks for the three possible step statuses. This is semantically an exclusive match and would be clearer as:
+
+```tsx
+<Switch>
+  <Match when={status() === 'done'}>
+    <CheckCircle ... />
+  </Match>
+  <Match when={status() === 'in-progress'}>
+    <Loader2 ... />
+  </Match>
+  <Match when={status() === 'pending'}>
+    <Circle ... />
+  </Match>
+</Switch>
+```
+
+The current approach renders correctly because only one condition is true at a time, but `Switch/Match` makes the mutual exclusion explicit and avoids evaluating all three conditions.
+
+### Performance Concerns
+
+**`completedSteps()` in GenerateSynthesis is computationally expensive per call.** This derived function (lines 126-160) iterates over all events and performs multiple `findIndex` calls on every render triggered by new SSE events. For the typical 4-step pipeline this is negligible, but the algorithmic complexity is O(steps * events). If the event stream grew large, this would become visible. Consider memoizing with `createMemo`.
+
+**`For` with reactive lookup in ApiKeyManager.** The `getKeyForProvider` function (line 33) scans the `apiKeys()` resource on every call. Since it is called inside the `For` loop's render function, it re-evaluates whenever `apiKeys` changes, which is correct. However, the returned value is not a signal -- it is a plain value computed from reactive data accessed during render. This works in SolidJS because `For` tracks the parent signal, but it means the entire provider list re-renders when any key changes. For 3 providers this is negligible.
+
+**No unnecessary re-renders detected.** The project correctly avoids the React-style anti-pattern of destructuring signals at the component top level. Signal reads are deferred to JSX expressions (e.g., `{settings().theme}` rather than `const theme = settings().theme` outside JSX).
+
+---
+
+## Component Architecture
+
+### Oversized Components
+
+**Settings.tsx (1026 lines) is the primary concern.** This single component handles:
+- Settings form (theme, age, items, articles, batch size, history days)
+- Provider selection with auto-detection and warning
+- Brave Search key management (save, test, delete, toggle)
+- Search agent behavior textarea
+- Category list with add/remove
+- Rate limit configuration
+- JSON export/import with optional API key inclusion
+- API key manager delegation
+
+This should be decomposed into at least 5 sub-components:
+1. `SettingsForm` -- basic numeric/text fields
+2. `ProviderSelector` -- provider dropdown, model dropdowns, web search badge
+3. `BraveSearchSection` -- key management + toggle (this is essentially a duplicate of ProviderKeyCard logic)
+4. `CategoryManager` -- category list CRUD
+5. `RateLimitSection` -- rate limit inputs and reset
+
+The Brave Search key management (lines 57-63, 134-179, 582-678) duplicates the same save/test/delete pattern already implemented in `ProviderKeyCard` within `ApiKeyManager.tsx`. This is a clear DRY violation -- Brave Search should be treated as another provider key card.
+
+**Other pages are appropriately sized.** Home (240 lines), GenerateSynthesis (404 lines), Sources (~300 lines), and LlmLogs (159 lines) are all reasonable for their complexity.
+
+### State Management
+
+**Auth context is clean and minimal.** It provides the right abstraction surface (`user`, `loading`, `isAuthenticated`, `isAdmin`, `logout`, `refreshUser`). The `refreshUser` function is available but appears unused in the current codebase -- verify if this is needed.
+
+**Toast context is well-implemented.** The module-level `activeTimers` Map for managing auto-dismiss timers is pragmatic. The `aria-live="polite"` on the toast container is a good accessibility touch.
+
+**No global state beyond auth and toast.** Settings and other data are fetched per-page via `onMount`. This is appropriate for the application's complexity level -- introducing a global store would be premature.
+
+**The `deleteTimers` signal in Home.tsx stores `setTimeout` handles in a reactive signal.** This works but is unusual. The timers are stored so they can be cleared on the second (confirming) click. However, these timers are never cleaned up on component unmount -- if the user navigates away while a delete confirmation is pending, the timeout callback will still fire and call `setDeletingId` on an unmounted component. This should use `onCleanup`.
+
+### Reusability
+
+**`Button` component exists but is rarely used.** A well-typed `Button` component with variant support (`primary`, `secondary`, `danger`), loading state, and icon slot lives in `components/ui/Button.tsx`. However, across the pages audited, buttons are overwhelmingly created with inline Tailwind classes rather than using this component. This leads to significant class string duplication. For example, the indigo primary button pattern appears in at least 12 places with minor variations.
+
+**`LoadingSpinner` is consistently reused** across all pages -- good.
+
+**`ProviderKeyCard` is a good reusable pattern** but its logic is duplicated for Brave Search in Settings.tsx as noted above.
+
+**No shared form input components.** Every input field repeats the full Tailwind class string: `"shadow-sm focus:ring-indigo-500 focus:border-indigo-500 block w-full sm:text-sm border-gray-300 rounded-md py-2 px-3 border"`. A `TextInput`, `NumberInput`, and `Select` component would reduce duplication and ensure visual consistency.
+
+---
+
+## Type Safety
+
+**Overall: strong.** The codebase makes effective use of TypeScript across all layers.
+
+**Strengths:**
+- All API request/response types are defined in `types.ts` with proper interfaces
+- The `isApiError` type guard is used consistently for error handling
+- The i18n system uses `TranslationKey` (derived from `keyof typeof fr`) to make translation keys type-checked at compile time
+- `satisfies` is used in the API client for throw expressions (line 60 of client.ts) and API calls (syntheses.ts line 86)
+- Props interfaces are defined for all components
+
+**Issues:**
+
+1. **`children: any` in Layout and AdminLayout.** Both layout components type their children prop as `any` instead of using `ParentComponent` or `JSX.Element`. This loses type safety on what can be passed as children. The fix is trivial: use `ParentComponent` as done in App.tsx's `ProtectedLayout`.
+
+2. **Non-null assertion in ApiKeyManager.** Line 249 uses `props.apiKey!.key_prefix` inside a `<Show when={isConfigured()}>` block. While logically safe (the Show guard ensures the value exists), the compiler cannot verify this. Using `Show`'s callback form (`{(key) => key().key_prefix}`) would eliminate the assertion.
+
+3. **`ApiClient.request` returns `undefined as T` for 204 responses (line 76 of client.ts).** This is a type lie -- the caller may not expect `undefined`. The methods that return `void` (like `delete`) work correctly because `undefined` satisfies `void`, but if a caller expected a non-void return from a 204, it would fail at runtime. Consider constraining 204-returning methods to `Promise<void>` at the call site.
+
+4. **`isApiError` type guard is loose.** It checks for the presence of `status` and `message` properties but does not verify their types. Any object with these two keys would pass. Adding `typeof error.status === 'number'` would tighten it.
+
+5. **`DEFAULT_SETTINGS` contains hardcoded French text** (category names, search behavior). This is a content/i18n boundary violation -- if the app were ever localized, these defaults would need to come from the i18n system.
+
+---
+
+## Code Organization
+
+### File Structure -- Good
+
+The project follows a clear convention:
+- `pages/` for route-level components
+- `components/` for shared UI
+- `api/` for HTTP communication
+- `utils/` for pure helpers
+- `contexts/` for global state
+- `i18n/` for translations
+- `types.ts` as the single type definition file
+
+This separation is clean and navigable.
+
+### Import Patterns -- Consistent
+
+All imports use the `~/` path alias (configured in vite.config.ts), avoiding fragile relative paths. Imports are grouped logically (solid-js, router, icons, local modules) though not enforced by a linter rule.
+
+### Separation of Concerns -- Mostly Good
+
+**API layer is well-isolated.** The `client.ts` singleton handles all HTTP concerns (CSRF headers, credentials, 401 redirect, error parsing). Individual API modules (`auth.ts`, `settings.ts`, etc.) are thin wrappers that only map routes to typed calls. This is excellent.
+
+**Business logic leaks into pages.** The Settings page contains import/export logic (JSON serialization, file download, file parsing), Brave Search key management, and category validation. These should be extracted into utility functions or a settings service module.
+
+**The `syntheses.ts` API module contains `triggerDownload` and `fetchFile` utility functions** that are not synthesis-specific. These should live in a shared `utils/download.ts` module so other features (e.g., CSV export in Sources) could reuse them.
+
+### Missing Error Boundaries Per-Route
+
+The application has a single top-level `ErrorBoundary` wrapping the entire router. An error in any page will replace the entire app with the error screen. Consider wrapping each route's content with a more localized error boundary that preserves the navigation bar.
+
+---
+
+## UI/UX Patterns
+
+### Error Handling -- Consistent But Could Be Unified
+
+Every page implements its own error display pattern with slight variations:
+- Home: red banner with `border border-red-200`
+- GenerateSynthesis: red banner with `border-l-4 border-red-400`
+- LlmLogs: red box with `border border-red-200`
+
+A shared `ErrorAlert` component would standardize these.
+
+Error handling across the codebase consistently uses the `isApiError` type guard to surface server messages or fall back to i18n strings. This is a good pattern.
+
+**`console.error` usage in Sources.tsx** (lines 103, 174) and ArticleHistory.tsx (line 104) should be replaced with user-visible error feedback (toast or inline message). Errors silently logged to the console are invisible to the user.
+
+### Loading States -- Good
+
+All pages that fetch data show `<LoadingSpinner />` during loading via `<Show when={!loading()} fallback={<LoadingSpinner />}>`. The spinner supports `fullPage` and `size` variants.
+
+### Form Handling
+
+**No form validation library.** Settings and Sources pages implement validation inline. This is acceptable at current scale but becomes harder to maintain as forms grow. The Settings page in particular has no client-side validation -- invalid values (e.g., negative `max_age_days`) are sent to the server.
+
+**Controlled inputs use `value` + `onInput`.** This is the correct SolidJS pattern (not `onChange` which fires on blur in some browsers). Numeric inputs use `parseInt(...) || defaultValue` for fallback, which is pragmatic but means a user typing "0" gets the default instead.
+
+**The import flow in Settings does not auto-save.** After importing a JSON file, the user sees a success message saying "N'oubliez pas d'enregistrer" (don't forget to save). This is a potential UX pitfall -- users may import and navigate away, losing changes.
+
+### Accessibility
+
+**Good:**
+- `aria-label` on icon-only buttons (mobile menu toggle, show/hide key)
+- `aria-live="polite"` on toast container
+- `aria-hidden="true"` on decorative icons
+- `role="alert"` on toast items
+- Semantic HTML (`<nav>`, `<main>`, `<label for="...">`)
+- Focus ring styles on all interactive elements
+
+**Needs attention:**
+- The `<details>` elements in LlmLogs use `list-none` to hide the default marker but provide no accessible name or ARIA attributes.
+- The delete confirmation pattern (click once to arm, click again to confirm) has no visible text change on Home page cards -- only a color shift and a tooltip change. Users relying on screen readers would miss this state change.
+- No skip-to-content link.
+- Color alone is used to distinguish toast types (success=green, error=red, info=blue). The icons provide a secondary cue, which is good, but there is no text prefix like "Error: " or "Success: ".
+
+### Security Concern: innerHTML Usage
+
+**GenerateSynthesis.tsx line 249** uses `innerHTML` to render a translated string:
+
+```tsx
+<p innerHTML={t('generate.description', { days: ..., theme: ... })} />
+```
+
+The `theme` value comes from `settings().theme`, which is user-controlled content stored in the database. If a user sets their theme to `<img src=x onerror=alert(1)>`, this would execute arbitrary JavaScript. In a single-tenant self-hosted app this is self-XSS (the user can only attack themselves), but it is still a bad practice. Replace `innerHTML` with safe text interpolation, or sanitize the value.
+
+---
+
+## Prioritized Recommendations
+
+### Priority 1 -- Fix Bugs and Security Issues
+
+1. **Remove `innerHTML` usage in GenerateSynthesis.tsx.** Replace with safe text rendering. The translation string uses `{days}` and `{theme}` placeholders -- restructure it to use separate `<span>` elements or a template that does not require HTML parsing.
+
+2. **Add `onCleanup` for `setTimeout` in Home.tsx.** The `deleteTimers` signal stores timeout handles that are never cleared on unmount. Add an `onCleanup` that iterates and clears all active timers.
+
+3. **Add `onCleanup` for `setTimeout` in ArticleHistory.tsx** (line 95). The 3-second confirmation timeout is not cleaned up on unmount.
+
+4. **Add `onCleanup` for `setTimeout` in SynthesisDetail.tsx** (line 150). The 5-second email success timeout is not cleaned up on unmount.
+
+### Priority 2 -- Structural Improvements
+
+5. **Decompose Settings.tsx** into sub-components (ProviderSelector, BraveSearchSection, CategoryManager, RateLimitSection). Target: each sub-component under 200 lines.
+
+6. **Eliminate Brave Search key management duplication** by reusing `ProviderKeyCard` from ApiKeyManager.tsx or extracting the shared pattern.
+
+7. **Replace `children: any` with `ParentComponent`** in Layout.tsx and AdminLayout.tsx.
+
+8. **Use the existing `Button` component** more broadly, or remove it if the team prefers inline styling. Current inconsistency is confusing.
+
+### Priority 3 -- Code Quality
+
+9. **Standardize data fetching.** Pick either `createResource` or `onMount` + signals and apply consistently. `createResource` is recommended for simple GET-on-mount patterns.
+
+10. **Extract shared form input components** (`TextInput`, `NumberInput`, `Select`) to reduce Tailwind class duplication.
+
+11. **Create a shared `ErrorAlert` component** for consistent error display across pages.
+
+12. **Replace `console.error` calls** in Sources.tsx and ArticleHistory.tsx with user-visible feedback (toast or inline error).
+
+13. **Use `Switch/Match`** instead of multiple `<Show>` blocks for mutually exclusive conditions (GenerateSynthesis step icons).
+
+### Priority 4 -- Polish
+
+14. **Move `triggerDownload` and `fetchFile`** from `api/syntheses.ts` to a shared utility module.
+
+15. **Tighten the `isApiError` type guard** to check `typeof status === 'number'`.
+
+16. **Extract business logic** from Settings.tsx (JSON export/import, category validation) into dedicated utility functions.
+
+17. **Consider route-level error boundaries** to preserve the navbar on page errors.
+
+18. **Add i18n-aware defaults** for `DEFAULT_SETTINGS` so category names and search behavior are not hardcoded in French.
+
+19. **Narrow the `createEffect` in Settings.tsx** (line 81) to track only `providers()` using `on()`.
--- a/docs/audit/rust-expert-report.md
+++ b/docs/audit/rust-expert-report.md
@ -0,0 +1,309 @@
+# Rust Backend Audit Report
+
+**Date:** 2026-03-26
+**Scope:** `backend/src/` -- all services, handlers, middleware, models, DB layer, and integration tests.
+**Rust edition:** 2021 | **Key crates:** axum 0.8, sqlx 0.8, tokio 1, reqwest 0.12, aes-gcm 0.10, dashmap 6
+
+---
+
+## Executive Summary
+
+The codebase is remarkably well-structured for a learning project. It demonstrates strong idiomatic Rust across most dimensions: proper error propagation with `?`, thoughtful enum design (`AppError`), `Arc`-based shared state, and zero `unsafe` code. The architecture cleanly separates concerns (handlers / services / db / models), and the test suite covers all critical paths.
+
+That said, the audit identifies **three categories of findings**:
+
+1. **Code quality issues** (moderate priority): Significant code duplication in the synthesis pipeline, a potential panic in `sanitize_error_message`, and unnecessary allocations in hot paths.
+2. **Design opportunities** (lower priority): Missing newtype patterns, opportunities to reduce `Clone` overhead via `Arc<str>`, and some functions with too many parameters that would benefit from builder/struct patterns.
+3. **Concurrency subtlety** (low probability, worth noting): A TOCTOU race in `JobStore::create_job` between checking for existing jobs and inserting a new one.
+
+No critical security issues were found. SQL injection is comprehensively prevented via sqlx parameterized queries. Secrets are never exposed in error responses. SSRF protection is thorough with a documented residual risk.
+
+---
+
+## Idiomatic Rust
+
+### Ownership & Borrowing
+
+**Strengths:**
+- `MasterKey` intentionally omits `Clone` to prevent key material duplication -- an excellent security-conscious design (`encryption.rs:18`).
+- Functions consistently accept `&str` / `&[u8]` over owned types where possible (e.g., `encrypt(master_key: &MasterKey, plaintext: &str)`).
+- The `AuthUser` extractor cleanly borrows from request parts and clones only the necessary fields.
+
+**Findings:**
+
+1. **`AppConfig` holds secrets as `String` and derives `Clone`** (`config.rs:9-10`). Every `AppState` clone copies the `master_encryption_key` and `session_secret` in full. Wrapping `AppConfig` in `Arc<AppConfig>` would eliminate these copies and reduce the blast radius of the key material in memory.
+
+2. **`extract_session_token` allocates a `format!` string on every cookie segment** (`middleware/auth.rs:40-41`). The two `format!("{}=", cookie_name)` calls create heap-allocated strings for every iteration. These could be computed once before the iterator chain:
+   ```rust
+   let prefix = format!("{}=", cookie_name);
+   cookie_header.split(';').map(|s| s.trim())
+       .find(|s| s.starts_with(&prefix))
+       .and_then(|s| s.strip_prefix(&prefix))
+       .map(|s| s.to_string())
+   ```
+
+3. **Excessive cloning in synthesis pipeline** (`synthesis.rs`, ~59 `.clone()` calls). Many of these are necessary for spawned tasks, but several could be avoided:
+   - `model_research.clone()` is cloned into every spawned task. Wrapping it in `Arc<str>` would make clones zero-cost.
+   - `classification_categories.clone()` clones a `Vec<String>` per batch. `Arc<Vec<String>>` would be preferable.
+   - `classify_schema.clone()` clones a `serde_json::Value` per article. As an immutable value, this should be `Arc<Value>`.
+
+### Error Handling
+
+**Strengths:**
+- `AppError` is a clean, well-designed enum with `thiserror` derivation and `IntoResponse` implementation.
+- Internal errors are properly logged but never exposed to clients (`errors.rs:66-73`).
+- The `From<sqlx::Error>` and `From<anyhow::Error>` conversions are correct and idiomatic.
+- `.ok()` is used appropriately for fire-and-forget operations like logging (`synthesis.rs:953`).
+
+**Findings:**
+
+4. **`msg[..200]` in `sanitize_error_message` can panic on non-ASCII** (`synthesis.rs:1313`). If the error message contains multi-byte UTF-8 characters and the 200th byte falls in the middle of a character, this will panic at runtime. Fix: use `msg.chars().take(200).collect::<String>()` or `msg.char_indices()` to find a safe boundary.
+
+5. **`Selector::parse("a[href]").unwrap()` in production code** (`source_scraper.rs:73,130`). While `Selector::parse` with a static, known-valid selector string will never fail in practice, it would be more idiomatic to use `lazy_static!` or `std::sync::LazyLock` to parse the selector once and reuse it. This also avoids repeated parsing overhead.
+
+6. **`unwrap_or("Unknown error")` in error mappers** (`gemini.rs:156`, `openai.rs:163`, `anthropic.rs:199`). These are safe because the value is only used for logging, but using `unwrap_or_default()` would be slightly more consistent with the rest of the codebase.
+
+### Iterator Patterns
+
+**Strengths:**
+- Excellent use of iterator chains throughout: `filter_map`, `find_map`, `chain`, `collect`, `take`, `any`, `position`.
+- `extract_links_from_html` and `extract_links_as_pairs` are clean, functional-style implementations.
+- `sanitize_json_null_bytes` is an elegant recursive function using `into_iter().map().collect()`.
+
+**Findings:**
+
+7. **`insert_urls` in `article_history.rs:78-87` executes individual INSERT queries in a loop.** For bulk inserts, this should use a single query with `UNNEST` or build a multi-row INSERT, which would be both faster and more idiomatic:
+   ```rust
+   // Instead of N individual queries:
+   sqlx::query("INSERT INTO ... SELECT * FROM UNNEST($1::uuid[], $2::text[], $3::text[]) ...")
+   ```
+
+### Trait Usage
+
+**Strengths:**
+- `LlmProvider` trait (`llm/mod.rs:22-39`) is well-designed: `async_trait` is correctly applied, `Send + Sync` bounds enable use behind `Arc<dyn LlmProvider>`.
+- `FromRequestParts` implementation for `AuthUser` and `AdminUser` is textbook Axum usage.
+- `TryFrom<SettingsRow> for UserSettings` cleanly separates DB row types from domain types.
+
+**Findings:**
+
+8. **`async_trait` could be removed in favor of native async traits.** Since Rust 1.75+ (stable), async traits work natively for `dyn`-compatible traits. The `async_trait` crate adds hidden `Box::pin` allocations on every call. However, the current edition is 2021, and `async_trait` is still the pragmatic choice for trait objects in some contexts. Worth revisiting when upgrading to edition 2024.
+
+---
+
+## Async Patterns
+
+### Concurrency
+
+**Strengths:**
+- `JoinSet` is used correctly for bounded concurrent scraping (`synthesis.rs:296-372`). The manual "seed initial + spawn on completion" pattern correctly limits concurrency to 5.
+- `tokio::sync::watch` is an excellent choice for SSE progress reporting -- exactly the right channel type for "latest value" semantics.
+- The `_rx` field in `JobEntry` (`synthesis.rs:71`) is a smart pattern: keeping a receiver alive prevents `send()` from returning an error.
+
+**Findings:**
+
+9. **TOCTOU race in `JobStore::create_job`** (`synthesis.rs:108-145`). The function iterates all entries to check for existing active jobs, then inserts a new entry. Between the check and the insert, another thread could create a job for the same user. While unlikely (generation is user-initiated and the check in `trigger_generate` provides a first-line defense), the `DashMap` entry API could enforce atomicity:
+   ```rust
+   // Atomic check-and-insert would prevent the race
+   ```
+   In practice this is low-risk because the handler also checks `has_active_job` before calling `create_job`, and the window is microseconds.
+
+10. **Spawned tasks silently swallow panics** (`generation.rs:73-75`). The `tokio::spawn` for `run_generation` does not handle `JoinError::Panic`. If the generation pipeline panics (e.g., due to finding #4 above), the spawned task silently dies, and the SSE stream never sends a completion/error event. The client would hang indefinitely. Consider:
+    ```rust
+    tokio::spawn(async move {
+        let result = tokio::spawn(synthesis::run_generation(...)).await;
+        if let Err(join_error) = result {
+            tx.send(ProgressEvent::Error { message: "Internal error".into() }).ok();
+        }
+    });
+    ```
+
+### Error Propagation in Spawned Tasks
+
+**Strengths:**
+- `run_generation` wraps `run_generation_inner` cleanly, converting `Result` into progress events (`synthesis.rs:213-230`).
+- Error sanitization prevents API key leakage in SSE error events (`synthesis.rs:1291-1317`).
+
+**Findings:**
+
+11. **`check_rate_limit` uses a busy-wait retry loop with `tokio::time::sleep`** (`synthesis.rs:1029-1055`). While the sleep prevents CPU spinning, the `check()` call *consumes* a rate limit slot even when the request is denied. On each loop iteration, `check()` is called, which records a timestamp if allowed but also modifies state. This means that repeated denied checks do not consume slots (only successful ones do), so this is actually correct -- but the code comment could be clearer about why `check()` is safe to call in a loop.
+
+### Channel Usage
+
+**Strengths:**
+- `watch::channel` is the ideal choice for progress events: multiple subscribers, always the latest value, no backpressure needed.
+- `WatchStream` adapter in the SSE handler (`generation.rs:113`) correctly bridges the watch channel to an SSE stream.
+- Keepalive pings every 15s prevent proxy timeouts (`generation.rs:126`).
+
+---
+
+## Type System
+
+### Missing Newtype Opportunities
+
+12. **User IDs, Job IDs, and Synthesis IDs are all `Uuid`** throughout the codebase. A newtype pattern would prevent accidentally passing a `user_id` where a `job_id` is expected:
+    ```rust
+    pub struct UserId(pub Uuid);
+    pub struct JobId(pub Uuid);
+    pub struct SynthesisId(pub Uuid);
+    ```
+    The `trace_article` function takes 3 `Uuid` parameters (`synthesis.rs:927-930`), making it easy to swap `user_id` and `job_id` silently.
+
+13. **Provider names are bare `String`/`&str`** everywhere. A `ProviderName` enum (`Gemini | OpenAI | Anthropic`) would eliminate the catch-all `_ =>` branches in match statements and make invalid provider names a compile-time error rather than a runtime error.
+
+### Type Safety Gaps
+
+14. **`RateLimitConfigRow` casts `i32` to `u32`/`u64` without validation** (`rate_limiter.rs:188-189`):
+    ```rust
+    max_requests: row.max_requests as u32,
+    time_window: Duration::from_secs(row.time_window_seconds as u64),
+    ```
+    Negative values from the database would silently wrap to large positive values. A `TryFrom` or explicit validation would be safer.
+
+15. **`UserRateLimitEntry::new` casts `i32` to `usize`/`u64`** (`app_state.rs:46-48`) with the same wrapping risk for negative values.
+
+---
+
+## Memory & Performance
+
+### Unnecessary Allocations
+
+16. **`format!("category_{}", i)` is computed repeatedly** in inner loops (`synthesis.rs:514-518, 720-725`). These category keys are deterministic and could be pre-computed once into a `Vec<String>` before the main loop.
+
+17. **`body_text.chars().take(500).collect::<String>()`** (`synthesis.rs:467,676`). This allocates a new string for each article. Using `&body_text[..find_char_boundary(body_text, 500)]` would avoid the allocation entirely (with a helper to find a safe UTF-8 boundary).
+
+18. **`hash_article_url` recomputes the hash multiple times for the same URL** during history filtering (`synthesis.rs:377-386`). The hash is computed once in line 377 for the batch, then again in line 381 for each filtered URL. This is redundant -- the batch result could use a `HashMap<String, String>` (hash -> url) for O(1) lookups.
+
+19. **`Selector::parse` is called per-invocation** in `extract_page_title`, `detect_soft_404`, `detect_canonical_error`, etc. (`scraper.rs:325-410`). These CSS selectors are constants that should be parsed once and cached. Using `std::sync::LazyLock` (stable in Rust 1.80) or `once_cell::sync::Lazy`:
+    ```rust
+    static TITLE_SEL: LazyLock<Selector> = LazyLock::new(|| Selector::parse("title").unwrap());
+    ```
+
+### Clone Efficiency
+
+20. **`AppConfig` is cloned when creating `AppState`** (`main.rs:59`), copying all `String` fields including the `master_encryption_key`. Since `AppConfig` is effectively immutable after startup, wrapping it in `Arc` would be more appropriate.
+
+21. **`settings.categories.clone()` at `synthesis.rs:261`** clones a `Vec<String>` that is only used to build `classification_categories`. This could use a reference or be constructed more efficiently:
+    ```rust
+    let classification_categories: Vec<&str> = settings.categories.iter()
+        .map(|s| s.as_str())
+        .chain(std::iter::once("Autre"))
+        .collect();
+    ```
+    (Though this would require adjusting downstream code that expects owned strings.)
+
+---
+
+## Safety Assessment
+
+### Unwrap Usage
+
+**No `unwrap()` calls exist in production code paths** (non-test, non-`cfg(test)` code), with two exceptions:
+
+22. **`Selector::parse("a[href]").unwrap()`** in `source_scraper.rs:73,130` -- These are static selectors that will never fail. Acceptable but could use `LazyLock` for clarity.
+
+23. **`unwrap_or("")` and `unwrap_or("Unknown error")`** in LLM error mappers -- These are safe fallback patterns.
+
+**`expect()` usage** is limited to:
+- `main.rs:122,128` -- signal handler installation (correct; failure here is unrecoverable)
+- `auth.rs:47,104` -- `TimeDelta::try_minutes/try_days` with constant values (correct; these are compile-time-knowable values)
+
+### Input Validation
+
+**Strengths:**
+- Email validation via the `email_address` crate.
+- Turnstile captcha verification on auth endpoints.
+- CSRF protection via `X-Requested-With` header check.
+- URL scheme validation and SSRF prevention with comprehensive IPv4/IPv6 coverage.
+- API key prefix extraction prevents full key exposure.
+- Body size limits on scraper responses (`MAX_BODY_SIZE = 5MB`).
+
+**Findings:**
+
+24. **No input length validation on LLM prompts.** The `build_search_prompt` and `build_article_classify_prompt` functions accept arbitrary-length category names and theme strings from user settings. An extremely long theme string could lead to excessive LLM token usage. Consider capping `settings.theme` length at the handler/validation layer.
+
+25. **`source_url` is not validated against SSRF before fetching in `source_scraper.rs:41-48`.** The `extract_article_links` function uses the shared `http_client` (which is the scraper client with SSRF-safe redirect handling), but the initial DNS resolution check (`check_ssrf`) is not performed before the GET request. The redirect policy catches private IPs on redirects, but the initial request to a private IP would succeed. The `extract_article_links_with_llm` function has the same issue.
+
+### SQL Safety
+
+**All SQL queries use parameterized bindings via sqlx** (`$1`, `$2`, etc.). No string interpolation is used in any SQL query. The `rate_limiter.rs:173-178` direct query is also properly parameterized.
+
+**sqlx compile-time query checking** is not used (`query_as::<_, Type>(...)` with runtime string queries rather than `sqlx::query!` macro). This is a deliberate choice documented in CLAUDE.md ("runtime-checked queries"), but means SQL errors are caught at runtime rather than compile time.
+
+---
+
+## Testing Assessment
+
+**Strengths:**
+- **Unit tests** are comprehensive across all modules: `errors.rs` (7 tests), `encryption.rs` (8 tests), `rate_limiter.rs` (13 tests), `scraper.rs` (20+ tests), `source_scraper.rs` (12 tests), `synthesis.rs` (10+ tests), all LLM providers (8-10 tests each), `schema.rs` (10 tests), `config.rs` (7 tests).
+- **Integration tests** cover the full HTTP stack: auth flow, CSRF protection, settings, sources, admin, syntheses, API keys, export.
+- **Security-specific tests**: SSRF IP checks (14 tests), soft-404 detection (6+ tests), error message sanitization, internal error hiding.
+- **Edge cases** are well-covered: empty inputs, wrong keys, corrupted data, missing fields.
+
+**Findings:**
+
+26. **No tests for the synthesis generation pipeline itself.** The `run_generation_inner` function -- the most complex code in the project (~700 lines) -- has no unit or integration tests. Its helper functions are tested (`parse_llm_output`, `normalize_article_url`, `rotate_sources`, `sanitize_json_null_bytes`, `sanitize_error_message`), but the pipeline orchestration is untested. This is understandable given the complexity of mocking LLM providers and scraper results, but introducing a trait-based architecture for the scraper and DB layer would enable integration testing.
+
+27. **Integration tests skip silently when `TEST_DATABASE_URL` is not set.** Each test starts with `if !require_test_db() { return; }`. This means CI must explicitly set the env var, and a misconfigured CI pipeline would report all tests as passing. Consider using `#[cfg(feature = "integration")]` or a custom test harness to make the requirement explicit.
+
+28. **No concurrent/stress tests for rate limiters.** The `RateLimiter` and `ProviderRateLimiter` are tested for correctness but not for thread safety under contention. Given they use `DashMap` for lock-free access, a multi-threaded stress test would increase confidence.
+
+29. **Missing negative tests for `create_provider` with empty API key.** The factory tests check unknown providers but not empty or whitespace-only API keys.
+
+---
+
+## Prioritized Recommendations
+
+### High Priority (correctness/safety)
+
+| # | Finding | File | Fix |
+|---|---------|------|-----|
+| 4 | `msg[..200]` panics on multi-byte UTF-8 | `synthesis.rs:1313` | Use `msg.chars().take(200)` or `char_indices()` |
+| 10 | Spawned generation task swallows panics | `generation.rs:73-75` | Wrap in double-spawn or catch `JoinError` |
+| 25 | Missing SSRF check on initial source page fetch | `source_scraper.rs:41` | Call `check_ssrf` before `http_client.get()` |
+
+### Medium Priority (performance/maintainability)
+
+| # | Finding | File | Fix |
+|---|---------|------|-----|
+| 1 | `AppConfig` cloned with secrets | `config.rs`, `main.rs:59` | Wrap in `Arc<AppConfig>` |
+| 3 | Excessive cloning in synthesis pipeline | `synthesis.rs` (59 clones) | Use `Arc<str>`, `Arc<Value>`, `Arc<Vec<String>>` for shared immutables |
+| 7 | N+1 individual INSERTs for article history | `article_history.rs:78-87` | Use batch INSERT or `UNNEST` |
+| 19 | `Selector::parse` called per-invocation | `scraper.rs` | Use `LazyLock` for static selectors |
+| 26 | No tests for generation pipeline | `synthesis.rs` | Introduce trait-based mocks for scraper/LLM/DB |
+
+### Low Priority (code quality/future-proofing)
+
+| # | Finding | File | Fix |
+|---|---------|------|-----|
+| 2 | `format!` allocation on every cookie segment | `middleware/auth.rs:40-41` | Pre-compute prefix string |
+| 9 | TOCTOU race in `JobStore::create_job` | `synthesis.rs:108-145` | Use DashMap entry API for atomic check-and-insert |
+| 12 | UUID parameters lack type distinction | Throughout | Introduce newtype wrappers |
+| 13 | Provider names are bare strings | Throughout | Use `ProviderName` enum |
+| 14 | `i32` to `u32` cast without validation | `rate_limiter.rs:188` | Use `TryFrom` or add DB constraint |
+| 16 | Repeated `format!("category_{}", i)` | `synthesis.rs` | Pre-compute category keys |
+| 24 | No length validation on LLM prompt inputs | Settings/synthesis | Cap `theme` and category name lengths |
+
+### Code Duplication Worth Addressing
+
+The **classify/summarize batch processing** logic appears twice in `synthesis.rs`: once for Phase 1 (personalized sources, lines ~458-541) and once for Phase 2 Brave search path (lines ~667-749). These are nearly identical blocks of ~80 lines each. Extracting a shared function like `classify_batch(articles, provider, model, categories, ...)` would significantly reduce the maintenance burden and risk of the two paths diverging.
+
+---
+
+## Appendix: Dependency Assessment
+
+| Crate | Version | Assessment |
+|-------|---------|------------|
+| `axum` | 0.8 | Current, well-maintained |
+| `sqlx` | 0.8 | Current, runtime-checked queries are a deliberate choice |
+| `tokio` | 1 | Standard async runtime, full features enabled |
+| `reqwest` | 0.12 | Current |
+| `aes-gcm` | 0.10 | Current, correct choice for authenticated encryption |
+| `zeroize` | 1 | Correctly applied to `MasterKey` |
+| `dashmap` | 6 | Good choice for concurrent maps, correctly used |
+| `async-trait` | 0.1 | Could be replaced by native async traits (Rust 1.75+) |
+| `anyhow` + `thiserror` | 1 + 2 | Standard error handling combo, correctly used |
+| `scraper` | 0.22 | HTML parsing, adequate for the use case |
+| `rand` | 0.8 | Note: 0.9 is available, but 0.8 is still widely used |
+
+No unnecessary or abandoned dependencies were identified. The dependency set is lean and focused.
--- a/docs/audit/tech-lead-report.md
+++ b/docs/audit/tech-lead-report.md
@ -0,0 +1,412 @@
+# Tech Lead Code Review Report
+
+**Date:** 2026-03-26
+**Codebase:** AI Weekly Synth (Rust/Axum + SolidJS)
+**Total Backend LOC:** ~14,263 (Rust)
+**Total Frontend LOC:** ~10,622 (TypeScript/TSX)
+
+---
+
+## Executive Summary
+
+AI Weekly Synth is a well-structured, well-documented codebase with clear module boundaries and strong security practices (SSRF prevention, AES-256-GCM encryption, anti-enumeration auth). The code quality is above average for a learning project.
+
+The primary concern is **`synthesis.rs` at 1,686 lines**, which contains a monolithic pipeline function (`run_generation_inner`, lines 243-890) spanning ~650 lines with significant copy-pasted code blocks between Phase 1 (personalized sources), Phase 2 Brave Search, and Phase 2 LLM Search. This file alone accounts for 12% of all backend code and is the single biggest maintainability risk.
+
+Secondary concerns include: triple-duplication of settings field lists across `UserSettings`, `SettingsResponse`, and `UpdateSettingsRequest`; nearly identical error-mapping functions across all three LLM providers; and a 1,025-line Settings.tsx that could benefit from component extraction.
+
+The codebase is otherwise clean: naming is consistent, comments are useful without being excessive, the error handling is centralized, and the module structure follows idiomatic Rust patterns. The recommendations below are ordered by impact-to-effort ratio.
+
+---
+
+## 1. Complexity Hotspots
+
+### 1.1 `backend/src/services/synthesis.rs` -- 1,686 lines (CRITICAL)
+
+This is by far the most complex file. Key concerns:
+
+- **`run_generation_inner`** (lines 243-890, ~650 lines): This single async function orchestrates the entire generation pipeline. It manages 8+ mutable tracking structures (`article_scraped`, `source_counts`, `url_source`, `filled_counts`, `seen_urls`, etc.) and contains 3 nearly identical "scrape + classify + assign category" loops.
+
+- **Deep nesting**: The Brave Search path (lines 562-759) is nested inside `if !category_gaps.is_empty()` > `if settings.use_brave_search` > `while !done` > `while let Some(join_result)`, reaching 5 levels of indentation.
+
+- **Copy-pasted classify logic**: The category assignment block (lines 505-541) is duplicated almost verbatim at lines 702-750 for Brave Search and implicitly again for the LLM search path. Each block does:
+  - Extract title/summary/category from LLM response
+  - Validate category against allowed list
+  - Compute `cat_key` string
+  - Check `filled_counts` for overflow to "Autre"
+  - Insert into `article_scraped`
+  - Update counters
+
+- **`trace_article` function** (line 926): Has 11 parameters and is called 15+ times throughout the pipeline with `#[allow(clippy::too_many_arguments)]`. This is a code smell that the function is doing too much parameter passing.
+
+### 1.2 `backend/src/services/scraper.rs` -- 1,242 lines
+
+Well-organized but large. 572 lines are production code, 670 lines are tests. The test-to-code ratio is excellent. The main `scrape_url` function is a reasonable 95 lines. No urgent action needed, but could benefit from splitting the test module into a separate file.
+
+### 1.3 `frontend/src/pages/Settings.tsx` -- 1,025 lines
+
+A single component handling theme, categories, max age, items per category, batch size, article history, LLM source links, Brave Search key management, provider selection, model selection, websearch model selection, rate limits, search behavior, export/import, and API key management. Contains:
+- 14 signals (state variables)
+- 10+ event handler functions
+- Inline JSX for every form section
+
+### 1.4 `frontend/src/pages/admin/Providers.tsx` -- 854 lines
+
+Similar to Settings: a single component managing CRUD for providers with inline model list editing. The model list editing JSX (for both `models_scraping` and `models_websearch`) is nearly identical, repeated twice per provider card and twice for the "new provider" form.
+
+### 1.5 `backend/src/services/llm/anthropic.rs` -- 500 lines
+
+Of these, ~270 lines are tests, which is healthy. The production code is clean.
+
+---
+
+## 2. Code Duplication
+
+### 2.1 Settings Field Lists (HIGH -- 4 locations)
+
+The same 16 fields are listed in 4 places:
+
+| Location | Type | Purpose |
+|----------|------|---------|
+| `models/settings.rs:9-27` | `UserSettings` | DB model |
+| `models/settings.rs:31-47` | `SettingsResponse` | API response |
+| `models/settings.rs:73-89` | `UpdateSettingsRequest` | API input |
+| `db/settings.rs:14-31` | `SettingsRow` | sqlx row type |
+
+Every time a new settings field is added (which has happened several times based on commit history), all 4 structs plus the `From` impl, the `TryFrom` impl, the SQL INSERT/UPDATE queries, and the validation function must all be updated in lockstep. The `SettingsResponse` is particularly pointless -- it has exactly the same fields as `UserSettings` minus `user_id` and `updated_at`.
+
+**Files affected:** `backend/src/models/settings.rs`, `backend/src/db/settings.rs`
+
+### 2.2 LLM Error Mapping (MEDIUM -- 3 providers)
+
+The `map_*_error` functions in all three providers are nearly identical:
+
+```
+// gemini.rs:151-178, openai.rs:158-189, anthropic.rs:194-228
+match status {
+    400 => AppError::BadRequest("Invalid request to LLM provider".into()),
+    401 => AppError::BadRequest("Invalid or unauthorized API key".into()),
+    // ...
+    429 => AppError::RateLimited("LLM provider rate limit exceeded...".into()),
+    _ => AppError::Internal(anyhow::anyhow!("LLM provider returned an error")),
+}
+```
+
+Only Anthropic adds a `529` case; Gemini merges `401|403`. The error extraction logic (reading `body["error"]["message"]`) differs slightly per provider but could be parameterized.
+
+**Files affected:** `backend/src/services/llm/gemini.rs`, `openai.rs`, `anthropic.rs`
+
+### 2.3 Scrape + Classify Pipeline (CRITICAL -- 3 code paths)
+
+The pattern "scrape batch in parallel -> classify batch in parallel -> assign to categories" appears 3 times in `synthesis.rs`:
+
+1. **Phase 1 Personalized Sources** (lines 429-551): scrape set -> classify set -> category assignment with source tracking
+2. **Phase 2 Brave Search** (lines 639-758): near-identical with minor tuple shape differences
+3. **Phase 2 LLM Search** (lines 822-843): simplified (scrape for validation only, no classify needed)
+
+Paths 1 and 2 differ only in:
+- Tuple shape: `(url, source_url, body_text, page_title)` vs `(url, body_text, page_title)`
+- Source tracking: Phase 1 uses `source_url` from `url_source`, Phase 2 uses `extract_domain`
+- Otherwise the classify/assign logic is character-for-character identical
+
+### 2.4 Article Filtering Logic (MEDIUM -- 3 locations)
+
+The "homepage filter + cross-phase dedup + history dedup + source diversity" filter pipeline appears in 3 places:
+- Phase 2 LLM search (lines 780-820)
+- Phase 2 Brave search (lines 575-616)
+- Phase 1 history filter (lines 376-387)
+
+Each applies the same checks in slightly different orders with identical logic.
+
+### 2.5 Frontend Model List JSX (LOW -- Providers.tsx)
+
+In `Providers.tsx`, the model list editing UI (add/remove/edit model rows) is repeated 4 times:
+- Existing provider `models_scraping` list
+- Existing provider `models_websearch` list
+- New provider `models_scraping` list
+- New provider `models_websearch` list
+
+### 2.6 Frontend Default Settings (LOW -- 2 locations)
+
+`DEFAULT_SETTINGS` is defined in `frontend/src/types.ts:60-83` and `UserSettings::default()` is defined in `backend/src/models/settings.rs:162-190`. The category lists are slightly different between the two (the frontend has longer, more descriptive French names). This divergence is a latent bug source.
+
+---
+
+## 3. Readability Assessment
+
+### Strengths
+
+- **Module-level documentation**: Every `.rs` file starts with `//!` module docs explaining purpose and listing endpoints. This is excellent and should be maintained.
+- **Function documentation**: Important functions have `///` doc comments with `# Arguments` sections. The scraper and synthesis modules are particularly well-documented.
+- **Naming conventions**: Consistent `snake_case` for Rust, `camelCase` for TypeScript. No single-letter variables except loop iterators.
+- **Error handling**: Centralized in `errors.rs` with clear variant-to-status-code mapping. Internal errors never leak to clients.
+- **Security comments**: The SSRF check has explicit notes about TOCTOU gaps (`scraper.rs:63`). The job store documents why `_rx` is kept alive. These are high-value comments.
+
+### Concerns
+
+- **Short variable names in synthesis.rs**: Variables like `u`, `su`, `mad`, `jid`, `uid`, `mdl`, `sys`, `usr` hurt readability in the already-complex pipeline code. These abbreviations exist because of async move closures requiring cloned values, but they make the code harder to follow.
+- **Magic numbers in synthesis.rs**: `max_links = 15`, `max_concurrent = 5`, progress percentages (5, 10, 12, 15, 25, 70, 75, 80, 90) are hardcoded inline. The progress percentages especially are fragile -- changing the pipeline structure requires manually recalculating all percent values.
+- **French strings in Rust**: Error messages in `synthesis.rs` are in French (e.g., "Aucun article valide trouve"), which is correct for this French-language app, but they are hardcoded rather than going through an i18n system. This is acceptable for a single-language self-hosted app.
+
+---
+
+## 4. Maintainability Risks
+
+### 4.1 Settings Field Addition Ceremony
+
+Adding one settings field requires changes to:
+1. `models/settings.rs` -- 3 structs + `Default` + `From` + `validate()`
+2. `db/settings.rs` -- `SettingsRow` + `TryFrom` + 2 SQL queries (INSERT + UPDATE with RETURNING)
+3. SQL migration
+4. `frontend/src/types.ts` -- `UserSettings` + `DEFAULT_SETTINGS`
+5. `frontend/src/pages/Settings.tsx` -- UI input
+6. `frontend/src/i18n/fr.ts` -- label
+
+This 6-file ceremony is a maintenance burden and has likely already produced bugs (note the divergent default categories between frontend and backend).
+
+### 4.2 Tight Coupling Between Pipeline Steps
+
+The generation pipeline in `synthesis.rs` is a single function with shared mutable state (`article_scraped`, `filled_counts`, `source_counts`, `seen_urls`). This means:
+- Cannot test individual phases in isolation
+- Cannot add a new phase without modifying the 650-line function
+- Cannot run phases in different orders for different configurations
+- Any bug in counter management affects all downstream phases
+
+### 4.3 Provider-Specific Behavior Spread Across Files
+
+Adding a new LLM provider requires:
+1. New file in `services/llm/` implementing `LlmProvider`
+2. Update `services/llm/factory.rs` match arm
+3. Potentially update `services/synthesis.rs` if the provider has unique behavior
+4. Update admin provider seed data
+
+This is mostly well-contained by the trait abstraction, but the factory pattern could be more extensible.
+
+### 4.4 Frontend types.ts as a God File
+
+`types.ts` (303 lines) defines every interface for the entire app. While currently manageable, it will become harder to navigate as the app grows. Types are not co-located with their usage.
+
+---
+
+## 5. Simplification Opportunities
+
+### 5.1 Eliminate `SettingsResponse`
+
+`SettingsResponse` is a strict subset of `UserSettings` (same fields minus `user_id` and `updated_at`). The handler could simply serialize `UserSettings` with `#[serde(skip)]` on the excluded fields, or use a `#[serde(rename)]` attribute, eliminating one of the 4 duplicate structs.
+
+### 5.2 Extract Category Assignment Logic
+
+The 40-line category assignment block (validate LLM category -> compute cat_key -> check filled_counts -> overflow to "Autre" -> insert into article_scraped -> update counters) is a pure function that can be extracted:
+
+```rust
+fn assign_to_category(
+    item: NewsItem,
+    llm_response: &Value,
+    categories: &[String],
+    filled_counts: &mut HashMap<String, usize>,
+    article_scraped: &mut HashMap<String, Vec<NewsItem>>,
+    max_per_cat: usize,
+) -> bool { ... }
+```
+
+### 5.3 Unify Scrape+Classify Pipeline
+
+Create a helper function:
+```rust
+async fn scrape_and_classify_batch(
+    urls: &[(String, Option<String>)],  // (url, optional_source_url)
+    provider: &Arc<dyn LlmProvider>,
+    model: &str,
+    schema: &Value,
+    categories: &[String],
+    settings: &UserSettings,
+    // tracking state refs
+) -> Vec<ClassifiedArticle>
+```
+
+This would eliminate ~200 lines of duplication between Phase 1 and Phase 2 Brave paths.
+
+### 5.4 Extract Article Filter Pipeline
+
+```rust
+fn should_include_article(
+    url: &str,
+    seen_urls: &HashSet<String>,
+    source_counts: &HashMap<String, usize>,
+    max_per_source: usize,
+) -> Result<(), &'static str>
+```
+
+### 5.5 Introduce `PipelineContext` Struct
+
+Replace the 8+ mutable tracking variables with a single struct:
+```rust
+struct PipelineContext {
+    article_scraped: HashMap<String, Vec<NewsItem>>,
+    source_counts: HashMap<String, usize>,
+    url_source: HashMap<String, String>,
+    filled_counts: HashMap<String, usize>,
+    seen_urls: HashSet<String>,
+    max_total: usize,
+}
+```
+
+This reduces parameter passing and makes the tracking state explicit.
+
+### 5.6 Shared LLM Error Mapper
+
+Move the common error mapping logic to `services/llm/mod.rs`:
+```rust
+pub fn map_provider_error(status: u16, provider_id: &str, message: &str) -> AppError
+```
+
+### 5.7 Frontend: Extract Settings Section Components
+
+Break `Settings.tsx` into:
+- `SettingsThemeSection`
+- `SettingsNumericFields`
+- `SettingsCategoriesSection`
+- `SettingsProviderSection`
+- `SettingsBraveSection`
+- `SettingsAdvancedSection`
+- `SettingsExportImport`
+
+Each would be 50-100 lines, down from 1,025 total.
+
+### 5.8 Frontend: Extract ModelListEditor Component
+
+In `Providers.tsx`, the model list UI (with add/remove/edit/default toggle) appears 4 times. A `ModelListEditor` component would receive `models`, `onChange`, and `label` props.
+
+### 5.9 Dead/Unused Code
+
+- `SettingsRow` in `db/settings.rs` could be generated by sqlx macros instead of manual struct + TryFrom, but this is a style preference rather than dead code.
+- The `JobStore::len()` and `JobStore::is_empty()` methods are only used in tests. They are correctly gated for testing purposes.
+
+---
+
+## 6. Detailed Refactoring Plan
+
+### Critical Priority
+
+#### C1. Extract Scrape+Classify Pipeline from `synthesis.rs`
+- **Description:** Extract the duplicated scrape-batch + classify-batch + assign-category logic into a reusable async function. Create a `PipelineContext` struct to hold the shared tracking state. This would reduce `run_generation_inner` from ~650 to ~300 lines.
+- **Files affected:** `backend/src/services/synthesis.rs`
+- **Estimated effort:** Medium (1-2 days)
+- **Rationale:** This is the single highest-impact change. The current 3x duplication means any bug fix or feature addition must be applied in 3 places. The `run_generation_inner` function is too long to review effectively.
+
+#### C2. Extract Article Filter Pipeline
+- **Description:** Create a `filter_candidate_url()` function that applies homepage check, cross-phase dedup, history dedup, and source diversity in one place.
+- **Files affected:** `backend/src/services/synthesis.rs`
+- **Estimated effort:** Small (2-4 hours)
+- **Rationale:** The same filtering logic is duplicated 3 times with minor variations. A single function with configuration parameters eliminates this.
+- **Dependencies:** Can be done independently or as part of C1.
+
+### High Priority
+
+#### H1. Eliminate `SettingsResponse` Struct Duplication
+- **Description:** Use `#[serde(skip_serializing)]` on `user_id` and `updated_at` in `UserSettings`, and serialize it directly in the handler. Remove `SettingsResponse` and its `From` impl entirely.
+- **Files affected:** `backend/src/models/settings.rs`, `backend/src/handlers/settings.rs`
+- **Estimated effort:** Small (1-2 hours)
+- **Rationale:** Eliminates one of the 4 duplicate field lists with zero risk.
+
+#### H2. Introduce `PipelineContext` Struct
+- **Description:** Group `article_scraped`, `source_counts`, `url_source`, `filled_counts`, `seen_urls`, and `max_total` into a `PipelineContext` struct with methods like `is_full()`, `add_article()`, `track_source()`.
+- **Files affected:** `backend/src/services/synthesis.rs`
+- **Estimated effort:** Medium (4-6 hours)
+- **Rationale:** Reduces cognitive load when reading the pipeline. Makes the tracking state self-documenting. Enables methods like `ctx.add_article()` that encapsulate the category assignment logic.
+- **Dependencies:** Best done alongside C1.
+
+#### H3. Replace `trace_article` 11-Parameter Function with a Struct
+- **Description:** Create `TraceEntry` struct and pass it to `trace_article()`. This is already partially done with `ArticleHistoryEntry` in `db/article_history.rs` but `trace_article()` constructs it from 11 individual parameters.
+- **Files affected:** `backend/src/services/synthesis.rs`
+- **Estimated effort:** Small (2-3 hours)
+- **Rationale:** 11-parameter functions violate Rust idioms. A builder pattern or struct makes each call site clearer and less error-prone.
+
+### Medium Priority
+
+#### M1. Extract Shared LLM Error Mapping
+- **Description:** Create `map_provider_error(status, provider_name, error_message, error_type) -> AppError` in `services/llm/mod.rs`. Each provider calls this with its parsed error details.
+- **Files affected:** `backend/src/services/llm/mod.rs`, `gemini.rs`, `openai.rs`, `anthropic.rs`
+- **Estimated effort:** Small (2-3 hours)
+- **Rationale:** Eliminates 3x duplication of the same match-on-status-code pattern. Makes it easy to add consistent error handling for new providers.
+
+#### M2. Break Up `Settings.tsx` into Section Components
+- **Description:** Extract 6-7 section components (Theme, NumericFields, Categories, Provider, Brave, Advanced, ExportImport). The parent `Settings` component becomes a form container with save logic.
+- **Files affected:** `frontend/src/pages/Settings.tsx`, new files in `frontend/src/components/settings/`
+- **Estimated effort:** Medium (4-6 hours)
+- **Rationale:** 1,025 lines is too large for a single component. Section extraction makes each section independently testable and easier to navigate.
+
+#### M3. Extract `ModelListEditor` Component from `Providers.tsx`
+- **Description:** Create a reusable `ModelListEditor` component that accepts `models`, `onChange`, `label`, and `onAddModel` props. Use it 4 times in `Providers.tsx`.
+- **Files affected:** `frontend/src/pages/admin/Providers.tsx`, new `frontend/src/components/admin/ModelListEditor.tsx`
+- **Estimated effort:** Small (2-3 hours)
+- **Rationale:** Eliminates 4x JSX duplication of the model editing UI.
+
+#### M4. Name Pipeline Progress Percentages as Constants
+- **Description:** Define named constants for progress stages: `PROGRESS_INIT = 5`, `PROGRESS_SOURCES = 10`, `PROGRESS_PROVIDER = 12`, etc. Better yet, compute them from a stage list.
+- **Files affected:** `backend/src/services/synthesis.rs`
+- **Estimated effort:** Small (1-2 hours)
+- **Rationale:** Progress percentages are currently magic numbers scattered through 650 lines. Named constants make them self-documenting and easier to adjust.
+
+#### M5. Align Frontend/Backend Default Settings
+- **Description:** The `DEFAULT_SETTINGS` in `types.ts` and `UserSettings::default()` in Rust have different category lists. Choose one as the source of truth. Ideally the backend defaults should be authoritative, and the frontend should fetch them on first load (which it already does -- the frontend defaults are only used before the first API call).
+- **Files affected:** `frontend/src/types.ts`, `backend/src/models/settings.rs`
+- **Estimated effort:** Small (1 hour)
+- **Rationale:** Divergent defaults can cause subtle bugs when a user's first-load experience differs from what the backend creates.
+
+### Low Priority
+
+#### L1. Split `scraper.rs` Tests into a Separate File
+- **Description:** Move the 670-line test module from `scraper.rs` to `scraper_tests.rs` or use `#[path]` to keep them co-located but in a separate file.
+- **Files affected:** `backend/src/services/scraper.rs`
+- **Estimated effort:** Small (30 minutes)
+- **Rationale:** The file is large but most of it is tests. Moving tests out keeps the main file focused on production code.
+
+#### L2. Use Builder Pattern for `PdfWriter`
+- **Description:** The `PdfWriter::new()` takes 6 parameters (doc, page, layer, font_regular, font_bold, font_italic). A builder would be cleaner.
+- **Files affected:** `backend/src/services/export.rs`
+- **Estimated effort:** Small (1-2 hours)
+- **Rationale:** Minor improvement. The current code works fine.
+
+#### L3. Co-locate Frontend Types with API Modules
+- **Description:** Move type definitions from `types.ts` to their respective API modules (e.g., `Source` to `api/sources.ts`, `Synthesis` to `api/syntheses.ts`). Keep shared types like `ApiError` in `types.ts`.
+- **Files affected:** `frontend/src/types.ts`, multiple API files
+- **Estimated effort:** Medium (2-3 hours)
+- **Rationale:** Better co-location makes it easier to find and maintain types. However, the current approach works and 303 lines is still manageable.
+
+#### L4. Improve Variable Names in Async Closures
+- **Description:** In `synthesis.rs`, replace abbreviated variables in async move closures (`u`, `su`, `mad`, `jid`) with descriptive names (`url_to_scrape`, `source_url_ref`, `max_age`, `job_id_ref`). The abbreviations exist because async closures require cloning, but descriptive names are worth the extra characters.
+- **Files affected:** `backend/src/services/synthesis.rs`
+- **Estimated effort:** Small (1-2 hours)
+- **Rationale:** Minor readability improvement in the most complex file.
+
+---
+
+## Appendix: File Size Summary (Top 15)
+
+### Backend (Rust)
+| File | Lines | Notes |
+|------|-------|-------|
+| `services/synthesis.rs` | 1,686 | ~650 lines pipeline + ~400 helpers + ~600 tests |
+| `services/scraper.rs` | 1,242 | ~570 code + ~670 tests |
+| `services/llm/anthropic.rs` | 500 | ~230 code + ~270 tests |
+| `services/rate_limiter.rs` | 471 | Clean, well-structured |
+| `services/export.rs` | 456 | PDF generation |
+| `handlers/admin.rs` | 438 | Standard CRUD, clean |
+| `models/settings.rs` | 433 | 4 structs with field duplication |
+| `services/source_scraper.rs` | 428 | Link extraction from source pages |
+| `services/llm/openai.rs` | 399 | ~190 code + ~200 tests |
+| `services/prompts.rs` | 390 | Prompt templates + tests |
+
+### Frontend (TypeScript/TSX)
+| File | Lines | Notes |
+|------|-------|-------|
+| `pages/Settings.tsx` | 1,025 | Monolithic settings form |
+| `pages/admin/Providers.tsx` | 854 | Provider CRUD with duplicated model UI |
+| `pages/Sources.tsx` | 488 | Manageable |
+| `pages/SynthesisDetail.tsx` | 476 | Clean component extraction |
+| `i18n/fr.ts` | 394 | Translation strings |
+| `pages/GenerateSynthesis.tsx` | 404 | Generation with SSE progress |
+| `pages/ArticleHistory.tsx` | 321 | Table with filtering |
+| `components/ApiKeyManager.tsx` | 313 | API key CRUD component |
+| `types.ts` | 303 | All TypeScript interfaces |