diff --git a/docs/audit/v2-architect-report.md b/docs/audit/v2-architect-report.md new file mode 100644 index 0000000..1f57300 --- /dev/null +++ b/docs/audit/v2-architect-report.md @@ -0,0 +1,434 @@ +# AI Weekly Synth -- Architecture Audit Report (v2) + +**Date**: 2026-03-27 +**Scope**: Full backend codebase (Rust/Axum), key frontend architecture observations +**Auditor**: Software Architect (automated) + +--- + +## Executive Summary + +AI Weekly Synth is a well-structured Rust/Axum application that has grown substantially from its initial design. The codebase demonstrates strong fundamentals: consistent error handling, good security practices, clean layer separation between handlers/services/db, and idiomatic use of Axum extractors. Unit test coverage is solid across models, services, and middleware. + +However, the growth -- particularly the addition of multi-theme synthesis, scheduled generation, Brave Search, windowed source extraction, and article history tracing -- has introduced several architectural tensions. The synthesis pipeline (`synthesis.rs`) has become a 1500+ line monolith carrying at least five distinct responsibilities. The scheduler bypasses the job store abstraction. And several cross-cutting concerns (provider resolution, rate limiting, history tracking) are tightly coupled to concrete implementations, making the system harder to test, extend, and reason about. + +This report organizes findings by SOLID principles, design patterns, architecture, and dependency management, then closes with prioritized recommendations. + +--- + +## 1. SOLID Principles + +### 1.1 Single Responsibility Principle (SRP) + +**Critical: `synthesis.rs` is a God Module** + +At 1500+ lines, `services/synthesis.rs` carries at least six distinct responsibilities: + +| Responsibility | Lines (approx.) | Should be | +|---|---|---| +| Job store (in-memory concurrent map) | 1-193 | Own module `services/job_store.rs` | +| Progress event types + emission | 37-71, 1063-1071 | Own module or part of job store | +| Pipeline orchestration (phases 1 + 2 + save) | 200-1038 | `services/pipeline.rs` or `services/generation/mod.rs` | +| Article scraping + classification logic | 471-616, 700-880 | `services/article_processor.rs` | +| URL filtering, normalization, hashing | 1255-1339 | `services/url_utils.rs` | +| Provider/model/key resolution | 1342-1446 | `services/provider_resolver.rs` | + +The `run_generation_inner` function alone is ~840 lines. It manages five `HashMap`/`HashSet` tracking structures, two nested loop levels (waves and batches), two separate pipeline phases (personalized sources and web search fallback), and three code paths in Phase 2 (Brave Search, LLM search, skip). This makes the function extremely difficult to test in isolation, review for correctness, or extend with new pipeline stages. + +**Moderate: `scheduler.rs` duplicates pipeline invocation logic** + +The scheduler constructs its own `watch::channel` and `AtomicBool`, calls `run_generation_inner` directly, and handles email sending inline. It bypasses the `JobStore` entirely, which means: +- Scheduled jobs are invisible to the SSE progress API +- The one-job-per-user guard does not apply (it only checks `job_store.has_active_job`) +- Email sending logic (fetch synthesis, iterate recipients, call `email::send_synthesis_email`) is duplicated -- the handler version is in `syntheses.rs` handler + +**Moderate: `AppState` accumulates responsibilities** + +`AppState` holds configuration, database pool, HTTP client, auth rate limiter, provider rate limiter, per-user rate limiters, and the job store. While `Clone`-cheap (all Arc-based), it acts as a service locator, making it unclear which components a given handler actually depends on. With 8 fields, this is approaching the point where injecting specific dependencies would improve clarity. + +**Minor: Handler-level response types** + +Some response types are defined in handlers (`AdminUserResponse` in `handlers/admin.rs`, `GenerateResponse` in `handlers/generation.rs`, `ListResponse` in `handlers/syntheses.rs`) while others are in models. This inconsistency is minor but creates ambiguity about where to look for types. + +### 1.2 Open/Closed Principle (OCP) + +**Well-applied: LLM Provider abstraction** + +The `LlmProvider` trait + factory pattern is the cleanest abstraction in the codebase. Adding a new provider (e.g., Mistral) requires: +1. A new module implementing `LlmProvider` +2. A new match arm in `factory.rs` +3. No changes to the pipeline + +This is textbook OCP. + +**Violation: Pipeline Phase 2 branching** + +Phase 2 of the pipeline has a hard-coded `if settings.use_brave_search { ... } else { ... }` branch that selects between two entirely different code paths (Brave Search vs. LLM web search). Each path contains ~150 lines of nearly identical scrape+classify logic. Adding a third search strategy (e.g., Bing, Perplexity, SearXNG) would require another `else if` branch with the same duplicated scrape/classify logic. + +**Violation: Provider resolution fallback defaults** + +`resolve_model` contains hard-coded fallback model names (`"gemini-2.5-pro"`, `"gpt-4o"`, `"claude-sonnet-4-20250514"`). These will silently become stale as providers release new models. The fallback chain should be configurable or fail loudly. + +### 1.3 Liskov Substitution Principle (LSP) + +**Generally well-respected.** The `LlmProvider` trait implementations are fully substitutable. The `MockLlmProvider` correctly implements the same interface and is used in tests via the `provider_override` parameter. + +**Minor concern**: The mock provider identifies call types by inspecting the system prompt content (`sys_lower.contains("classer")`, `sys_lower.contains("precis")`). This couples the mock to the French-language prompt wording, making it fragile if prompts change. + +### 1.4 Interface Segregation Principle (ISP) + +**Well-applied: Axum extractors** + +`AuthUser` and `AdminUser` are clean, focused extractors. Handlers declare exactly the auth level they need. The `AdminUser` wrapper pattern (newtype over `AuthUser`) is idiomatic and minimal. + +**Opportunity: `LlmProvider` trait could be narrower** + +The current trait has two methods: `provider_id()` and `call_llm()`. If the codebase later needs streaming, embedding, or tool-calling capabilities, the trait should be split rather than extended, per ISP. + +### 1.5 Dependency Inversion Principle (DIP) + +**Critical: Pipeline depends on concrete implementations** + +`run_generation_inner` directly calls: +- `db::settings::get_or_create_default` (concrete DB queries) +- `db::themes::get_by_id` (concrete DB queries) +- `db::sources::list_for_user` (concrete DB queries) +- `db::article_history::check_urls_exist` (concrete DB queries) +- `db::article_history::batch_insert_entries` (concrete DB queries) +- `crate::services::scraper::scrape_url` (concrete HTTP scraping) +- `source_scraper::extract_article_links` (concrete link extraction) +- `crate::services::brave_search::search` (concrete Brave API) +- `encryption::decrypt` / `encryption::MasterKey::from_hex` (concrete crypto) + +None of these are injected as traits. The `provider_override` parameter for `LlmProvider` is the only dependency that can be swapped -- and it was added specifically for testing. This makes the pipeline untestable without a live Postgres database and network access. + +**Moderate: `ProviderRateLimiter` embeds its own SQL** + +The `ProviderRateLimiter::reload_from_db` method contains raw `sqlx::query_as` calls rather than going through the `db::rate_limits` module. The comment says "to avoid circular dependency," but this violates the layer boundary and duplicates the DB schema knowledge. + +--- + +## 2. Design Patterns + +### 2.1 Well-Applied Patterns + +| Pattern | Where | Assessment | +|---|---|---| +| **Factory Method** | `llm/factory.rs` | Clean, tested, extensible | +| **Strategy** | `LlmProvider` trait | Proper polymorphism via `Arc` | +| **Observer** | `watch::channel` for SSE | Elegant use of tokio primitives; late subscribers get latest state | +| **Repository** | `db/` modules | Clean separation of SQL from business logic | +| **Extractor** | `AuthUser`, `AdminUser` | Idiomatic Axum; composable auth | +| **Builder** | `AppState::new`, `build_scraper_client` | Consistent construction patterns | +| **Newtype** | `AdminUser(AuthUser)`, `MasterKey` | Type safety for authorization and crypto | + +### 2.2 Missing or Needed Patterns + +**Pipeline / Chain of Responsibility** + +The synthesis generation is conceptually a pipeline with discrete stages: +1. Load settings + theme + sources +2. Phase 1: Source extraction (windowed) +3. Phase 1: Scrape + classify (batched) +4. Phase 2: Web search fallback (Brave or LLM) +5. Phase 2: Scrape + classify fallback results +6. Assemble sections + save + +Each stage could be a separate struct implementing a `PipelineStage` trait, with shared context passed through. This would make the pipeline testable per-stage, enable adding/removing stages declaratively, and reduce `run_generation_inner` from 840 lines to ~50. + +**Unit of Work / Transaction Manager** + +Article history tracing uses a manual `pending_traces` buffer that is flushed at "logical boundaries." This ad-hoc batching is scattered across 7 locations in the pipeline. A dedicated `TraceBatcher` struct could encapsulate the buffer, auto-flush thresholds, and error handling. + +**Event Sourcing (lightweight)** + +The `ProgressEvent` enum is close to an event-sourced model but is currently fire-and-forget via `watch::channel` (which only retains the latest value). If the system needs progress history for debugging or UI replay, the events should be collected in a log alongside the watch channel. + +### 2.3 Anti-Patterns + +**Copy-Paste Programming (Critical)** + +The scrape+classify logic appears nearly identically in three places: +1. Phase 1 source processing (lines ~470-616) +2. Phase 2 Brave Search processing (lines ~700-880) +3. Phase 2 LLM search validation (lines ~936-960, simpler variant) + +Each instance: spawns a `JoinSet` for scraping, collects results, checks rate limits, spawns a `JoinSet` for classification, parses LLM responses, checks `is_article`, extracts dates, assigns categories, updates tracking maps. The only differences are: which `source_type` string is recorded and whether `source_url` is `Some`. + +A single `scrape_and_classify_batch` function parameterized by source type would eliminate ~300 lines of duplication. + +**Primitive Obsession** + +The pipeline uses six `HashMap`/`HashSet` variables (`article_scraped`, `source_counts`, `url_source`, `filled_counts`, `seen_urls`, and the `pending_traces` vec) as raw tracking state. These represent a coherent concept -- "pipeline context" or "generation state" -- and should be bundled into a struct: + +```rust +struct GenerationContext { + articles_by_category: HashMap>, + source_domain_counts: HashMap, + url_to_source: HashMap, + category_fill_counts: HashMap, + seen_urls: HashSet, + pending_traces: Vec, +} +``` + +**Magic Strings** + +Category keys like `"category_0"`, `"category_autre"`, `"category_no_date"`, and status strings like `"filtered_history"`, `"filtered_diversity"`, `"filtered_not_article"`, `"filtered_too_old"`, `"filtered_empty"`, `"filtered_homepage"`, `"filtered_cross_phase_dedup"`, `"used"` are scattered as string literals. These should be constants or an enum. + +**Stringly-Typed Configuration** + +Several settings use strings where enums would be safer: +- `settings.ai_provider` (could be `enum Provider { Gemini, OpenAi, Anthropic }`) +- `settings.search_agent_behavior` (free-form, but could at least validate non-HTML) +- `synthesis.status` (always `"completed"` in the codebase, but stored as `String`) + +--- + +## 3. Architecture + +### 3.1 Layer Separation + +The codebase follows a three-layer architecture: + +``` +handlers/ (HTTP layer) --> services/ (business logic) --> db/ (data access) + | + models/ (shared types) +``` + +**Assessment: Good but with leaks.** + +- Handlers are thin and focused. They validate input, call services/db, and format responses. This is excellent. +- The `db/` layer is clean -- pure SQL queries returning typed results. No business logic leaks into SQL. +- The `services/` layer is where responsibilities blur. `synthesis.rs` calls `db::` modules directly (bypassing any service abstraction), constructs its own SQL in `resolve_model`, and embeds scraping/classification logic that could be separate services. + +**Concern: The scheduler sits at the service layer but orchestrates at the handler level** + +The scheduler calls `synthesis::run_generation_inner` (a service) but also does email sending (another service), DB fetching (data layer), and schedule marking (data layer) all inline. It should either be a handler (if it needs to compose services) or delegate to a higher-level "generation + notification" service. + +### 3.2 Error Handling + +**Strengths:** +- Unified `AppError` enum with `IntoResponse` -- all errors produce consistent JSON +- Internal errors log full details but return generic messages to clients (security-conscious) +- `From` and `From` conversions are clean +- Error sanitization in `sanitize_error_message` prevents API key leakage + +**Weaknesses:** +- Errors are silently swallowed in multiple places via `.ok()`: + - `db::article_history::batch_insert_entries(...).await.ok()` (7 occurrences) -- if tracing fails, there is no indication + - `db::llm_call_log::truncate_old(...).await.ok()` -- cleanup failure is invisible + - `db::schedules::mark_run(...).await.ok()` -- if this fails, the schedule may fire again next minute +- `unwrap_or_default()` on `serde_json::from_value` calls silently drops malformed data (e.g., `theme.categories` deserialization). A warning log would be more appropriate. + +### 3.3 State Management + +**In-memory state:** +- `JobStore` (DashMap-based) -- well-designed with TTL cleanup, user locking, and cancellation support +- `RateLimiter` / `ProviderRateLimiter` -- properly `Arc`-wrapped for `Clone`-cheap sharing +- `user_rate_limiters: DashMap` -- handles setting changes atomically + +**Concern: No persistence for job state** + +If the server restarts during a generation, the in-memory job is lost with no way to recover. For a self-hosted single-instance app this may be acceptable, but if resilience is a goal, the job state should be backed by the database. + +**Concern: Scheduled job email state is fire-and-forget** + +The scheduler sends emails to up to 3 recipients per schedule. If one email fails, the others still send, but there is no retry or notification mechanism. `mark_run` is called unconditionally after a successful generation, even if all emails failed. + +### 3.4 Concurrency Model + +**Strengths:** +- Proper use of `tokio::task::JoinSet` for parallel scraping and classification +- `DashMap` + `DashSet` for lock-free concurrent access to shared state +- `AtomicBool` for cooperative cancellation (avoids mutex overhead) +- `watch::channel` for fan-out progress notifications + +**Weakness: Global rate limiter shared across scheduled + manual jobs** + +The `ProviderRateLimiter` is global. A scheduled job and a manual job for different users hitting the same provider share the same bucket. Under load, scheduled jobs could starve manual users (or vice versa). The architecture should consider per-user-or-per-job rate tracking for fairness. + +### 3.5 Security Architecture + +**Strengths:** +- AES-256-GCM encryption for API keys at rest with per-key nonces +- SSRF prevention in both `scraper.rs` and `source_scraper.rs` (IP allowlist checking, redirect validation) +- CSRF protection via `X-Requested-With` header check on all mutating API endpoints +- Session cookies are `HttpOnly`, `SameSite=Lax`, optionally `Secure` +- Anti-enumeration in auth (same response for existent/non-existent emails, timing attack mitigation) +- Error sanitization prevents API key leakage in SSE error events +- CSP, X-Frame-Options, HSTS, Referrer-Policy headers + +**Concern: Gemini API key in URL** + +The `GeminiProvider` constructs the API URL as `...?key={api_key}`. While the error handler carefully avoids logging the full URL, the key is still in the URL query string. This means: +- It may appear in HTTP access logs on intermediary proxies +- `reqwest` may include it in error messages despite the `kind`-only logging +- If tracing is set to DEBUG level, the URL may be logged by tower-http's `TraceLayer` + +This is a known Gemini API design limitation, but the risk should be documented. + +--- + +## 4. Dependency Management and Testability + +### 4.1 Test Architecture + +**Strengths:** +- Unit tests for all model validation logic (settings, theme, schedule, source, synthesis) +- Unit tests for error handling, rate limiting, URL normalization, link extraction +- Mock LLM provider enables end-to-end pipeline testing without real API calls +- Factory tests verify correct provider instantiation + +**Weaknesses:** +- The core pipeline (`run_generation_inner`) cannot be unit-tested. It requires: + - A live `PgPool` (for all `db::` calls) + - A real `AppState` (for config, rate limiters, job store) + - Network access (for scraping via `http_client`) + - Only the LLM provider can be mocked (via `provider_override`) + +- No integration tests for the scheduler +- No tests for the Brave Search integration path +- No tests for Phase 2 (web search fallback) at all + +### 4.2 Dependency Injection + +The codebase uses Axum's `State(AppState)` extractor as its sole DI mechanism. This works well for handlers but breaks down for services: + +- Services receive `&AppState` directly, gaining access to everything +- There is no trait boundary between the pipeline and its dependencies (db, scraper, search) +- The `provider_override: Option>` parameter proves the value of DI -- it is the only seam that enables testing + +**Recommendation**: Introduce a `PipelineDeps` trait (or struct of trait objects) that the pipeline receives, encapsulating: +- Database access (settings, sources, themes, article history) +- Scraping (source page scraping, article scraping) +- Search (Brave, LLM web search) +- Rate limiting +- Key resolution + +This would allow the entire pipeline to be tested with in-memory fakes. + +### 4.3 Module Coupling + +The dependency graph is mostly clean: + +``` +handlers --> services --> db + | | + models <-----+ + | + errors +``` + +Exceptions: +- `synthesis.rs` calls `db::` directly (bypasses service layer for settings, themes, sources, history, llm_call_log, syntheses) +- `synthesis.rs` calls `crate::services::prompts`, `crate::services::llm::schema`, `crate::services::scraper`, `crate::services::source_scraper`, `crate::services::brave_search`, `crate::services::encryption` -- essentially importing the entire services layer +- `rate_limiter.rs` contains its own SQL queries +- `handlers/syntheses.rs::list` constructs a `Synthesis` model manually from the `SynthesisWithThemeName` join row, duplicating field mapping + +--- + +## 5. Specific Code-Level Findings + +### 5.1 The `#[allow(clippy::too_many_arguments)]` Smell + +Two functions suppress this lint: +- `build_search_prompt` (9 parameters) +- `log_llm_call` (10 parameters) + +Both are symptoms of missing context objects. `build_search_prompt` should take a `SearchPromptConfig` struct. `log_llm_call` should take a `LlmCallRecord` struct. + +### 5.2 `run_generation_inner` Parameter List + +The function takes 7 parameters: `job_id`, `state`, `user_id`, `theme_id`, `tx`, `provider_override`, `cancelled`. The first four are "what to generate," the last three are "infrastructure." A `GenerationJob` struct and a `PipelineInfra` struct would make intent clearer. + +### 5.3 Inconsistent `serde_json::Value` vs. Typed Models + +`theme.categories` is stored as `serde_json::Value` and deserialized inline with `serde_json::from_value(...).unwrap_or_default()`. This pattern appears at least 5 times across the codebase (themes, schedules, syntheses). Consider using `sqlx::types::Json>` for typed extraction at the query level. + +### 5.4 French/English Mixing + +User-facing messages are consistently in French (good for i18n consistency), but code-level strings mix languages: +- Status strings: `"filtered_history"` (English), `"category_autre"` (French), `"Articles sans date"` (French) +- Log messages: English +- Error messages to users: French + +The recommendation is to keep all internal identifiers in English and use the i18n layer for user-facing strings. + +--- + +## 6. Prioritized Recommendations + +### P0 -- High Impact, Do First + +1. **Extract `GenerationContext` struct** from the 6 tracking variables in `run_generation_inner`. This is a safe refactor that immediately improves readability and reduces parameter passing. + +2. **Extract `scrape_and_classify_batch` function** to eliminate the 300-line duplication between Phase 1, Brave Search, and LLM search paths. Parameterize by `source_type: &str` and `source_url: Option<&str>`. + +3. **Move `JobStore` to its own module** (`services/job_store.rs`). It is already self-contained with no dependencies on synthesis logic. This reduces `synthesis.rs` by ~190 lines. + +### P1 -- Important, Do Soon + +4. **Introduce `TraceBatcher`** struct to encapsulate the pending traces buffer, batch insert calls, and error handling. Replace the 7 manual flush sites with `batcher.flush()`. + +5. **Make the scheduler use `JobStore`** for scheduled jobs. This provides visibility into scheduled generation progress and prevents race conditions between scheduled and manual jobs. Add email sending as a post-completion hook rather than inline in the scheduler. + +6. **Replace magic strings with constants or enums** for article statuses (`"filtered_history"`, etc.), category keys (`"category_0"`, `"category_autre"`, `"category_no_date"`), and synthesis status. + +7. **Add structured logging to `.ok()` calls**. Replace `batch_insert_entries(...).await.ok()` with `if let Err(e) = batch_insert_entries(...).await { tracing::warn!(...) }`. + +### P2 -- Improve Quality + +8. **Split Phase 2 search strategies** into a `SearchStrategy` trait with `BraveSearchStrategy` and `LlmSearchStrategy` implementations. The pipeline would call `strategy.search(query, max_results)` without knowing which backend is used. + +9. **Extract `provider_resolver.rs`** for the provider/key/model resolution logic (~100 lines currently in `synthesis.rs`). + +10. **Introduce `PipelineDeps` trait or struct** to enable full pipeline testing without Postgres/network. Start with the article history check as the first dependency to extract, since it is the most frequently called. + +11. **Remove inline SQL from `rate_limiter.rs`** and `synthesis.rs::resolve_model`. Route all queries through `db/` modules. + +### P3 -- Nice to Have + +12. **Type the `categories` field** as `sqlx::types::Json>` instead of `serde_json::Value` to eliminate runtime deserialization. + +13. **Consolidate response types** -- either all in `models/` or all in `handlers/`, with a consistent convention. + +14. **Add a `SearchPromptConfig` struct** to replace the 9-parameter `build_search_prompt` function. + +15. **Document the TOCTOU risk** in the Gemini API key URL pattern and consider using the `x-goog-api-key` header instead (if supported by the Gemini API version in use). + +--- + +## 7. What the Codebase Does Well + +It is important to acknowledge the strengths that should be preserved during refactoring: + +- **Error handling discipline**: The `AppError` enum is consistently used everywhere. No panics in production code. Internal details are never leaked. +- **Security-first design**: SSRF prevention, encrypted secrets, CSRF protection, anti-enumeration, session management -- all implemented correctly. +- **Idiomatic Axum usage**: Extractors, state management, middleware composition, SSE streaming -- all follow framework conventions. +- **Test coverage on leaf components**: Models, utils, and isolated services have thorough unit tests with boundary cases. +- **Documentation**: Module-level doc comments, function-level doc comments, and inline comments explaining non-obvious decisions (e.g., the TOCTOU note in `scraper.rs`). +- **Operational features**: Graceful shutdown, session cleanup, job TTL, rate limit hot-reload, cooperative cancellation -- these show production-mindedness. + +--- + +## Appendix: File Size Summary + +| File | Lines | Assessment | +|---|---|---| +| `services/synthesis.rs` | ~1550 | Critical -- needs decomposition | +| `services/scraper.rs` | ~400 | Acceptable | +| `services/rate_limiter.rs` | ~470 | Acceptable (includes thorough tests) | +| `services/prompts.rs` | ~370 | Acceptable (includes thorough tests) | +| `handlers/auth.rs` | ~380 | Acceptable | +| `handlers/sources.rs` | ~280 | Good | +| `handlers/admin.rs` | ~440 | Acceptable | +| `handlers/syntheses.rs` | ~240 | Good | +| `handlers/generation.rs` | ~180 | Good | +| `models/settings.rs` | ~260 | Good (includes thorough tests) | +| `models/synthesis.rs` | ~415 | Acceptable (includes thorough tests) | +| `errors.rs` | ~173 | Good | +| `app_state.rs` | ~82 | Good | +| `router.rs` | ~178 | Good | +| `scheduler.rs` | ~93 | Good size, but needs architectural changes | diff --git a/docs/audit/v2-consolidated-summary.md b/docs/audit/v2-consolidated-summary.md new file mode 100644 index 0000000..b86b4ee --- /dev/null +++ b/docs/audit/v2-consolidated-summary.md @@ -0,0 +1,100 @@ +# Code Audit v2 — Consolidated Summary + +**Date**: 2026-03-27 +**Auditors**: Software Architect, Frontend Expert, Rust Expert, Tech Lead, QA Engineer + +## Overall Assessment + +The codebase has strong fundamentals — zero `unsafe`, zero `unwrap()` in production, excellent security (SSRF, encryption, CSRF), clean error handling, and 689 total tests. The project has grown significantly with multi-theme, scheduling, preferred sources, and Brave Search. The main issues are accumulated complexity in `synthesis.rs` and `ThemeManager.tsx`. + +--- + +## Top Issues (agreed by 3+ auditors) + +### 1. `synthesis.rs` God Module — 2010 lines, ~800-line main function + +**Confirmed by**: Architect, Tech Lead, Rust Expert + +The scrape+classify batch loop is duplicated across Phase 1 and Brave Search paths (~300 lines). The function manages 6+ tracking HashMaps and mixes job store, pipeline orchestration, URL utilities, and provider resolution. + +**Recommended refactoring:** +- Extract `scrape_and_classify_batch()` — shared helper (~300 lines saved) +- Extract `GenerationContext` struct for tracking state (filled_counts, source_counts, etc.) +- Move `JobStore` to its own module (`services/job_store.rs`) +- Split into `synthesis/pipeline.rs`, `synthesis/job_store.rs`, `synthesis/helpers.rs` + +### 2. `ThemeManager.tsx` — 935 lines with ~30 signals + +**Confirmed by**: Frontend Expert, Tech Lead + +Mixes theme content editing, source management, bulk import, CSV import/export, preferred toggles, and schedule config in one component. + +**Recommended refactoring:** +- Extract `SourceManager` component (reusable, shared with Sources.tsx) +- Extract `ThemeContentForm` component (content settings) +- Already have `SettingsSchedule` and `SettingsBraveSearch` as sub-components + +### 3. Scheduler has zero tests + +**Confirmed by**: QA Engineer, Rust Expert + +`run_scheduled_jobs()` is an autonomous background process that triggers generation and sends emails. No unit, integration, or E2E tests. + +### 4. Hardcoded French strings bypass i18n + +**Confirmed by**: Frontend Expert, Tech Lead + +"Lire la suite", "Reduire", day labels in SynthesisDetail.tsx and other pages use hardcoded French instead of the i18n system. + +--- + +## High Priority Issues + +| # | Issue | Source | Impact | +|---|-------|--------|--------| +| 5 | No concurrency limit on JoinSet spawns | Rust Expert | Under load, hundreds of parallel HTTP requests | +| 6 | Silent `.ok()` on DB trace writes | Rust Expert | Lost traceability data without logging | +| 7 | Duplicated source management logic (~80%) between Sources.tsx and ThemeManager.tsx | Tech Lead | Maintenance burden | +| 8 | `createEffect` with fire-and-forget async in ThemeManager | Frontend Expert | Race conditions on rapid theme switching | +| 9 | SSE progress stream untested at integration level | QA Engineer | Only tested via flaky E2E | +| 10 | Date extraction/max_age filtering untested | QA Engineer | Pipeline filtering logic uncovered | + +--- + +## Medium Priority Issues + +| # | Issue | Source | +|---|-------|--------| +| 11 | `serde_json::Value` for categories/days/emails — should be `Vec` | Rust Expert | +| 12 | Inconsistent feedback patterns (inline banners vs Toast) | Frontend Expert | +| 13 | Duplicated delete confirmation logic (3 different patterns) | Frontend Expert | +| 14 | Raw SQL in `resolve_model()` breaking db-module abstraction | Tech Lead | +| 15 | `UpdateThemeRequest` has no validation | Rust Expert | +| 16 | Email validation too permissive (just checks `@`) | Rust Expert | +| 17 | Scheduler runs jobs sequentially — one blocks all | Rust Expert | +| 18 | 10 failing frontend unit tests | QA Engineer | +| 19 | `normalizeUrl`/`isValidUrl` utilities live in page files | Frontend Expert | + +--- + +## Positive Highlights + +All auditors noted these strengths: +- **Zero unsafe code**, zero unwrap in production paths +- **Strong security**: AES-256-GCM, SSRF with 74 tests, CSRF, anti-enumeration +- **Clean `AppError` enum** — consistent JSON responses, internal errors never leaked +- **Well-designed `LlmProvider` trait** — Strategy + Factory patterns +- **689 tests** — good coverage on scraper, providers, encryption, CRUD +- **SSE utility** is exemplary SolidJS reactive code +- **Clean API client layer** — type-safe, consistent patterns +- **Excellent module documentation** — `//!` doc comments on every module + +--- + +## Detailed Reports + +- [Architect Report](v2-architect-report.md) — SOLID principles, design patterns, architecture +- [Frontend Report](v2-frontend-report.md) — SolidJS patterns, component architecture +- [Rust Expert Report](v2-rust-expert-report.md) — Idiomatic Rust, async, safety +- [Tech Lead Report](v2-tech-lead-report.md) — Complexity, duplication, refactoring plan +- [QA Report](v2-qa-report.md) — Test coverage gaps (15 gaps found) diff --git a/docs/audit/v2-frontend-report.md b/docs/audit/v2-frontend-report.md new file mode 100644 index 0000000..e32d1f1 --- /dev/null +++ b/docs/audit/v2-frontend-report.md @@ -0,0 +1,443 @@ +# Frontend Audit Report v2 -- SolidJS + Tailwind CSS + +**Date**: 2026-03-27 +**Scope**: Pages (ThemeManager, Settings, GenerateSynthesis, Home, SynthesisDetail), settings components (SettingsSchedule, SettingsBraveSearch, SettingsRateLimit), ApiKeyManager, API layer, types, utilities +**Codebase snapshot**: commit `7c8a196` + +--- + +## Executive Summary + +The frontend is well-structured for a learning project. SolidJS primitives are used correctly and idiomatically in most places. The API layer is clean, the type system is comprehensive, and the i18n approach is sound. The main areas for improvement are: oversized page components that mix too many concerns, inconsistent feedback patterns (inline messages vs. toasts), a few SolidJS reactivity subtleties, and opportunities for better component reuse. + +--- + +## 1. SolidJS Patterns + +### 1.1 Reactive Primitives -- Correct Usage + +Signals, effects, and resources are used appropriately throughout. Notable good patterns: + +- **`createResource`** in `Settings.tsx` for providers/API keys -- idiomatic for async data that should auto-refetch. +- **`createMemo`** in `Home.tsx` (`groupedByTheme`) -- correctly memoizes an expensive grouping computation. +- **`createSSEConnection`** in `utils/sse.ts` -- elegantly wraps EventSource into reactive signals with `onCleanup`. This is textbook SolidJS. +- **Optimistic updates** in `ThemeManager.tsx` (`handleTogglePreferred`) -- updates local state first, reverts on error. Exactly the right pattern. + +### 1.2 Reactive Primitives -- Issues + +**ISSUE (Medium): `createEffect` triggering async side effects in `ThemeManager.tsx` (line 123-138)** + +```tsx +createEffect(() => { + const theme = selectedTheme(); + if (theme) { + // ...populate signals... + fetchSources(theme.id); // async fire-and-forget inside effect + } +}); +``` + +Calling an async function inside `createEffect` without tracking the returned promise means errors in `fetchSources` are silently swallowed by the effect (the `try/catch` inside `fetchSources` helps, but the pattern itself is fragile). More importantly, if `selectedTheme()` changes rapidly, multiple concurrent `fetchSources` calls can race and the final state may reflect a stale theme's sources. + +**Recommendation**: Debounce or cancel previous fetches. Consider using `createResource` keyed on `selectedThemeId()` instead, which natively handles this: +```tsx +const [sources] = createResource(selectedThemeId, (id) => sourcesApi.list(id)); +``` + +**ISSUE (Low): Duplicate `createEffect` for auto-selecting provider in `Settings.tsx` (lines 76-88 and 122-131)** + +Two separate effects handle provider-related logic. The first checks if the saved provider is still available; the second auto-selects when only one exists. These could be unified into a single effect for clarity. + +### 1.3 Show/For Usage -- Generally Correct + +- `` with the callback form `{(accessor) => ...}` is used correctly in most places (e.g., Settings.tsx line 232, ThemeManager.tsx line 489). +- `` is used everywhere lists are rendered -- correct. + +**ISSUE (Low): Multiple adjacent `` blocks with inverted conditions in `GenerateSynthesis.tsx` (lines 393-401)** + +```tsx +... +... +... +``` + +Three mutually exclusive `` blocks evaluate the same reactive expression three times. SolidJS's ``/`` is the idiomatic pattern for exclusive branching: +```tsx + + ... + ... + ... + +``` + +**ISSUE (Low): Nested `` chains in `SynthesisDetail.tsx` provenance section (lines 494-538)** + +Three nested `` blocks for loading/empty/data states. A `` would be cleaner. + +### 1.4 onCleanup -- Mostly Good + +- `ThemeManager.tsx` clears both `deleteTimer` and `deleteThemeTimer` on cleanup -- good. +- `Home.tsx` clears all delete timers from the record -- good. +- `SynthesisDetail.tsx` clears the email success timer -- good. +- `createSSEConnection` properly closes EventSource and clears retry timeout on cleanup -- excellent. + +**ISSUE (Medium): Missing `onCleanup` for navigation timer in `GenerateSynthesis.tsx`** + +The `onCleanup` on line 199 is placed inside a `createEffect`, which is correct for that specific timer. However, the SSE connection itself is only cleaned up via `onCleanup` inside `createSSEConnection`. If the component is destroyed while a generation is in flight and the user never calls `handleRetry`, the connection is properly cleaned up by the SSE utility's own `onCleanup`. This is fine -- just noting the implicit dependency. + +**ISSUE (Low): `emailTimer` stored as a signal in `SynthesisDetail.tsx` (line 134)** + +```tsx +const [emailTimer, setEmailTimer] = createSignal | undefined>(); +``` + +Timer IDs do not need to be reactive (nothing in the UI depends on the timer value). A plain `let` variable would be simpler, as done in `ThemeManager.tsx` and `Home.tsx`. The signal causes an unnecessary subscription. + +--- + +## 2. Component Architecture + +### 2.1 Oversized Components + +**ISSUE (High): `ThemeManager.tsx` -- 935 lines, ~30 signals, single monolithic component** + +This is the largest and most complex page. It manages: +- Theme CRUD (list, create, save, delete) +- Content settings editing (name, topic, categories, max age, max items, summary length) +- Source CRUD (add, delete, toggle preferred) +- Bulk import (text-based) +- CSV import/export +- Schedule delegation (via `SettingsSchedule`) + +The file defines ~30 `createSignal` calls (lines 42-81). This is a strong signal that the component should be decomposed. Suggested extraction: + +| Extracted Component | Signals Moved | Lines Saved | +|---|---|---| +| `ThemeContentForm` | editName, editThemeTopic, editCategories, editMaxAge, editMaxItems, editSummaryLength, newCategory, savingTheme, themeMessage | ~180 | +| `ThemeSourceList` | sources, loadingSources, newTitle, newUrl, adding, addError, confirmingDeleteId, deleteTimer | ~230 | +| `ThemeBulkImport` | bulkText, importing, importError | ~50 | +| `ThemeCsvImport` | csvError, fileInputRef, importing | ~40 | + +**ISSUE (Medium): `SynthesisDetail.tsx` -- 548 lines, manages email, export, delete, provenance, and display level** + +This component could benefit from extracting: +- `SynthesisEmailSection` (email state + send logic) +- `SynthesisExportSection` (markdown + PDF export) +- `SynthesisProvenanceSection` (provenance data + table) + +The `NewsItemCard` and `Section` sub-components are already well-extracted within the file (lines 36-105) -- good pattern. + +**ISSUE (Medium): `Settings.tsx` -- 694 lines** + +Already partially decomposed via `SettingsBraveSearch`, `SettingsRateLimit`, and `ApiKeyManager`. However, the remaining provider selection card (lines 356-512) is ~160 lines of dense JSX with deeply nested `` blocks. It could become a `SettingsProviderCard` component. + +**ISSUE (Low): `GenerateSynthesis.tsx` -- 471 lines** + +Acceptable size. The `completedSteps()` computation (lines 135-169) is somewhat complex but well-documented. The step checklist UI could be a separate `StepChecklist` component for readability. + +### 2.2 State Management + +The application uses a lightweight approach: `AuthContext` for global auth state, `ToastProvider` for notifications, and local signals within each page. This is appropriate for the app's complexity level. There is no global state management library, and none is needed. + +**Positive**: The `useToast` context is used consistently in newer components (`SettingsSchedule`, `SettingsBraveSearch`, `ApiKeyManager`). + +### 2.3 Reusability + +**Good reuse patterns:** +- `Button` component with variant/loading/icon props -- used across pages. +- `LoadingSpinner` with optional `fullPage` prop -- used everywhere. +- `SettingsSchedule`, `SettingsBraveSearch`, `SettingsRateLimit` -- well-scoped settings sub-components. + +**ISSUE (Medium): Duplicated inline button styles** + +Several pages define raw `