# Design: Architect Assessment Remediation **Date**: 2026-03-22 **Source**: docs/architect_assessment.md **Scope**: 4 fixes — 2 API contract breaks, 1 SSRF vulnerability, 1 test reliability improvement --- ## Context An architect assessment identified issues in the current codebase. After exploration, 3 are real issues, 1 is a test quality problem, 2 were already fixed, and 1 is not applicable (in-memory state is intentional for single-tenant design). **Dropped from scope:** - Dedicated hardened HTTP client — already implemented (`build_scraper_client()` with timeouts, redirect limits, User-Agent) - Streaming body limits — **accepted risk**. The current code downloads the full response into memory via `response.bytes()` then checks `bytes.len() > MAX_BODY_SIZE` (5 MB). A malicious server could exhaust memory before the check runs. True streaming would require a chunked reader with an accumulator cap. Accepted because: reqwest's `.timeout()` (15s) limits the window, and the 5 MB practical limit is enforced. A multi-GB response would hit the timeout before filling memory. - Job/rate-limit state backing to Redis/DB — not applicable per decision Q-UX-3 (single-tenant, single-instance) --- ## Fix 1: `/syntheses` list contract alignment ### Problem Backend `SynthesisListItem` returns preview metadata (`first_section_title: Option`, `first_section_item_count: usize`). Frontend `SynthesisListItem` expects `sections: NewsSection[]`. The frontend tries to index into `sections[0]` which doesn't exist in the API response. ### Solution Fix the frontend to align with the backend response. ### Backend (no changes) The backend design is correct — returning only preview data for list views is the right performance trade-off. ### Frontend changes **`frontend/src/types.ts`** — update `SynthesisListItem`: ```typescript export interface SynthesisListItem { id: string; week: string; status: string; created_at: string; first_section_title: string | null; first_section_item_count: number; } ``` Remove `sections: NewsSection[]` from this type. (The full `Synthesis` type keeps `sections` — it's used by the detail view.) **`frontend/src/pages/Home.tsx`** — update card preview rendering: - Replace `synth.sections?.[0]?.items?.length > 0` with `synth.first_section_title !== null` - Replace `synth.sections[0].items.slice(0, 4).map(...)` with a simple display of `first_section_title` and `first_section_item_count` - Card preview shows: "{first_section_title} ({first_section_item_count} articles)" or the no-preview fallback ### Files to modify - `frontend/src/types.ts` - `frontend/src/pages/Home.tsx` --- ## Fix 2: Admin rate-limits update path ### Problem Backend route is `PUT /admin/rate-limits/{provider_name}` (expects a string like "gemini"). Frontend calls `api.put(/admin/rate-limits/${id})` passing a UUID. The request 404s because no provider has a UUID as its name. ### Solution Fix the frontend to pass `provider_name` instead of `id`. ### Backend (no changes) The backend design is correct — rate limits are keyed by provider name. ### Frontend changes **`frontend/src/api/admin.ts`** — update the rate limits API: ```typescript // Before: update: (id: string, data: UpdateRateLimitRequest) => api.put(`/admin/rate-limits/${id}`, data) // After: update: (providerName: string, data: UpdateRateLimitRequest) => api.put(`/admin/rate-limits/${providerName}`, data) ``` **`frontend/src/pages/admin/RateLimits.tsx`** — update **only the API call** in the save handler to pass `limit.provider_name` instead of `limit.id`. The `limit.id` field continues to be used for UI state management (`savingId`, `updateLocal`) — those do NOT change. ### Files to modify - `frontend/src/api/admin.ts` - `frontend/src/pages/admin/RateLimits.tsx` --- ## Fix 3: SSRF redirect validation per hop ### Problem The scraper performs SSRF checks (DNS resolution + `is_private_ip()`) only on the initial URL. Reqwest then follows up to 3 redirects without re-checking. An attacker can host a public domain that 302-redirects to `http://169.254.169.254/` (cloud metadata) or `http://127.0.0.1/`. ### Solution Replace reqwest's default redirect policy with a custom one that validates each hop. ### Implementation **`backend/src/services/scraper.rs`** — update `build_scraper_client()`: Replace `redirect(reqwest::redirect::Policy::limited(3))` with a custom policy: ```rust use std::net::ToSocketAddrs; let redirect_policy = reqwest::redirect::Policy::custom(|attempt| { // Enforce max 3 hops if attempt.previous().len() >= 3 { return attempt.error("Too many redirects"); } let url = attempt.url(); // Validate scheme — reject non-HTTP redirects if url.scheme() != "http" && url.scheme() != "https" { return attempt.error("Blocked redirect to non-HTTP scheme"); } // Resolve DNS and check for private IPs (sync — acceptable in redirect handler) if let Some(host) = url.host_str() { let port = url.port().unwrap_or(if url.scheme() == "https" { 443 } else { 80 }); let addr_str = format!("{}:{}", host, port); match addr_str.to_socket_addrs() { Ok(addrs) => { for addr in addrs { if is_private_ip(addr.ip()) { return attempt.error("Blocked redirect to private/internal IP"); } } } Err(_) => { // DNS resolution failed — block the redirect to prevent TOCTOU/rebinding attacks. // If DNS fails here but succeeds when reqwest retries, the resolved IP could // be different (DNS rebinding). Safer to fail closed. return attempt.error("DNS resolution failed for redirect target"); } } } attempt.follow() }); ``` **Key decisions:** - **`attempt.error()` instead of `attempt.stop()`**: `stop()` silently returns the 3xx response (no body, scraper tries to parse garbage). `error()` produces a reqwest error that propagates to the caller as an `AppError`, which is the correct behavior for a blocked SSRF attempt. - **DNS failure blocks the redirect**: allowing a redirect when DNS fails opens a TOCTOU window (DNS rebinding). Fail closed. The `is_private_ip()` function is already defined and covers all IPv4/IPv6 private ranges (loopback, private, link-local, ULA, documentation, discard). It's a module-level function, accessible from the redirect policy closure. **Note on sync DNS:** reqwest's redirect policy closure is synchronous. `ToSocketAddrs` performs a blocking DNS lookup. This is acceptable because: 1. Redirects happen rarely (most URLs don't redirect) 2. DNS lookups are fast (< 100ms typically) 3. The alternative (disabling all redirects) breaks legitimate sites **Residual risk (DNS rebinding TOCTOU):** Even with per-hop DNS checks, there is a theoretical TOCTOU gap: the DNS check resolves to a public IP, then reqwest's actual connection resolves to a private IP. Fully preventing this would require a custom reqwest connector that validates IPs at the TCP connect level. This is accepted as a known limitation — the per-hop check raises the bar significantly from the current zero-check state. ### Tests - The `is_private_ip()` function is already thoroughly tested (13 tests covering all IP ranges) - Add a doc comment on `build_scraper_client()` explaining the per-hop SSRF protection ### Files to modify - `backend/src/services/scraper.rs` --- ## Fix 4: Shared test fixtures to prevent mock drift ### Problem Frontend test mocks are hand-crafted inline in each test file. They've already diverged from backend response shapes (e.g., Home.tsx mocks include `sections` which the backend doesn't return in list responses). This divergence means tests pass but the real app would fail. ### Solution Create a shared fixtures file typed against the actual TypeScript interfaces. All page tests import from it. Type changes in `types.ts` cause compilation failures in the fixture file, catching drift immediately. ### Implementation **Create `frontend/src/__tests__/fixtures.ts`**: ```typescript import type { SynthesisListItem, Synthesis, Source, UserSettings, ProviderConfig, AdminProvider, AdminRateLimit, } from '~/types'; // ---- Syntheses ---- export const MOCK_SYNTHESIS_LIST_ITEM: SynthesisListItem = { id: 'test-synth-1', week: '2026-W12', status: 'completed', created_at: '2026-03-21T10:00:00Z', first_section_title: 'Annonces majeures', first_section_item_count: 3, }; export const MOCK_SYNTHESIS_LIST: SynthesisListItem[] = [ MOCK_SYNTHESIS_LIST_ITEM, { ...MOCK_SYNTHESIS_LIST_ITEM, id: 'test-synth-2', week: '2026-W11' }, ]; // ---- Sources ---- // Fill with valid values matching the Source interface export const MOCK_SOURCE: Source = { id: 'src-1', user_id: 'u1', title: 'Test Blog', url: 'https://test.example.com/blog', created_at: '2026-03-21T10:00:00Z' }; // ---- Settings ---- // Use DEFAULT_SETTINGS as base, override specific fields export const MOCK_SETTINGS: UserSettings = { ...DEFAULT_SETTINGS, theme: 'Test Theme', ai_provider: 'gemini', ai_model: 'gemini-2.5-flash' }; // ---- Providers ---- export const MOCK_PROVIDER_CONFIG: ProviderConfig = { provider_name: 'gemini', display_name: 'Google Gemini', models: [{ model_id: 'gemini-2.5-flash', display_name: 'Gemini 2.5 Flash' }] }; ``` **Update all page test files** to import from fixtures instead of defining inline mocks. ### Files to modify - Create: `frontend/src/__tests__/fixtures.ts` - Modify: `frontend/src/__tests__/pages/home.test.tsx` - Modify: `frontend/src/__tests__/pages/settings.test.tsx` - Modify: `frontend/src/__tests__/pages/sources.test.tsx` - Modify: `frontend/src/__tests__/pages/generate.test.tsx` --- ## Implementation Order 1. **Fix 1** (syntheses list contract) — must be first because it changes `SynthesisListItem` type used by fixtures 2. **Fix 2** (rate-limits path) — independent, quick 3. **Fix 4** (shared fixtures) — depends on Fix 1 for the corrected type 4. **Fix 3** (SSRF redirect) — independent, backend-only --- ## Verification ### Backend ```bash cd backend && cargo check && cargo test --lib ``` ### Frontend ```bash cd frontend && npx tsc --noEmit && npx vitest run && npx vite build ```