diff --git a/docs/superpowers/specs/2026-03-22-architect-remediation-design.md b/docs/superpowers/specs/2026-03-22-architect-remediation-design.md new file mode 100644 index 0000000..64be2af --- /dev/null +++ b/docs/superpowers/specs/2026-03-22-architect-remediation-design.md @@ -0,0 +1,242 @@ +# Design: Architect Assessment Remediation + +**Date**: 2026-03-22 +**Source**: docs/architect_assessment.md +**Scope**: 4 fixes — 2 API contract breaks, 1 SSRF vulnerability, 1 test reliability improvement + +--- + +## Context + +An architect assessment identified issues in the current codebase. After exploration, 3 are real issues, 1 is a test quality problem, 2 were already fixed, and 1 is not applicable (in-memory state is intentional for single-tenant design). + +**Dropped from scope:** +- Dedicated hardened HTTP client — already implemented (`build_scraper_client()` with timeouts, redirect limits, User-Agent) +- Streaming body limits — **accepted risk**. The current code downloads the full response into memory via `response.bytes()` then checks `bytes.len() > MAX_BODY_SIZE` (5 MB). A malicious server could exhaust memory before the check runs. True streaming would require a chunked reader with an accumulator cap. Accepted because: reqwest's `.timeout()` (15s) limits the window, and the 5 MB practical limit is enforced. A multi-GB response would hit the timeout before filling memory. +- Job/rate-limit state backing to Redis/DB — not applicable per decision Q-UX-3 (single-tenant, single-instance) + +--- + +## Fix 1: `/syntheses` list contract alignment + +### Problem +Backend `SynthesisListItem` returns preview metadata (`first_section_title: Option`, `first_section_item_count: usize`). Frontend `SynthesisListItem` expects `sections: NewsSection[]`. The frontend tries to index into `sections[0]` which doesn't exist in the API response. + +### Solution +Fix the frontend to align with the backend response. + +### Backend (no changes) +The backend design is correct — returning only preview data for list views is the right performance trade-off. + +### Frontend changes + +**`frontend/src/types.ts`** — update `SynthesisListItem`: +```typescript +export interface SynthesisListItem { + id: string; + week: string; + status: string; + created_at: string; + first_section_title: string | null; + first_section_item_count: number; +} +``` +Remove `sections: NewsSection[]` from this type. (The full `Synthesis` type keeps `sections` — it's used by the detail view.) + +**`frontend/src/pages/Home.tsx`** — update card preview rendering: +- Replace `synth.sections?.[0]?.items?.length > 0` with `synth.first_section_title !== null` +- Replace `synth.sections[0].items.slice(0, 4).map(...)` with a simple display of `first_section_title` and `first_section_item_count` +- Card preview shows: "{first_section_title} ({first_section_item_count} articles)" or the no-preview fallback + +### Files to modify +- `frontend/src/types.ts` +- `frontend/src/pages/Home.tsx` + +--- + +## Fix 2: Admin rate-limits update path + +### Problem +Backend route is `PUT /admin/rate-limits/{provider_name}` (expects a string like "gemini"). Frontend calls `api.put(/admin/rate-limits/${id})` passing a UUID. The request 404s because no provider has a UUID as its name. + +### Solution +Fix the frontend to pass `provider_name` instead of `id`. + +### Backend (no changes) +The backend design is correct — rate limits are keyed by provider name. + +### Frontend changes + +**`frontend/src/api/admin.ts`** — update the rate limits API: +```typescript +// Before: +update: (id: string, data: UpdateRateLimitRequest) => api.put(`/admin/rate-limits/${id}`, data) +// After: +update: (providerName: string, data: UpdateRateLimitRequest) => api.put(`/admin/rate-limits/${providerName}`, data) +``` + +**`frontend/src/pages/admin/RateLimits.tsx`** — update **only the API call** in the save handler to pass `limit.provider_name` instead of `limit.id`. The `limit.id` field continues to be used for UI state management (`savingId`, `updateLocal`) — those do NOT change. + +### Files to modify +- `frontend/src/api/admin.ts` +- `frontend/src/pages/admin/RateLimits.tsx` + +--- + +## Fix 3: SSRF redirect validation per hop + +### Problem +The scraper performs SSRF checks (DNS resolution + `is_private_ip()`) only on the initial URL. Reqwest then follows up to 3 redirects without re-checking. An attacker can host a public domain that 302-redirects to `http://169.254.169.254/` (cloud metadata) or `http://127.0.0.1/`. + +### Solution +Replace reqwest's default redirect policy with a custom one that validates each hop. + +### Implementation + +**`backend/src/services/scraper.rs`** — update `build_scraper_client()`: + +Replace `redirect(reqwest::redirect::Policy::limited(3))` with a custom policy: + +```rust +use std::net::ToSocketAddrs; + +let redirect_policy = reqwest::redirect::Policy::custom(|attempt| { + // Enforce max 3 hops + if attempt.previous().len() >= 3 { + return attempt.error("Too many redirects"); + } + + let url = attempt.url(); + + // Validate scheme — reject non-HTTP redirects + if url.scheme() != "http" && url.scheme() != "https" { + return attempt.error("Blocked redirect to non-HTTP scheme"); + } + + // Resolve DNS and check for private IPs (sync — acceptable in redirect handler) + if let Some(host) = url.host_str() { + let port = url.port().unwrap_or(if url.scheme() == "https" { 443 } else { 80 }); + let addr_str = format!("{}:{}", host, port); + match addr_str.to_socket_addrs() { + Ok(addrs) => { + for addr in addrs { + if is_private_ip(addr.ip()) { + return attempt.error("Blocked redirect to private/internal IP"); + } + } + } + Err(_) => { + // DNS resolution failed — block the redirect to prevent TOCTOU/rebinding attacks. + // If DNS fails here but succeeds when reqwest retries, the resolved IP could + // be different (DNS rebinding). Safer to fail closed. + return attempt.error("DNS resolution failed for redirect target"); + } + } + } + + attempt.follow() +}); +``` + +**Key decisions:** +- **`attempt.error()` instead of `attempt.stop()`**: `stop()` silently returns the 3xx response (no body, scraper tries to parse garbage). `error()` produces a reqwest error that propagates to the caller as an `AppError`, which is the correct behavior for a blocked SSRF attempt. +- **DNS failure blocks the redirect**: allowing a redirect when DNS fails opens a TOCTOU window (DNS rebinding). Fail closed. + +The `is_private_ip()` function is already defined and covers all IPv4/IPv6 private ranges (loopback, private, link-local, ULA, documentation, discard). It's a module-level function, accessible from the redirect policy closure. + +**Note on sync DNS:** reqwest's redirect policy closure is synchronous. `ToSocketAddrs` performs a blocking DNS lookup. This is acceptable because: +1. Redirects happen rarely (most URLs don't redirect) +2. DNS lookups are fast (< 100ms typically) +3. The alternative (disabling all redirects) breaks legitimate sites + +**Residual risk (DNS rebinding TOCTOU):** Even with per-hop DNS checks, there is a theoretical TOCTOU gap: the DNS check resolves to a public IP, then reqwest's actual connection resolves to a private IP. Fully preventing this would require a custom reqwest connector that validates IPs at the TCP connect level. This is accepted as a known limitation — the per-hop check raises the bar significantly from the current zero-check state. + +### Tests +- The `is_private_ip()` function is already thoroughly tested (13 tests covering all IP ranges) +- Add a doc comment on `build_scraper_client()` explaining the per-hop SSRF protection + +### Files to modify +- `backend/src/services/scraper.rs` + +--- + +## Fix 4: Shared test fixtures to prevent mock drift + +### Problem +Frontend test mocks are hand-crafted inline in each test file. They've already diverged from backend response shapes (e.g., Home.tsx mocks include `sections` which the backend doesn't return in list responses). This divergence means tests pass but the real app would fail. + +### Solution +Create a shared fixtures file typed against the actual TypeScript interfaces. All page tests import from it. Type changes in `types.ts` cause compilation failures in the fixture file, catching drift immediately. + +### Implementation + +**Create `frontend/src/__tests__/fixtures.ts`**: +```typescript +import type { + SynthesisListItem, + Synthesis, + Source, + UserSettings, + ProviderConfig, + AdminProvider, + AdminRateLimit, +} from '~/types'; + +// ---- Syntheses ---- +export const MOCK_SYNTHESIS_LIST_ITEM: SynthesisListItem = { + id: 'test-synth-1', + week: '2026-W12', + status: 'completed', + created_at: '2026-03-21T10:00:00Z', + first_section_title: 'Annonces majeures', + first_section_item_count: 3, +}; + +export const MOCK_SYNTHESIS_LIST: SynthesisListItem[] = [ + MOCK_SYNTHESIS_LIST_ITEM, + { ...MOCK_SYNTHESIS_LIST_ITEM, id: 'test-synth-2', week: '2026-W11' }, +]; + +// ---- Sources ---- +// Fill with valid values matching the Source interface +export const MOCK_SOURCE: Source = { id: 'src-1', user_id: 'u1', title: 'Test Blog', url: 'https://test.example.com/blog', created_at: '2026-03-21T10:00:00Z' }; + +// ---- Settings ---- +// Use DEFAULT_SETTINGS as base, override specific fields +export const MOCK_SETTINGS: UserSettings = { ...DEFAULT_SETTINGS, theme: 'Test Theme', ai_provider: 'gemini', ai_model: 'gemini-2.5-flash' }; + +// ---- Providers ---- +export const MOCK_PROVIDER_CONFIG: ProviderConfig = { provider_name: 'gemini', display_name: 'Google Gemini', models: [{ model_id: 'gemini-2.5-flash', display_name: 'Gemini 2.5 Flash' }] }; +``` + +**Update all page test files** to import from fixtures instead of defining inline mocks. + +### Files to modify +- Create: `frontend/src/__tests__/fixtures.ts` +- Modify: `frontend/src/__tests__/pages/home.test.tsx` +- Modify: `frontend/src/__tests__/pages/settings.test.tsx` +- Modify: `frontend/src/__tests__/pages/sources.test.tsx` +- Modify: `frontend/src/__tests__/pages/generate.test.tsx` + +--- + +## Implementation Order + +1. **Fix 1** (syntheses list contract) — must be first because it changes `SynthesisListItem` type used by fixtures +2. **Fix 2** (rate-limits path) — independent, quick +3. **Fix 4** (shared fixtures) — depends on Fix 1 for the corrected type +4. **Fix 3** (SSRF redirect) — independent, backend-only + +--- + +## Verification + +### Backend +```bash +cd backend && cargo check && cargo test --lib +``` + +### Frontend +```bash +cd frontend && npx tsc --noEmit && npx vitest run && npx vite build +```