Add architect remediation design spec

4 fixes: syntheses list contract alignment (frontend to backend), admin rate-limits path fix (provider_name not id), SSRF redirect per-hop validation with attempt.error(), shared test fixtures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago · d20fa20ed2
parent 7769f56410
commit d20fa20ed2
1 changed files with 242 additions and 0 deletions
--- a/docs/superpowers/specs/2026-03-22-architect-remediation-design.md
+++ b/docs/superpowers/specs/2026-03-22-architect-remediation-design.md
@ -0,0 +1,242 @@
+# Design: Architect Assessment Remediation
+
+**Date**: 2026-03-22
+**Source**: docs/architect_assessment.md
+**Scope**: 4 fixes — 2 API contract breaks, 1 SSRF vulnerability, 1 test reliability improvement
+
+---
+
+## Context
+
+An architect assessment identified issues in the current codebase. After exploration, 3 are real issues, 1 is a test quality problem, 2 were already fixed, and 1 is not applicable (in-memory state is intentional for single-tenant design).
+
+**Dropped from scope:**
+- Dedicated hardened HTTP client — already implemented (`build_scraper_client()` with timeouts, redirect limits, User-Agent)
+- Streaming body limits — **accepted risk**. The current code downloads the full response into memory via `response.bytes()` then checks `bytes.len() > MAX_BODY_SIZE` (5 MB). A malicious server could exhaust memory before the check runs. True streaming would require a chunked reader with an accumulator cap. Accepted because: reqwest's `.timeout()` (15s) limits the window, and the 5 MB practical limit is enforced. A multi-GB response would hit the timeout before filling memory.
+- Job/rate-limit state backing to Redis/DB — not applicable per decision Q-UX-3 (single-tenant, single-instance)
+
+---
+
+## Fix 1: `/syntheses` list contract alignment
+
+### Problem
+Backend `SynthesisListItem` returns preview metadata (`first_section_title: Option<String>`, `first_section_item_count: usize`). Frontend `SynthesisListItem` expects `sections: NewsSection[]`. The frontend tries to index into `sections[0]` which doesn't exist in the API response.
+
+### Solution
+Fix the frontend to align with the backend response.
+
+### Backend (no changes)
+The backend design is correct — returning only preview data for list views is the right performance trade-off.
+
+### Frontend changes
+
+**`frontend/src/types.ts`** — update `SynthesisListItem`:
+```typescript
+export interface SynthesisListItem {
+  id: string;
+  week: string;
+  status: string;
+  created_at: string;
+  first_section_title: string | null;
+  first_section_item_count: number;
+}
+```
+Remove `sections: NewsSection[]` from this type. (The full `Synthesis` type keeps `sections` — it's used by the detail view.)
+
+**`frontend/src/pages/Home.tsx`** — update card preview rendering:
+- Replace `synth.sections?.[0]?.items?.length > 0` with `synth.first_section_title !== null`
+- Replace `synth.sections[0].items.slice(0, 4).map(...)` with a simple display of `first_section_title` and `first_section_item_count`
+- Card preview shows: "{first_section_title} ({first_section_item_count} articles)" or the no-preview fallback
+
+### Files to modify
+- `frontend/src/types.ts`
+- `frontend/src/pages/Home.tsx`
+
+---
+
+## Fix 2: Admin rate-limits update path
+
+### Problem
+Backend route is `PUT /admin/rate-limits/{provider_name}` (expects a string like "gemini"). Frontend calls `api.put(/admin/rate-limits/${id})` passing a UUID. The request 404s because no provider has a UUID as its name.
+
+### Solution
+Fix the frontend to pass `provider_name` instead of `id`.
+
+### Backend (no changes)
+The backend design is correct — rate limits are keyed by provider name.
+
+### Frontend changes
+
+**`frontend/src/api/admin.ts`** — update the rate limits API:
+```typescript
+// Before:
+update: (id: string, data: UpdateRateLimitRequest) => api.put(`/admin/rate-limits/${id}`, data)
+// After:
+update: (providerName: string, data: UpdateRateLimitRequest) => api.put(`/admin/rate-limits/${providerName}`, data)
+```
+
+**`frontend/src/pages/admin/RateLimits.tsx`** — update **only the API call** in the save handler to pass `limit.provider_name` instead of `limit.id`. The `limit.id` field continues to be used for UI state management (`savingId`, `updateLocal`) — those do NOT change.
+
+### Files to modify
+- `frontend/src/api/admin.ts`
+- `frontend/src/pages/admin/RateLimits.tsx`
+
+---
+
+## Fix 3: SSRF redirect validation per hop
+
+### Problem
+The scraper performs SSRF checks (DNS resolution + `is_private_ip()`) only on the initial URL. Reqwest then follows up to 3 redirects without re-checking. An attacker can host a public domain that 302-redirects to `http://169.254.169.254/` (cloud metadata) or `http://127.0.0.1/`.
+
+### Solution
+Replace reqwest's default redirect policy with a custom one that validates each hop.
+
+### Implementation
+
+**`backend/src/services/scraper.rs`** — update `build_scraper_client()`:
+
+Replace `redirect(reqwest::redirect::Policy::limited(3))` with a custom policy:
+
+```rust
+use std::net::ToSocketAddrs;
+
+let redirect_policy = reqwest::redirect::Policy::custom(|attempt| {
+    // Enforce max 3 hops
+    if attempt.previous().len() >= 3 {
+        return attempt.error("Too many redirects");
+    }
+
+    let url = attempt.url();
+
+    // Validate scheme — reject non-HTTP redirects
+    if url.scheme() != "http" && url.scheme() != "https" {
+        return attempt.error("Blocked redirect to non-HTTP scheme");
+    }
+
+    // Resolve DNS and check for private IPs (sync — acceptable in redirect handler)
+    if let Some(host) = url.host_str() {
+        let port = url.port().unwrap_or(if url.scheme() == "https" { 443 } else { 80 });
+        let addr_str = format!("{}:{}", host, port);
+        match addr_str.to_socket_addrs() {
+            Ok(addrs) => {
+                for addr in addrs {
+                    if is_private_ip(addr.ip()) {
+                        return attempt.error("Blocked redirect to private/internal IP");
+                    }
+                }
+            }
+            Err(_) => {
+                // DNS resolution failed — block the redirect to prevent TOCTOU/rebinding attacks.
+                // If DNS fails here but succeeds when reqwest retries, the resolved IP could
+                // be different (DNS rebinding). Safer to fail closed.
+                return attempt.error("DNS resolution failed for redirect target");
+            }
+        }
+    }
+
+    attempt.follow()
+});
+```
+
+**Key decisions:**
+- **`attempt.error()` instead of `attempt.stop()`**: `stop()` silently returns the 3xx response (no body, scraper tries to parse garbage). `error()` produces a reqwest error that propagates to the caller as an `AppError`, which is the correct behavior for a blocked SSRF attempt.
+- **DNS failure blocks the redirect**: allowing a redirect when DNS fails opens a TOCTOU window (DNS rebinding). Fail closed.
+
+The `is_private_ip()` function is already defined and covers all IPv4/IPv6 private ranges (loopback, private, link-local, ULA, documentation, discard). It's a module-level function, accessible from the redirect policy closure.
+
+**Note on sync DNS:** reqwest's redirect policy closure is synchronous. `ToSocketAddrs` performs a blocking DNS lookup. This is acceptable because:
+1. Redirects happen rarely (most URLs don't redirect)
+2. DNS lookups are fast (< 100ms typically)
+3. The alternative (disabling all redirects) breaks legitimate sites
+
+**Residual risk (DNS rebinding TOCTOU):** Even with per-hop DNS checks, there is a theoretical TOCTOU gap: the DNS check resolves to a public IP, then reqwest's actual connection resolves to a private IP. Fully preventing this would require a custom reqwest connector that validates IPs at the TCP connect level. This is accepted as a known limitation — the per-hop check raises the bar significantly from the current zero-check state.
+
+### Tests
+- The `is_private_ip()` function is already thoroughly tested (13 tests covering all IP ranges)
+- Add a doc comment on `build_scraper_client()` explaining the per-hop SSRF protection
+
+### Files to modify
+- `backend/src/services/scraper.rs`
+
+---
+
+## Fix 4: Shared test fixtures to prevent mock drift
+
+### Problem
+Frontend test mocks are hand-crafted inline in each test file. They've already diverged from backend response shapes (e.g., Home.tsx mocks include `sections` which the backend doesn't return in list responses). This divergence means tests pass but the real app would fail.
+
+### Solution
+Create a shared fixtures file typed against the actual TypeScript interfaces. All page tests import from it. Type changes in `types.ts` cause compilation failures in the fixture file, catching drift immediately.
+
+### Implementation
+
+**Create `frontend/src/__tests__/fixtures.ts`**:
+```typescript
+import type {
+  SynthesisListItem,
+  Synthesis,
+  Source,
+  UserSettings,
+  ProviderConfig,
+  AdminProvider,
+  AdminRateLimit,
+} from '~/types';
+
+// ---- Syntheses ----
+export const MOCK_SYNTHESIS_LIST_ITEM: SynthesisListItem = {
+  id: 'test-synth-1',
+  week: '2026-W12',
+  status: 'completed',
+  created_at: '2026-03-21T10:00:00Z',
+  first_section_title: 'Annonces majeures',
+  first_section_item_count: 3,
+};
+
+export const MOCK_SYNTHESIS_LIST: SynthesisListItem[] = [
+  MOCK_SYNTHESIS_LIST_ITEM,
+  { ...MOCK_SYNTHESIS_LIST_ITEM, id: 'test-synth-2', week: '2026-W11' },
+];
+
+// ---- Sources ----
+// Fill with valid values matching the Source interface
+export const MOCK_SOURCE: Source = { id: 'src-1', user_id: 'u1', title: 'Test Blog', url: 'https://test.example.com/blog', created_at: '2026-03-21T10:00:00Z' };
+
+// ---- Settings ----
+// Use DEFAULT_SETTINGS as base, override specific fields
+export const MOCK_SETTINGS: UserSettings = { ...DEFAULT_SETTINGS, theme: 'Test Theme', ai_provider: 'gemini', ai_model: 'gemini-2.5-flash' };
+
+// ---- Providers ----
+export const MOCK_PROVIDER_CONFIG: ProviderConfig = { provider_name: 'gemini', display_name: 'Google Gemini', models: [{ model_id: 'gemini-2.5-flash', display_name: 'Gemini 2.5 Flash' }] };
+```
+
+**Update all page test files** to import from fixtures instead of defining inline mocks.
+
+### Files to modify
+- Create: `frontend/src/__tests__/fixtures.ts`
+- Modify: `frontend/src/__tests__/pages/home.test.tsx`
+- Modify: `frontend/src/__tests__/pages/settings.test.tsx`
+- Modify: `frontend/src/__tests__/pages/sources.test.tsx`
+- Modify: `frontend/src/__tests__/pages/generate.test.tsx`
+
+---
+
+## Implementation Order
+
+1. **Fix 1** (syntheses list contract) — must be first because it changes `SynthesisListItem` type used by fixtures
+2. **Fix 2** (rate-limits path) — independent, quick
+3. **Fix 4** (shared fixtures) — depends on Fix 1 for the corrected type
+4. **Fix 3** (SSRF redirect) — independent, backend-only
+
+---
+
+## Verification
+
+### Backend
+```bash
+cd backend && cargo check && cargo test --lib
+```
+
+### Frontend
+```bash
+cd frontend && npx tsc --noEmit && npx vitest run && npx vite build
+```