Add architect remediation design spec
4 fixes: syntheses list contract alignment (frontend to backend), admin rate-limits path fix (provider_name not id), SSRF redirect per-hop validation with attempt.error(), shared test fixtures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>master
parent
7769f56410
commit
d20fa20ed2
@ -0,0 +1,242 @@
|
||||
# Design: Architect Assessment Remediation
|
||||
|
||||
**Date**: 2026-03-22
|
||||
**Source**: docs/architect_assessment.md
|
||||
**Scope**: 4 fixes — 2 API contract breaks, 1 SSRF vulnerability, 1 test reliability improvement
|
||||
|
||||
---
|
||||
|
||||
## Context
|
||||
|
||||
An architect assessment identified issues in the current codebase. After exploration, 3 are real issues, 1 is a test quality problem, 2 were already fixed, and 1 is not applicable (in-memory state is intentional for single-tenant design).
|
||||
|
||||
**Dropped from scope:**
|
||||
- Dedicated hardened HTTP client — already implemented (`build_scraper_client()` with timeouts, redirect limits, User-Agent)
|
||||
- Streaming body limits — **accepted risk**. The current code downloads the full response into memory via `response.bytes()` then checks `bytes.len() > MAX_BODY_SIZE` (5 MB). A malicious server could exhaust memory before the check runs. True streaming would require a chunked reader with an accumulator cap. Accepted because: reqwest's `.timeout()` (15s) limits the window, and the 5 MB practical limit is enforced. A multi-GB response would hit the timeout before filling memory.
|
||||
- Job/rate-limit state backing to Redis/DB — not applicable per decision Q-UX-3 (single-tenant, single-instance)
|
||||
|
||||
---
|
||||
|
||||
## Fix 1: `/syntheses` list contract alignment
|
||||
|
||||
### Problem
|
||||
Backend `SynthesisListItem` returns preview metadata (`first_section_title: Option<String>`, `first_section_item_count: usize`). Frontend `SynthesisListItem` expects `sections: NewsSection[]`. The frontend tries to index into `sections[0]` which doesn't exist in the API response.
|
||||
|
||||
### Solution
|
||||
Fix the frontend to align with the backend response.
|
||||
|
||||
### Backend (no changes)
|
||||
The backend design is correct — returning only preview data for list views is the right performance trade-off.
|
||||
|
||||
### Frontend changes
|
||||
|
||||
**`frontend/src/types.ts`** — update `SynthesisListItem`:
|
||||
```typescript
|
||||
export interface SynthesisListItem {
|
||||
id: string;
|
||||
week: string;
|
||||
status: string;
|
||||
created_at: string;
|
||||
first_section_title: string | null;
|
||||
first_section_item_count: number;
|
||||
}
|
||||
```
|
||||
Remove `sections: NewsSection[]` from this type. (The full `Synthesis` type keeps `sections` — it's used by the detail view.)
|
||||
|
||||
**`frontend/src/pages/Home.tsx`** — update card preview rendering:
|
||||
- Replace `synth.sections?.[0]?.items?.length > 0` with `synth.first_section_title !== null`
|
||||
- Replace `synth.sections[0].items.slice(0, 4).map(...)` with a simple display of `first_section_title` and `first_section_item_count`
|
||||
- Card preview shows: "{first_section_title} ({first_section_item_count} articles)" or the no-preview fallback
|
||||
|
||||
### Files to modify
|
||||
- `frontend/src/types.ts`
|
||||
- `frontend/src/pages/Home.tsx`
|
||||
|
||||
---
|
||||
|
||||
## Fix 2: Admin rate-limits update path
|
||||
|
||||
### Problem
|
||||
Backend route is `PUT /admin/rate-limits/{provider_name}` (expects a string like "gemini"). Frontend calls `api.put(/admin/rate-limits/${id})` passing a UUID. The request 404s because no provider has a UUID as its name.
|
||||
|
||||
### Solution
|
||||
Fix the frontend to pass `provider_name` instead of `id`.
|
||||
|
||||
### Backend (no changes)
|
||||
The backend design is correct — rate limits are keyed by provider name.
|
||||
|
||||
### Frontend changes
|
||||
|
||||
**`frontend/src/api/admin.ts`** — update the rate limits API:
|
||||
```typescript
|
||||
// Before:
|
||||
update: (id: string, data: UpdateRateLimitRequest) => api.put(`/admin/rate-limits/${id}`, data)
|
||||
// After:
|
||||
update: (providerName: string, data: UpdateRateLimitRequest) => api.put(`/admin/rate-limits/${providerName}`, data)
|
||||
```
|
||||
|
||||
**`frontend/src/pages/admin/RateLimits.tsx`** — update **only the API call** in the save handler to pass `limit.provider_name` instead of `limit.id`. The `limit.id` field continues to be used for UI state management (`savingId`, `updateLocal`) — those do NOT change.
|
||||
|
||||
### Files to modify
|
||||
- `frontend/src/api/admin.ts`
|
||||
- `frontend/src/pages/admin/RateLimits.tsx`
|
||||
|
||||
---
|
||||
|
||||
## Fix 3: SSRF redirect validation per hop
|
||||
|
||||
### Problem
|
||||
The scraper performs SSRF checks (DNS resolution + `is_private_ip()`) only on the initial URL. Reqwest then follows up to 3 redirects without re-checking. An attacker can host a public domain that 302-redirects to `http://169.254.169.254/` (cloud metadata) or `http://127.0.0.1/`.
|
||||
|
||||
### Solution
|
||||
Replace reqwest's default redirect policy with a custom one that validates each hop.
|
||||
|
||||
### Implementation
|
||||
|
||||
**`backend/src/services/scraper.rs`** — update `build_scraper_client()`:
|
||||
|
||||
Replace `redirect(reqwest::redirect::Policy::limited(3))` with a custom policy:
|
||||
|
||||
```rust
|
||||
use std::net::ToSocketAddrs;
|
||||
|
||||
let redirect_policy = reqwest::redirect::Policy::custom(|attempt| {
|
||||
// Enforce max 3 hops
|
||||
if attempt.previous().len() >= 3 {
|
||||
return attempt.error("Too many redirects");
|
||||
}
|
||||
|
||||
let url = attempt.url();
|
||||
|
||||
// Validate scheme — reject non-HTTP redirects
|
||||
if url.scheme() != "http" && url.scheme() != "https" {
|
||||
return attempt.error("Blocked redirect to non-HTTP scheme");
|
||||
}
|
||||
|
||||
// Resolve DNS and check for private IPs (sync — acceptable in redirect handler)
|
||||
if let Some(host) = url.host_str() {
|
||||
let port = url.port().unwrap_or(if url.scheme() == "https" { 443 } else { 80 });
|
||||
let addr_str = format!("{}:{}", host, port);
|
||||
match addr_str.to_socket_addrs() {
|
||||
Ok(addrs) => {
|
||||
for addr in addrs {
|
||||
if is_private_ip(addr.ip()) {
|
||||
return attempt.error("Blocked redirect to private/internal IP");
|
||||
}
|
||||
}
|
||||
}
|
||||
Err(_) => {
|
||||
// DNS resolution failed — block the redirect to prevent TOCTOU/rebinding attacks.
|
||||
// If DNS fails here but succeeds when reqwest retries, the resolved IP could
|
||||
// be different (DNS rebinding). Safer to fail closed.
|
||||
return attempt.error("DNS resolution failed for redirect target");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
attempt.follow()
|
||||
});
|
||||
```
|
||||
|
||||
**Key decisions:**
|
||||
- **`attempt.error()` instead of `attempt.stop()`**: `stop()` silently returns the 3xx response (no body, scraper tries to parse garbage). `error()` produces a reqwest error that propagates to the caller as an `AppError`, which is the correct behavior for a blocked SSRF attempt.
|
||||
- **DNS failure blocks the redirect**: allowing a redirect when DNS fails opens a TOCTOU window (DNS rebinding). Fail closed.
|
||||
|
||||
The `is_private_ip()` function is already defined and covers all IPv4/IPv6 private ranges (loopback, private, link-local, ULA, documentation, discard). It's a module-level function, accessible from the redirect policy closure.
|
||||
|
||||
**Note on sync DNS:** reqwest's redirect policy closure is synchronous. `ToSocketAddrs` performs a blocking DNS lookup. This is acceptable because:
|
||||
1. Redirects happen rarely (most URLs don't redirect)
|
||||
2. DNS lookups are fast (< 100ms typically)
|
||||
3. The alternative (disabling all redirects) breaks legitimate sites
|
||||
|
||||
**Residual risk (DNS rebinding TOCTOU):** Even with per-hop DNS checks, there is a theoretical TOCTOU gap: the DNS check resolves to a public IP, then reqwest's actual connection resolves to a private IP. Fully preventing this would require a custom reqwest connector that validates IPs at the TCP connect level. This is accepted as a known limitation — the per-hop check raises the bar significantly from the current zero-check state.
|
||||
|
||||
### Tests
|
||||
- The `is_private_ip()` function is already thoroughly tested (13 tests covering all IP ranges)
|
||||
- Add a doc comment on `build_scraper_client()` explaining the per-hop SSRF protection
|
||||
|
||||
### Files to modify
|
||||
- `backend/src/services/scraper.rs`
|
||||
|
||||
---
|
||||
|
||||
## Fix 4: Shared test fixtures to prevent mock drift
|
||||
|
||||
### Problem
|
||||
Frontend test mocks are hand-crafted inline in each test file. They've already diverged from backend response shapes (e.g., Home.tsx mocks include `sections` which the backend doesn't return in list responses). This divergence means tests pass but the real app would fail.
|
||||
|
||||
### Solution
|
||||
Create a shared fixtures file typed against the actual TypeScript interfaces. All page tests import from it. Type changes in `types.ts` cause compilation failures in the fixture file, catching drift immediately.
|
||||
|
||||
### Implementation
|
||||
|
||||
**Create `frontend/src/__tests__/fixtures.ts`**:
|
||||
```typescript
|
||||
import type {
|
||||
SynthesisListItem,
|
||||
Synthesis,
|
||||
Source,
|
||||
UserSettings,
|
||||
ProviderConfig,
|
||||
AdminProvider,
|
||||
AdminRateLimit,
|
||||
} from '~/types';
|
||||
|
||||
// ---- Syntheses ----
|
||||
export const MOCK_SYNTHESIS_LIST_ITEM: SynthesisListItem = {
|
||||
id: 'test-synth-1',
|
||||
week: '2026-W12',
|
||||
status: 'completed',
|
||||
created_at: '2026-03-21T10:00:00Z',
|
||||
first_section_title: 'Annonces majeures',
|
||||
first_section_item_count: 3,
|
||||
};
|
||||
|
||||
export const MOCK_SYNTHESIS_LIST: SynthesisListItem[] = [
|
||||
MOCK_SYNTHESIS_LIST_ITEM,
|
||||
{ ...MOCK_SYNTHESIS_LIST_ITEM, id: 'test-synth-2', week: '2026-W11' },
|
||||
];
|
||||
|
||||
// ---- Sources ----
|
||||
// Fill with valid values matching the Source interface
|
||||
export const MOCK_SOURCE: Source = { id: 'src-1', user_id: 'u1', title: 'Test Blog', url: 'https://test.example.com/blog', created_at: '2026-03-21T10:00:00Z' };
|
||||
|
||||
// ---- Settings ----
|
||||
// Use DEFAULT_SETTINGS as base, override specific fields
|
||||
export const MOCK_SETTINGS: UserSettings = { ...DEFAULT_SETTINGS, theme: 'Test Theme', ai_provider: 'gemini', ai_model: 'gemini-2.5-flash' };
|
||||
|
||||
// ---- Providers ----
|
||||
export const MOCK_PROVIDER_CONFIG: ProviderConfig = { provider_name: 'gemini', display_name: 'Google Gemini', models: [{ model_id: 'gemini-2.5-flash', display_name: 'Gemini 2.5 Flash' }] };
|
||||
```
|
||||
|
||||
**Update all page test files** to import from fixtures instead of defining inline mocks.
|
||||
|
||||
### Files to modify
|
||||
- Create: `frontend/src/__tests__/fixtures.ts`
|
||||
- Modify: `frontend/src/__tests__/pages/home.test.tsx`
|
||||
- Modify: `frontend/src/__tests__/pages/settings.test.tsx`
|
||||
- Modify: `frontend/src/__tests__/pages/sources.test.tsx`
|
||||
- Modify: `frontend/src/__tests__/pages/generate.test.tsx`
|
||||
|
||||
---
|
||||
|
||||
## Implementation Order
|
||||
|
||||
1. **Fix 1** (syntheses list contract) — must be first because it changes `SynthesisListItem` type used by fixtures
|
||||
2. **Fix 2** (rate-limits path) — independent, quick
|
||||
3. **Fix 4** (shared fixtures) — depends on Fix 1 for the corrected type
|
||||
4. **Fix 3** (SSRF redirect) — independent, backend-only
|
||||
|
||||
---
|
||||
|
||||
## Verification
|
||||
|
||||
### Backend
|
||||
```bash
|
||||
cd backend && cargo check && cargo test --lib
|
||||
```
|
||||
|
||||
### Frontend
|
||||
```bash
|
||||
cd frontend && npx tsc --noEmit && npx vitest run && npx vite build
|
||||
```
|
||||
Loading…
Reference in New Issue