Phased Delivery Roadmap: AI Weekly Synth Rewrite

Date: 2026-03-21
Author: Phase Roadmap Planner
Input: Team analysis (01-04) and project decisions (05)

Overview

This roadmap decomposes the AI Weekly Synth rewrite into 7 phases. Each phase produces a working, deployable application. The phases are ordered to deliver value incrementally: Phase 1 proves the stack works end-to-end, and each subsequent phase adds exactly one major capability.

The decisions document establishes: Rust (Axum) + Postgres + SolidJS + Tailwind CSS, all three LLM providers (Gemini, OpenAI, Anthropic), user-provided API keys, email + magic link auth, Docker-only deployment, no data migration, and testing as part of the plan.

Dependency Graph

Phase 1: Foundation (Axum + Postgres + SolidJS + Auth + Settings CRUD)
   |
   +---> Phase 2: Sources CRUD + Scraper Service
   |        |
   |        +---> Phase 4: LLM Provider Abstraction (Gemini first)
   |                 |
   |                 +---> Phase 5: Generation Pipeline + SSE Progress
   |                 |        |
   |                 |        +---> Phase 6: Multi-Provider (OpenAI + Anthropic)
   |                 |
   +---> Phase 3: Admin Module (Provider/Model Curation, Rate Limits)
   |        |
   |        +---> Phase 4 (admin curates provider list that Phase 4 uses)
   |
   +-------------------------------+---> Phase 7: Email (Resend) + Export (PDF/Markdown)

Summary of dependencies:

Phase 2 depends on Phase 1 (auth, DB, frontend scaffolding)
Phase 3 depends on Phase 1 (auth with admin role, DB)
Phase 4 depends on Phase 2 (scraper) + Phase 3 (admin-curated provider/model list)
Phase 5 depends on Phase 4 (LLM provider working)
Phase 6 depends on Phase 5 (pipeline working with one provider)
Phase 7 depends on Phase 1 only (can be started after Phase 1, but best done after Phase 5 so there is content to email/export)
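
The dependency list above can be encoded as data so that "which phase can start next?" is answered mechanically. The following is a minimal sketch; the function names `depends_on` and `startable` are illustrative and not part of the roadmap.

```rust
/// Phases that must be finished before `phase` can start,
/// transcribed from the dependency summary above.
pub fn depends_on(phase: u8) -> &'static [u8] {
    match phase {
        2 | 3 | 7 => &[1],
        4 => &[2, 3],
        5 => &[4],
        6 => &[5],
        _ => &[], // Phase 1 has no dependencies
    }
}

/// Returns the phases that are not yet completed and whose
/// dependencies are all completed.
pub fn startable(completed: &[u8]) -> Vec<u8> {
    (1..=7u8)
        .filter(|p| !completed.contains(p))
        .filter(|p| depends_on(*p).iter().all(|d| completed.contains(d)))
        .collect()
}
```

For example, once Phase 1 is done, `startable(&[1])` yields Phases 2, 3, and 7, which matches the three branches leaving Phase 1 in the graph.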

Risk-Ordered Priority

If time runs out, this is the order of criticality (most critical first):

Phase 1 -- Foundation: Without this, nothing works. This is the riskiest phase because it involves setting up Rust/Axum from scratch (learning curve), hand-rolling auth (magic links, sessions, captcha), and standing up the SolidJS frontend with routing and auth context. Everything depends on this.
Phase 5 -- Generation Pipeline + SSE: This is the core value proposition of the application. Without synthesis generation, the app is a settings manager.
Phase 4 -- LLM Provider Abstraction (Gemini): Prerequisite for Phase 5. Getting one provider working end-to-end with structured output and web search grounding proves the LLM integration works.
Phase 2 -- Sources CRUD + Scraper: Sources feed the generation pipeline. The scraper is used during generation to validate URLs.
Phase 3 -- Admin Module: Users cannot configure providers/models without this. However, for a single-user self-hosted scenario, environment variables or seed data could serve as a temporary workaround.
Phase 6 -- Multi-Provider: Adds OpenAI and Anthropic. High value but the app is fully functional with Gemini alone.
Phase 7 -- Email + Export: Nice-to-have features. The app works without them. Users can copy-paste or screenshot.

Phase 1: Foundation

Goal

Prove the entire stack works end-to-end: Rust serving a SolidJS SPA, Postgres connected, email + magic link authentication functioning, and one complete CRUD flow (user settings).

Deliverables

Backend (Rust/Axum):

Project scaffold: Cargo workspace, main.rs, config loading (dotenvy), tracing/logging setup
Postgres connection pool (sqlx) with compile-time checked queries
Database migrations: users, sessions, magic_link_tokens, settings tables
Unified error handling (AppError enum with IntoResponse)
Auth system:
- POST /api/v1/auth/register -- email + Cloudflare Turnstile captcha validation, sends magic link via Resend
- POST /api/v1/auth/login -- request magic link (same response whether email exists or not)
- GET /api/v1/auth/verify?token=... -- verify token, create session, set cookie, redirect to app
- POST /api/v1/auth/logout -- invalidate session, clear cookie
- GET /api/v1/auth/me -- return current user info
Session middleware: cookie extraction, session lookup by the SHA-256 hash of the cookie token (the raw token is never stored), expiration check (30-day), user injection into request extensions
CSRF protection: X-Requested-With header check on mutating requests + SameSite=Lax cookies
Rate limiting on auth endpoints (per-IP, per-email)
Settings CRUD: GET /api/v1/settings, PUT /api/v1/settings
CLI command: create-admin to bootstrap the first admin account
Static file serving: serve the SolidJS build output from the Axum binary
Security headers (CSP, X-Content-Type-Options, X-Frame-Options, HSTS, Referrer-Policy)
CORS configuration
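
The CSRF item in the list above relies on a custom header that a cross-site HTML form cannot set. A minimal, framework-independent sketch of that check might look like the following; the function name and the plain `(name, value)` header representation are assumptions for illustration, and a real Axum layer would read the headers from the request instead.

```rust
/// Returns true when the request passes the CSRF header check.
/// Safe methods (GET, HEAD, OPTIONS) are never blocked; mutating methods
/// must carry an `X-Requested-With` header with a non-empty value.
pub fn passes_csrf_check(method: &str, headers: &[(&str, &str)]) -> bool {
    let is_safe = matches!(
        method.to_ascii_uppercase().as_str(),
        "GET" | "HEAD" | "OPTIONS"
    );
    if is_safe {
        return true;
    }
    // Header names are case-insensitive, so compare without regard to case.
    headers.iter().any(|(name, value)| {
        name.eq_ignore_ascii_case("x-requested-with") && !value.trim().is_empty()
    })
}
```

This check complements the SameSite=Lax cookie rather than replacing it: the cookie attribute limits when the browser sends the session, and the header check rejects mutating requests that did not originate from the app's own fetch calls.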

Frontend (SolidJS):

Vite + SolidJS + TypeScript + Tailwind CSS project scaffold
Auth context with signals (session check via GET /api/v1/auth/me on load)
Route guard (redirect to /login if unauthenticated)
Login page: email field, Turnstile widget, "Recevoir un lien de connexion" button, "Creer un compte" link
Sign-up page: email + optional display name + Turnstile + "Creer mon compte" button
Magic link confirmation screen ("Verifiez votre boite de reception", resend with cooldown)
Navbar: logo, nav links (Syntheses, Sources), user email, Settings gear, Logout
Mobile hamburger menu
Active route indicator on nav links
Settings page: theme, max age days, categories (dynamic list with add/remove), max items per category, search agent behavior, AI model dropdown (hardcoded placeholder for now)
Error boundary (top-level ErrorBoundary component)
Session expiry handling (401 -> redirect to login with message)
i18n-ready structure: all user-facing strings in a central fr.ts locale file, accessed via a helper function

Infrastructure:

Dockerfile (multi-stage: Rust build + SolidJS build -> minimal runtime image)
docker-compose.yml with Postgres service + app service
.env.example with all required environment variables documented
Resend integration for sending magic link emails

CLI:

./ai-synth create-admin admin@example.com creates an admin user (no magic link needed, account is pre-verified)

Definition of Done

docker compose up starts the app with Postgres
A new user can sign up (receives magic link email via Resend), click the link, and land on the home page
The admin can be created via CLI
Authenticated user can view and update their settings
Unauthenticated requests return 401
Session persists across browser restarts (within 30-day window)
Logout invalidates the session server-side
Turnstile captcha prevents automated signups
All pages render correctly on desktop and mobile

Dependencies

None -- this is the first phase.

Risk Factors

Hand-rolled auth is the highest-risk component. Magic link token lifecycle (generation, SHA-256 hashing, single-use enforcement, expiration, email enumeration prevention) must be implemented correctly from the start. A subtle bug here creates a security vulnerability.
Rust learning curve. Async Rust with Axum, sqlx, and tower middleware is non-trivial for someone learning Rust. Expect the borrow checker, lifetime annotations, and trait bounds to slow things down significantly in this phase.
Email deliverability. Magic links only work if emails arrive. Resend handles SPF/DKIM/DMARC, but initial setup, domain verification, and inbox placement testing can take time.
Cloudflare Turnstile integration. Requires server-side verification of the captcha token. The API is simple, but handling failures (network issues, invalid tokens, expired tokens) needs careful error UX.
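
The token lifecycle rules named in the first risk (expiration and single-use enforcement) can be sketched with an in-memory stand-in for the magic_link_tokens table. Everything here is an assumption for illustration: the type names, the use of Unix seconds passed in by the caller, and the idea that the caller has already computed the SHA-256 hash of the raw token.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum VerifyError {
    Unknown,
    Expired,
    AlreadyUsed,
}

struct TokenRecord {
    email: String,
    expires_at: u64, // Unix seconds
    used: bool,
}

/// In-memory stand-in for the magic_link_tokens table, keyed by token hash.
#[derive(Default)]
pub struct TokenStore {
    by_hash: HashMap<String, TokenRecord>,
}

impl TokenStore {
    /// Stores a freshly issued token. `token_hash` is assumed to be the
    /// SHA-256 hash of the raw token, computed by the caller.
    pub fn issue(&mut self, token_hash: &str, email: &str, now: u64, ttl_secs: u64) {
        self.by_hash.insert(
            token_hash.to_string(),
            TokenRecord {
                email: email.to_string(),
                expires_at: now + ttl_secs,
                used: false,
            },
        );
    }

    /// Verifies a token and marks it used, so a second call with the
    /// same hash fails. Returns the email the token was issued for.
    pub fn consume(&mut self, token_hash: &str, now: u64) -> Result<String, VerifyError> {
        let record = self
            .by_hash
            .get_mut(token_hash)
            .ok_or(VerifyError::Unknown)?;
        if record.used {
            return Err(VerifyError::AlreadyUsed);
        }
        if now >= record.expires_at {
            return Err(VerifyError::Expired);
        }
        record.used = true;
        Ok(record.email.clone())
    }
}
```

In the Postgres-backed version, the check and the "mark as used" write would need to happen atomically (for example in a single UPDATE) so that two concurrent clicks cannot both succeed.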

Testing Scope

Unit tests: Config parsing, session token generation/hashing, magic link token lifecycle, CSRF header validation, settings validation (serde + validator), AppError response formatting
Integration tests: Full auth flow (register -> verify -> me -> logout), settings CRUD with auth, admin CLI command, rate limiting on auth endpoints (verify lockout after N failures), 401 on unauthenticated access
Frontend: Manual smoke testing of all screens and flows (automated E2E testing deferred to a later phase when there is more to test)

Milestones

M1.1 -- Rust skeleton compiles and serves "Hello World": Axum router, tracing, config loading, Postgres pool connected, migrations run. Docker build works.
M1.2 -- Auth flow works end-to-end: Register, magic link email sent (Resend), verify token, session cookie set, GET /me returns user, logout clears session. CLI create-admin works.
M1.3 -- SolidJS shell renders: Login page, navbar, settings page (static, no API calls yet). Tailwind styling matches the current app's visual language. Mobile hamburger menu works.
M1.4 -- Frontend + backend integrated: SolidJS auth context calls the API. Login/signup flow works through the UI. Settings page reads/writes via the API. Session expiry redirects to login.
M1.5 -- Tests and hardening: Unit and integration tests pass. Security headers configured. CSRF protection tested. Rate limiting on auth endpoints verified.

Phase 2: Sources CRUD + Scraper Service

Goal

Add the custom sources management feature (CRUD, bulk import, CSV import/export) and build the URL scraper service that will be used during synthesis generation.

Deliverables

Backend:

Database migration: sources table
Sources API:
- GET /api/v1/sources -- list user's sources
- POST /api/v1/sources -- add a single source (title + URL, validated)
- DELETE /api/v1/sources/:id -- delete a source (ownership check)
- POST /api/v1/sources/bulk -- bulk import (JSON array)
- POST /api/v1/sources/import-csv -- CSV import (multipart upload)
- GET /api/v1/sources/export-csv -- CSV export download
Input validation: URL format validation, title length limits, max sources per user
Scraper service (services/scraper.rs):
- reqwest HTTP client (shared from AppState, with timeouts: 5s connect, 15s response, 30s total)
- SSRF prevention: DNS resolution check against private IP ranges, protocol restriction (http/https only), redirect validation
- HTML parsing with scraper crate: soft-404 detection, publication date extraction (meta tags, JSON-LD, <time> elements), body text extraction (max 4000 chars, strip scripts/nav/footer)
- Custom User-Agent header
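
The private-range check behind the SSRF item can be sketched with the standard library alone. This is a hedged starting point rather than a complete blocklist: the function names are illustrative, and the exact set of ranges to reject (carrier-grade NAT and multicast are included here) is an assumption.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    let octets = ip.octets();
    ip.is_private()
        || ip.is_loopback()
        || ip.is_link_local() // 169.254.0.0/16, includes cloud metadata endpoints
        || ip.is_unspecified()
        || ip.is_broadcast()
        || ip.is_multicast()
        // 100.64.0.0/10 (carrier-grade NAT) is not covered by is_private()
        || (octets[0] == 100 && (octets[1] & 0xc0) == 64)
}

fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // ::ffff:a.b.c.d must be judged by its embedded IPv4 address.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || ip.is_multicast()
        || (first & 0xfe00) == 0xfc00 // fc00::/7 unique local
        || (first & 0xffc0) == 0xfe80 // fe80::/10 link-local
}

/// Returns true when the scraper should refuse to connect to `ip`.
pub fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}

/// Protocol restriction: only http and https URLs are accepted.
pub fn is_allowed_scheme(url: &str) -> bool {
    let lower = url.to_ascii_lowercase();
    lower.starts_with("http://") || lower.starts_with("https://")
}
```

The check is only meaningful when it is applied to the address the HTTP client actually connects to, and again for each redirect target.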

Frontend:

Sources page:
- List view with title and URL for each source
- Add form: title + URL fields with validation feedback
- Delete with standardized confirmation dialog (same pattern as settings)
- Bulk import via textarea
- CSV import (file picker) and export (download button)
Empty state with onboarding hint

Definition of Done

User can add, view, and delete custom sources
Bulk import (JSON and CSV) works correctly
CSV export downloads a valid file
URL validation rejects malformed URLs
Ownership isolation: user A cannot see or delete user B's sources
Scraper service can fetch a URL, parse HTML, detect soft-404s, extract publication dates, and extract body text
SSRF protection rejects requests to private/internal IPs

Dependencies

Phase 1 (auth, DB pool, frontend scaffolding, Docker setup)

Risk Factors

Scraper robustness. Real-world HTML is messy. Publication date extraction from meta tags, JSON-LD, and <time> elements covers many sites but not all. Expect edge cases.
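SSRF prevention correctness. DNS rebinding attacks can bypass naive IP checks: a hostname can resolve to a public IP when it is checked and to a private IP when the client connects. The implementation must resolve DNS once, check the resolved IP, connect to that same IP, and repeat the check on every redirect.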
CSV parsing. Malformed CSV files, encoding issues (UTF-8 BOM, Windows line endings), and large files can cause problems.

Testing Scope

Unit tests: URL validation, SSRF IP range checks, HTML parsing (soft-404 detection, date extraction, body text extraction -- use fixture HTML files), CSV parsing/generation
Integration tests: Full CRUD lifecycle for sources (create, list, delete), bulk import, CSV import/export, ownership isolation (user A cannot access user B's sources), SSRF rejection for private IPs

Milestones

M2.1 -- Sources CRUD API complete: All endpoints working with auth. Integration tests pass.
M2.2 -- Scraper service complete: Fetches, parses, validates URLs. SSRF protection in place. Unit tests with fixture HTML pass.
M2.3 -- Frontend sources page complete: All interactions working, CSV import/export, empty state.

Phase 3: Admin Module

Goal

Build the admin interface for curating LLM providers/models and configuring rate limits. This phase establishes the provider/model catalog that users will select from in their settings.

Deliverables

Backend:

Database migrations: llm_providers table (provider name, display name, models JSON array, is_enabled, created_at, updated_at), rate_limits table (per-provider limits)
- Note: In the decisions doc, users bring their own API keys. The llm_providers table here stores the admin-curated list of available providers and models, NOT API keys. User API keys are stored separately (see Phase 4).
Admin API (all require admin role):
- GET /api/v1/admin/providers -- list all provider configs
- POST /api/v1/admin/providers -- add/update a provider config (provider name, display name, list of enabled models)
- DELETE /api/v1/admin/providers/:id -- remove a provider
- GET /api/v1/admin/rate-limits -- get rate limit configs
- PUT /api/v1/admin/rate-limits/:provider_id -- update rate limit config
- GET /api/v1/admin/users -- list all users
- PUT /api/v1/admin/users/:id/role -- change user role
Public endpoint (authenticated, non-admin):
- GET /api/v1/config/providers -- list enabled providers and their model names (no sensitive data)
RequireAdmin middleware layer (checks user.role == "admin", returns 403 otherwise)
Rate limiter service: in-memory token-bucket per provider (using DashMap), configurable from admin, hot-reload on config change
Audit logging table and writes for admin actions
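
The token-bucket logic behind the rate limiter service can be sketched for a single provider as follows. This is a simplified illustration under stated assumptions: time is passed in as seconds from any monotonic clock (which keeps the logic testable), the burst capacity equals the per-minute limit, and the per-provider DashMap and locking are left out.

```rust
/// Token bucket: holds up to `capacity` tokens, refilled continuously
/// at `refill_per_sec`. Each request consumes one token.
pub struct TokenBucket {
    capacity: f64,
    refill_per_sec: f64,
    tokens: f64,
    last_refill: f64, // seconds, from any monotonic clock
}

impl TokenBucket {
    /// Creates a full bucket allowing `requests_per_minute` on average.
    pub fn per_minute(requests_per_minute: u32, now: f64) -> Self {
        let capacity = requests_per_minute as f64;
        Self {
            capacity,
            refill_per_sec: capacity / 60.0,
            tokens: capacity,
            last_refill: now,
        }
    }

    fn refill(&mut self, now: f64) {
        let elapsed = (now - self.last_refill).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
    }

    /// Consumes one token if available; returns false when the limit is hit.
    pub fn try_acquire(&mut self, now: f64) -> bool {
        self.refill(now);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }

    /// Hot reload: applies a new limit while keeping the tokens already
    /// accumulated (clamped to the new capacity).
    pub fn reconfigure(&mut self, requests_per_minute: u32, now: f64) {
        self.refill(now);
        self.capacity = requests_per_minute as f64;
        self.refill_per_sec = self.capacity / 60.0;
        self.tokens = self.tokens.min(self.capacity);
    }
}
```

Keeping the accumulated tokens across `reconfigure` means an admin lowering a limit takes effect immediately, while raising it does not grant an instant burst.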

Frontend:

Admin layout at /admin (separate route prefix, sidebar navigation)
Admin nav: rendered only for admin users (not present in the DOM for non-admins)
Provider configuration page (/admin/providers):
- Card/tab per provider (Gemini, OpenAI, Anthropic)
- Enable/disable toggle per provider
- Model list management (checkboxes to enable/disable specific models)
- Default model selection (dropdown)
- Status indicators (configured/not configured)
Rate limit configuration page (/admin/rate-limits):
- Per-provider rate limit fields (requests per minute)
- Global limits
- Save button
User management page (/admin/users):
- User list with email, role, creation date
- Role change (promote/demote admin)

Settings page update:

Replace the hardcoded AI model dropdown with a dynamic two-level selection:
- Provider dropdown (populated from GET /api/v1/config/providers)
- Model dropdown (populated based on selected provider)
If only one provider is configured, hide the provider dropdown

Definition of Done

Admin can add, configure, enable/disable providers and models
Admin can configure per-provider rate limits
Admin can view user list and change roles
Non-admin users cannot access admin pages (403 from API, routes hidden in UI)
GET /api/v1/config/providers returns the list of enabled providers and models
Settings page dynamically populates provider/model dropdowns from the admin config
Rate limiter enforces configured limits
Audit log records all admin actions

Dependencies

Phase 1 (auth with admin role, DB, frontend scaffolding)

Risk Factors

Rate limiter complexity. In-memory state with hot-reload from DB requires careful concurrency handling (DashMap + atomic operations). Edge cases around config reload while requests are in flight.
Admin UX complexity. The provider configuration page has many interacting elements (enable/disable, model list, default model). Getting the UX right takes iteration.
Role-based access control. Must be watertight -- every admin endpoint must be protected both in the frontend (route guard) and backend (middleware). A missed check is a privilege escalation vulnerability.

Testing Scope

Unit tests: Rate limiter (token bucket logic, config reload), admin role check middleware
Integration tests: Admin CRUD for providers, rate limits. Non-admin access rejection (403). Role change. Audit log entries created. Public config endpoint returns correct data. Settings page provider/model population.

Milestones

M3.1 -- Admin API complete: All admin endpoints working with role protection. Integration tests pass.
M3.2 -- Rate limiter service: In-memory rate limiter with DB-backed config. Hot-reload tested.
M3.3 -- Admin frontend complete: All admin pages functional. Non-admin users see no admin UI.
M3.4 -- Settings page updated: Dynamic provider/model selection working.

Phase 4: LLM Provider Abstraction (Gemini First)

Goal

Implement the LLM provider trait and the first concrete implementation (Google Gemini), including user API key management. Prove that the abstraction works for structured output and web search grounding.

Deliverables

Backend:

Database migration: user_api_keys table (user_id, provider, encrypted_key using AES-256-GCM, nonce, key_prefix for display, created_at, updated_at)
API key encryption service:
- Master key from MASTER_KEY_SECRET environment variable
- AES-256-GCM encryption/decryption using aes-gcm crate
- Per-key unique nonce via OsRng
- Keys decrypted in memory only when making LLM calls, dropped immediately after
- secrecy + zeroize crates for sensitive value handling
User API key endpoints:
- GET /api/v1/user/api-keys -- list user's keys (provider + key_prefix only, never the full key)
- POST /api/v1/user/api-keys -- add/update an API key for a provider
- DELETE /api/v1/user/api-keys/:provider -- remove a key
- POST /api/v1/user/api-keys/:provider/test -- test the key with a minimal LLM call
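
Two of the rules above (only a key prefix is ever displayed, and sensitive values need careful handling) can be illustrated with a small wrapper type. This is a std-only sketch of the idea that the secrecy crate provides in a more complete form; the type name `ApiKey` and its methods are hypothetical.

```rust
use std::fmt;

/// Wrapper around a decrypted API key that cannot be printed by accident.
pub struct ApiKey(String);

impl ApiKey {
    pub fn new(raw: impl Into<String>) -> Self {
        Self(raw.into())
    }

    /// Explicit access to the secret, used only when calling the provider.
    pub fn expose(&self) -> &str {
        &self.0
    }

    /// Short non-secret prefix, suitable for the key_prefix column.
    pub fn prefix(&self, visible: usize) -> String {
        self.0.chars().take(visible).collect()
    }
}

// Debug output is redacted so that logging a struct containing an ApiKey
// does not leak the key.
impl fmt::Debug for ApiKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("ApiKey([REDACTED])")
    }
}
```

Unlike the zeroize crate, this sketch does not wipe the memory when the value is dropped; it only addresses accidental exposure through formatting.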

LlmProvider trait:

provider_id() -> &str
generate_search_pass(model, system_prompt, user_prompt, response_schema) -> Result<Value>
generate_rewrite_pass(model, system_prompt, user_prompt, response_schema) -> Result<Value>
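
Written out as Rust, the trait above might look like the sketch below, together with the kind of mock implementation that milestone M4.2 mentions. Several details are assumptions: the real trait would most likely be async and return a parsed JSON value such as serde_json::Value, whereas this std-only version is synchronous and uses a `String` stand-in; the `LlmError` variants and `MockProvider` fields are illustrative.

```rust
/// Errors a provider call can return (variants are illustrative).
#[derive(Debug, PartialEq)]
pub enum LlmError {
    InvalidApiKey,
    Provider(String),
}

/// Stand-in for a parsed JSON value (for example serde_json::Value).
pub type Json = String;

pub trait LlmProvider {
    fn provider_id(&self) -> &str;

    /// Pass 1: grounded search, returning structured results.
    fn generate_search_pass(
        &self,
        model: &str,
        system_prompt: &str,
        user_prompt: &str,
        response_schema: &str,
    ) -> Result<Json, LlmError>;

    /// Pass 2: rewrite without tools, returning structured output.
    fn generate_rewrite_pass(
        &self,
        model: &str,
        system_prompt: &str,
        user_prompt: &str,
        response_schema: &str,
    ) -> Result<Json, LlmError>;
}

/// Mock implementation returning canned output, for pipeline tests.
pub struct MockProvider {
    pub search_output: Json,
    pub rewrite_output: Json,
}

impl LlmProvider for MockProvider {
    fn provider_id(&self) -> &str {
        "mock"
    }

    fn generate_search_pass(
        &self,
        _model: &str,
        _system_prompt: &str,
        _user_prompt: &str,
        _response_schema: &str,
    ) -> Result<Json, LlmError> {
        Ok(self.search_output.clone())
    }

    fn generate_rewrite_pass(
        &self,
        _model: &str,
        _system_prompt: &str,
        user_prompt: &str,
        _response_schema: &str,
    ) -> Result<Json, LlmError> {
        // The mock rejects an empty prompt so error paths can be exercised.
        if user_prompt.is_empty() {
            return Err(LlmError::Provider("empty prompt".to_string()));
        }
        Ok(self.rewrite_output.clone())
    }
}
```

Because the trait is object-safe, the pipeline can hold a `&dyn LlmProvider` (or a boxed one) and stay unaware of which provider is behind it.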

GeminiProvider implementation:
- generateContent API call with googleSearch tool for Pass 1
- Structured output via responseSchema + responseMimeType: "application/json"
- Standard generation (no tools) for Pass 2
- Dynamic category schema construction from user settings
Provider factory function: creates the correct provider implementation from config + user's decrypted API key
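
Dynamic schema construction from user categories could be sketched as below. The schema shape is an assumption made for illustration (one array property per category, each item carrying title, url, and summary), written in generic JSON Schema style; adapting it to the exact schema dialect Gemini accepts is not shown. The escaping helper is there because category names are user input and may contain quotes or control characters.

```rust
/// Encodes `s` as a JSON string literal, including the surrounding quotes.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Builds a JSON schema with one required array property per category.
/// Blank category names are skipped.
pub fn build_schema(categories: &[&str]) -> String {
    // Hypothetical item shape: every news item has a title, url and summary.
    let item = r#"{"type":"object","properties":{"title":{"type":"string"},"url":{"type":"string"},"summary":{"type":"string"}},"required":["title","url","summary"]}"#;
    let names: Vec<String> = categories
        .iter()
        .map(|c| c.trim())
        .filter(|c| !c.is_empty())
        .map(json_string)
        .collect();
    let properties: Vec<String> = names
        .iter()
        .map(|n| format!(r#"{}:{{"type":"array","items":{}}}"#, n, item))
        .collect();
    format!(
        r#"{{"type":"object","properties":{{{}}},"required":[{}]}}"#,
        properties.join(","),
        names.join(",")
    )
}
```

Using category names directly as property keys is the simplest option; an alternative that avoids special characters in keys is a fixed `sections` array whose items carry the category name as a string value.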

Frontend:

User API key management in Settings page:
- Per-provider section showing key status (configured/not configured, key prefix)
- Add/update key input (masked, with show/hide toggle)
- Test button per provider (calls test endpoint, shows success/failure)
- Delete key button
Warning display when a provider does not support web search grounding

Definition of Done

User can add, test, and remove their Gemini API key
API keys are encrypted at rest (AES-256-GCM) and never returned in full via the API
The LlmProvider trait is defined and GeminiProvider passes a manual test:
- Pass 1: structured search results with googleSearch grounding
- Pass 2: rewrite with structured output
Test endpoint validates the key works
Provider factory correctly creates a GeminiProvider from config + user key

Dependencies

Phase 2 (scraper service -- used in the pipeline validation)
Phase 3 (admin-curated provider/model list -- the factory reads from this)

Risk Factors

Gemini API specifics. The responseSchema for structured JSON output + googleSearch tool configuration via the REST API (not a Rust SDK) requires careful request construction. Gemini's API versions and response formats can change.
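Encryption correctness. AES-256-GCM with per-key nonces must be implemented correctly. Reusing a nonce under the same master key breaks GCM's confidentiality and authenticity guarantees. Generating a fresh random 96-bit nonce from OsRng for every encryption (including key updates) makes a collision negligibly unlikely at this scale.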
Structured output parsing. The dynamic schema (generated from user categories) must produce valid JSON Schema that Gemini accepts. Edge cases in category names (special characters, very long names) can break schema generation.
API key security. The full lifecycle (transmit over HTTPS, encrypt at rest, decrypt in memory, drop after use) has multiple points where a mistake could leak keys (logging, error messages, debug output).

Testing Scope

Unit tests: AES-256-GCM encryption round-trip, key prefix extraction, dynamic schema generation from categories, provider factory (mocked), Gemini request/response serialization
Integration tests: User API key CRUD (verify encryption at rest, verify key is never returned in full), test endpoint (with a mock HTTP server standing in for Gemini), provider trait contract tests (mock implementation)
Manual test: End-to-end Gemini call with a real API key (not in CI, developer-run)

Milestones

M4.1 -- User API key management: CRUD endpoints working, encryption at rest verified, frontend key management in Settings.
M4.2 -- LlmProvider trait defined: Trait, types, factory function. Mock implementation for testing.
M4.3 -- GeminiProvider working: Both passes (search + rewrite) produce valid structured output. Manual test with real API key succeeds.
M4.4 -- Dynamic schema generation: Category-based schema construction tested with various category configurations.

Phase 5: Generation Pipeline + SSE Progress

Goal

Wire everything together into the full synthesis generation pipeline: user triggers generation, backend runs the two-pass pipeline (search -> scrape/validate -> rewrite), sends real-time progress via SSE, and saves the result. The user can view the synthesis.

Deliverables

Backend:

Database migration: syntheses table (user_id, week, sections JSON, status, created_at). Generation jobs are ephemeral: either a generation_jobs table or the in-memory DashMap described under job state management below.
Generation pipeline orchestration (services/synthesis.rs):
1. Load user settings and sources
2. Resolve provider + model (from user's settings + admin config + user's API key)
3. Build dynamic schema from categories
4. Rate limit check (acquire slot)
5. Pass 1: Search (via LlmProvider::generate_search_pass)
6. Validate and scrape URLs (via scraper service, with SSRF protection)
7. Rate limit check (acquire slot for Pass 2)
8. Pass 2: Rewrite (via LlmProvider::generate_rewrite_pass with scraped content)
9. Parse and validate structured output
10. Save synthesis to database
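
The control flow of steps 4 to 8 can be sketched with the external calls injected as closures, which is also how the pipeline could be tested with mocks. The names (`run_pipeline`, `Candidate`, `PipelineError`) and the choice to fail only when no URL at all could be scraped are assumptions; settings loading, schema building, and persistence are omitted.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub title: String,
    pub url: String,
}

#[derive(Debug, PartialEq)]
pub enum PipelineError {
    RateLimited,
    Llm(String),
    NoUsableSources,
}

/// Runs search -> scrape -> rewrite. A failed scrape drops that one
/// candidate instead of aborting the whole generation.
pub fn run_pipeline<A, P1, S, P2>(
    mut acquire_slot: A,
    search_pass: P1,
    scrape: S,
    rewrite_pass: P2,
) -> Result<String, PipelineError>
where
    A: FnMut() -> bool,
    P1: Fn() -> Result<Vec<Candidate>, String>,
    S: Fn(&str) -> Result<String, String>,
    P2: Fn(&[(Candidate, String)]) -> Result<String, String>,
{
    // Step 4: rate limit check before Pass 1.
    if !acquire_slot() {
        return Err(PipelineError::RateLimited);
    }
    // Step 5: Pass 1 (search).
    let candidates = search_pass().map_err(PipelineError::Llm)?;

    // Step 6: scrape each URL, keeping only the ones that succeed.
    let scraped: Vec<(Candidate, String)> = candidates
        .into_iter()
        .filter_map(|c| scrape(&c.url).ok().map(|body| (c, body)))
        .collect();
    if scraped.is_empty() {
        return Err(PipelineError::NoUsableSources);
    }

    // Step 7: rate limit check before Pass 2.
    if !acquire_slot() {
        return Err(PipelineError::RateLimited);
    }
    // Step 8: Pass 2 (rewrite) with the scraped content.
    rewrite_pass(&scraped).map_err(PipelineError::Llm)
}
```

In the real service each of these closures would be an async call, and progress events would be emitted between the steps.
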
Async generation API:
- POST /api/v1/syntheses/generate -- triggers generation, returns immediately with job_id (202 Accepted)
- GET /api/v1/syntheses/generate/:job_id/progress -- SSE endpoint streaming progress events
SSE progress events:
- { step: "search", message: "Recherche d'actualites en cours...", percent: 10 }
- { step: "scraping", message: "Verification des sources (3/12)...", percent: 40 }
- { step: "rewrite", message: "Redaction des resumes...", percent: 75 }
- { step: "saving", message: "Sauvegarde...", percent: 95 }
- complete event with synthesis_id
- error event with message
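
On the wire, each of these events becomes a Server-Sent Events frame: an `event:` line, a `data:` line, and a blank line. The sketch below serializes the events listed above by hand; the event name `progress` for the step updates and the type name `ProgressEvent` are assumptions, and a real implementation would more likely use Axum's SSE support with serde for the JSON payload.

```rust
/// Escapes a string for embedding inside a JSON string literal.
/// Newlines must be escaped, since a raw newline would end the SSE data line.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

pub enum ProgressEvent {
    Progress { step: String, message: String, percent: u8 },
    Complete { synthesis_id: String },
    Error { message: String },
}

impl ProgressEvent {
    /// Serializes the event as one SSE frame.
    pub fn to_sse_frame(&self) -> String {
        let (event, data) = match self {
            ProgressEvent::Progress { step, message, percent } => (
                "progress",
                format!(
                    "{{\"step\":\"{}\",\"message\":\"{}\",\"percent\":{}}}",
                    escape_json(step),
                    escape_json(message),
                    (*percent).min(100) // clamp to a valid percentage
                ),
            ),
            ProgressEvent::Complete { synthesis_id } => (
                "complete",
                format!("{{\"synthesis_id\":\"{}\"}}", escape_json(synthesis_id)),
            ),
            ProgressEvent::Error { message } => (
                "error",
                format!("{{\"message\":\"{}\"}}", escape_json(message)),
            ),
        };
        format!("event: {}\ndata: {}\n\n", event, data)
    }
}
```

With named events like these, the browser's EventSource delivers them to listeners registered with addEventListener for that event name rather than to onmessage.
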
Syntheses API:
- GET /api/v1/syntheses -- list user's syntheses (paginated, sorted by created_at desc)
- GET /api/v1/syntheses/:id -- get synthesis detail
- DELETE /api/v1/syntheses/:id -- delete a synthesis (ownership check, confirmation handled by frontend)
Job state management: in-memory DashMap<String, JobStatus> with TTL cleanup (jobs expire after 1 hour)
Prompt construction: system prompt and user prompt templates built from user settings (theme, categories, max age, search agent behavior, custom sources)
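
Job state management with TTL cleanup can be sketched with a plain HashMap standing in for the DashMap. Two points are assumptions rather than decisions from this roadmap: a second generation for the same user is rejected while one is running (one of the two options left open under Risk Factors), and the TTL is measured from job creation. Timestamps are Unix seconds supplied by the caller.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum JobStatus {
    Running { percent: u8 },
    Complete { synthesis_id: String },
    Failed { message: String },
}

struct Job {
    user_id: String,
    status: JobStatus,
    created_at: u64, // Unix seconds
}

/// Jobs expire one hour after creation.
pub const JOB_TTL_SECS: u64 = 3600;

#[derive(Default)]
pub struct JobRegistry {
    jobs: HashMap<String, Job>,
}

impl JobRegistry {
    /// Registers a new job. Returns false if the user already has a
    /// running job (the "reject" option for concurrent generation).
    pub fn start(&mut self, job_id: &str, user_id: &str, now: u64) -> bool {
        let busy = self.jobs.values().any(|j| {
            j.user_id == user_id && matches!(j.status, JobStatus::Running { .. })
        });
        if busy {
            return false;
        }
        self.jobs.insert(
            job_id.to_string(),
            Job {
                user_id: user_id.to_string(),
                status: JobStatus::Running { percent: 0 },
                created_at: now,
            },
        );
        true
    }

    pub fn set_status(&mut self, job_id: &str, status: JobStatus) {
        if let Some(job) = self.jobs.get_mut(job_id) {
            job.status = status;
        }
    }

    pub fn status(&self, job_id: &str) -> Option<&JobStatus> {
        self.jobs.get(job_id).map(|j| &j.status)
    }

    /// Removes jobs older than JOB_TTL_SECS; meant to run periodically.
    pub fn purge_expired(&mut self, now: u64) {
        self.jobs
            .retain(|_, j| now.saturating_sub(j.created_at) < JOB_TTL_SECS);
    }
}
```

Because the state is in memory only, a restart loses running jobs; the frontend's error path with retry covers that case.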

Frontend:

Home page (Dashboard):
- Grid of synthesis cards (responsive: 1/2/3 columns)
- Each card: week badge, creation date, preview of first section items (line-clamped)
- Footer: "Lire la synthese" link, delete button with confirmation dialog
- Empty state with onboarding hint
- Banner when a generation is in progress ("Une generation est en cours...")
Generate page:
- Confirmation text showing theme, age window, provider, model
- "Lancer la generation" button
- Progress bar with step descriptions (SSE-driven)
- Step checklist (done/in-progress/pending)
- "Vous pouvez quitter cette page" note
- Error display with retry option
- Auto-redirect to synthesis detail on completion
Synthesis detail page:
- Section-by-section display: section title, then cards for each news item (title as external link, summary paragraph)
- Back navigation
- Delete button with confirmation dialog
SSE client: EventSource connection management, reconnection on disconnect, state synchronization if user navigates away and returns

Definition of Done

User clicks "Lancer la generation" and sees real-time progress via SSE
Generation runs asynchronously -- user can navigate away and return
Home page shows an in-progress banner during generation
On completion, the synthesis is saved and viewable
Synthesis detail shows all sections with items, titles as links, and summaries
User can delete syntheses
Ownership isolation: user A cannot view or delete user B's syntheses
Generation failures display an error message with context
Rate limiting prevents excessive generation requests

Dependencies

Phase 4 (LLM provider working with Gemini)

Risk Factors

Pipeline reliability. The two-pass pipeline with scraping in between is complex. Failures at any stage (LLM timeout, scraping failure, invalid structured output) must be handled gracefully. Partial results (some URLs fail to scrape) should not abort the entire generation.
SSE connection management. SSE connections can be dropped by reverse proxies, load balancers, or browser timeouts. The frontend must handle reconnection and state recovery. The backend must not leak resources (orphaned SSE connections, zombie tokio tasks).
Structured output parsing. LLMs occasionally produce malformed JSON even with schema constraints. The pipeline must handle parsing failures gracefully (retry once, or fall back to best-effort extraction).
Generation duration. End-to-end generation (2 LLM calls + N URL scrapes) can take 30-90+ seconds. The async model handles this, but progress reporting must be accurate (not fake percentages).
Concurrent generation. What happens if a user triggers a second generation while one is running? Decision needed: reject with "already in progress" or queue.

Testing Scope

Unit tests: Prompt construction (from settings + sources), structured output parsing (valid and malformed JSON), job status management, SSE event serialization
Integration tests: Full generation pipeline with mocked LLM provider (returns canned structured output) and mocked scraper (returns canned HTML). Verify: correct DB state after generation, SSE events sequence, error handling (LLM failure, scraper failure). Syntheses CRUD with ownership isolation.
E2E test: Manual test with real Gemini API key. Full flow: configure settings, add sources, generate, view result.

Milestones

M5.1 -- Syntheses CRUD: List, get, delete endpoints and frontend pages. Works with manually inserted test data.
M5.2 -- Pipeline orchestration: Full two-pass pipeline runs synchronously (no SSE yet) with mocked LLM. Saves result to DB.
M5.3 -- SSE progress: Async generation with SSE streaming. Frontend displays progress bar and step checklist.
M5.4 -- Home page integration: In-progress banner, auto-refresh on completion, empty state.
M5.5 -- End-to-end with real LLM: Manual test with Gemini. Prompt tuning. Error handling hardened.

Phase 6: Multi-Provider (OpenAI + Anthropic)

Goal

Add OpenAI and Anthropic as LLM providers, implementing the LlmProvider trait for each with their respective web search and structured output capabilities. The generation pipeline adapts per provider.

Deliverables

Backend:

OpenAiProvider implementation:
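- Pass 1: Uses OpenAI Responses API with web_search tool for grounded search results. Structured output via the text.format parameter ({ type: "json_schema", ... }); response_format: { type: "json_schema", json_schema: ... } is the equivalent parameter in the Chat Completions API.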
- Pass 2: Standard chat completion with structured JSON output.
- Model mapping: validate user-selected model against admin-enabled models.
AnthropicProvider implementation:
- Pass 1: Uses Claude's web_search tool for grounded results. Structured output via tool-use pattern (define a tool whose input schema matches the desired output, instruct Claude to call it).
- Pass 2: Standard message with JSON output instructions. Server-side parsing and validation (Anthropic does not have native JSON schema enforcement as robust as Gemini/OpenAI).
Pipeline adaptation per provider:
- Decision logic: if native web search grounding produces high-quality results (detected by checking citation count, URL validity), skip the scrape/rewrite pass.
- If not, fall back to the full two-pass pipeline.
- Provider-specific prompt adjustments (each provider responds differently to the same prompt structure).
Error handling per provider: different error codes, rate limit headers, and retry semantics for each provider's API.
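
The decision logic for skipping the scrape/rewrite pass could start as a simple ratio check over the Pass 1 results. This is a hypothetical heuristic: the struct, the two signals (share of items with at least one citation, share of items whose URL validated), and the single threshold parameter are all assumptions that would need tuning against real provider output.

```rust
/// Summary of what native web search grounding returned in Pass 1.
pub struct GroundingStats {
    pub items: usize,       // news items returned
    pub cited_items: usize, // items with at least one citation
    pub valid_urls: usize,  // items whose URL passed validation
}

/// Returns true when grounding looks good enough to skip the
/// scrape/rewrite pass. Both ratios must reach `min_ratio`.
pub fn can_skip_scrape_pass(stats: &GroundingStats, min_ratio: f64) -> bool {
    if stats.items == 0 {
        return false; // nothing to judge: run the full pipeline
    }
    let total = stats.items as f64;
    let cited = stats.cited_items as f64 / total;
    let valid = stats.valid_urls as f64 / total;
    cited >= min_ratio && valid >= min_ratio
}
```

A higher `min_ratio` makes the pipeline more conservative (slower, more scraping); a lower one skips more often at the cost of summary quality, which is the trade-off named under Risk Factors.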

Frontend:

Settings page: provider dropdown now populated with all admin-enabled providers (Gemini, OpenAI, Anthropic)
Generate page: warning when selected provider has limited web search capabilities
Provider-specific info text in settings ("La recherche web en temps reel est disponible avec ce fournisseur" vs "Les resultats seront bases sur les connaissances du modele")

Definition of Done

User can select OpenAI or Anthropic as their provider, add their API key, and generate a synthesis
Structured output is correctly parsed for all three providers
Web search grounding works for all three providers (using their respective tools)
Pipeline adapts per provider (skip scrape pass when native grounding is sufficient)
Error handling is provider-specific (correct error messages for quota exceeded, invalid key, model not available, etc.)
All existing Gemini functionality continues to work unchanged

Dependencies

Phase 5 (generation pipeline working end-to-end with Gemini)

Risk Factors

Provider API differences are deeper than they appear. Each provider's web search tool returns results in different formats with different citation structures. The abstraction must handle this without becoming a leaky mess.
Anthropic structured output. Claude does not have Gemini/OpenAI's level of JSON schema enforcement. The backend must handle parsing failures and potentially retry with clearer instructions.
Testing across providers. Each provider requires a real API key for meaningful testing. Mocks can only go so far -- real provider behavior (latency, rate limits, output quality) varies.
Pipeline adaptation heuristic. Deciding when to skip the scrape/rewrite pass is a quality judgment. Too aggressive skipping produces lower-quality summaries. Too conservative means the pipeline is always slow.

Testing Scope

Unit tests: OpenAI and Anthropic request/response serialization, provider-specific error mapping, pipeline adaptation logic
Integration tests: Full pipeline with mocked OpenAI and Anthropic HTTP responses. Verify structured output parsing, error handling, pipeline adaptation.
Manual tests: End-to-end generation with real OpenAI and Anthropic API keys. Quality comparison across providers.

Milestones

M6.1 -- OpenAiProvider working: Both passes produce valid structured output. Manual test with real key.
M6.2 -- AnthropicProvider working: Both passes produce valid structured output. Structured output parsing handles edge cases. Manual test with real key.
M6.3 -- Pipeline adaptation: Provider-specific behavior (skip scrape when appropriate) implemented and tested.
M6.4 -- Frontend updated: Provider selection, warnings, and info text. All three providers selectable.

Phase 7: Email (Resend) + Export (PDF/Markdown)

Goal

Add the ability to send a synthesis by email (via Resend) and export it as PDF or Markdown.

Deliverables

Backend:

Email sending service (services/email.rs -- extends the existing magic link email service):
- POST /api/v1/syntheses/:id/send-email -- send synthesis to a specified email address
- HTML email template: renders the synthesis (sections, items, links) as a formatted email
- Plain-text fallback
- Sender address configured via environment variable
- Default recipient: the user's own email
Export service:
- GET /api/v1/syntheses/:id/export/markdown -- returns the synthesis as a Markdown file download
- GET /api/v1/syntheses/:id/export/pdf -- returns the synthesis as a PDF file download
- Markdown generation: convert sections/items to Markdown format (headers, bullet points, links)
- PDF generation: use a Rust PDF library (e.g., printpdf or genpdf, or convert Markdown to PDF via pulldown-cmark + a PDF renderer)
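As a rough illustration of the Markdown generation step, the sketch below turns sections and items into headers, bullet points, and links using only the standard library. The struct and field names (`Synthesis`, `Section`, `Item`, `title`, `summary`, `url`) are assumptions made for the example, not the project's actual data model.

```rust
// Hypothetical data model; the real synthesis types may differ.
pub struct Item {
    pub title: String,
    pub summary: String,
    pub url: String,
}

pub struct Section {
    pub title: String,
    pub items: Vec<Item>,
}

pub struct Synthesis {
    pub title: String,
    pub sections: Vec<Section>,
}

/// Renders a synthesis as Markdown: one H1 for the synthesis title,
/// one H2 per section, and one linked bullet per item.
pub fn to_markdown(s: &Synthesis) -> String {
    let mut out = String::new();
    out.push_str(&format!("# {}\n", s.title));
    for section in &s.sections {
        out.push_str(&format!("\n## {}\n\n", section.title));
        for item in &section.items {
            out.push_str(&format!(
                "- [{}]({}): {}\n",
                item.title, item.url, item.summary
            ));
        }
    }
    out
}
```

This sketch does not escape Markdown metacharacters in titles or summaries; a real implementation would likely need to, since the text originates from scraped pages and LLM output.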

Frontend:

Synthesis detail page additions:
- "Envoyer par email" ("Send by email") button with email input (pre-filled with user's email)
- "S'envoyer a soi-meme" ("Send to myself") quick button
- Export dropdown: "Exporter en Markdown" ("Export as Markdown") and "Exporter en PDF" ("Export as PDF")
- Loading states and success/error feedback for email and export actions

Definition of Done

User can send a synthesis by email to any address (via Resend)
Email is well-formatted HTML with plain-text fallback
User can export a synthesis as Markdown (downloads a .md file)
User can export a synthesis as PDF (downloads a .pdf file)
Default email recipient is the user's own email
Ownership check: user can only email/export their own syntheses

Dependencies

Phase 1 (auth, plus the Resend integration that already exists for magic links). Best done after Phase 5, so that there is actual content to email and export.

Risk Factors

PDF generation in Rust. The Rust PDF ecosystem is less mature than in other languages. genpdf is the most ergonomic option but has limited styling control. printpdf is low-level. Consider generating HTML and using a headless browser or wkhtmltopdf as a last resort (adds a Docker dependency).
Email formatting. HTML emails are notoriously difficult to render consistently across email clients. Keep the template simple (tables-based layout, inline CSS, no external resources).
Resend rate limits. The free tier caps sending volume, and bulk sending (e.g., a user sending to a mailing list) could exhaust it. Rate-limit email sends per user.
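One way to rate-limit email sends per user is a fixed-window counter, sketched below. This is a minimal in-memory illustration under assumed names (`EmailRateLimiter`, `try_acquire`, a numeric `user_id`); the limits are placeholders, and a real deployment might instead reuse the Phase 3 rate limiter or keep the counters in Postgres.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Fixed-window limiter: at most `max_per_window` sends per user per window.
pub struct EmailRateLimiter {
    max_per_window: u32,
    window: Duration,
    // user id -> (start of the current window, sends counted in that window)
    counters: HashMap<u64, (Instant, u32)>,
}

impl EmailRateLimiter {
    pub fn new(max_per_window: u32, window: Duration) -> Self {
        Self { max_per_window, window, counters: HashMap::new() }
    }

    /// Returns true and counts the send if the user is under the limit.
    /// `now` is passed in so the logic can be tested without sleeping.
    pub fn try_acquire(&mut self, user_id: u64, now: Instant) -> bool {
        let entry = self.counters.entry(user_id).or_insert((now, 0));
        // Start a fresh window once the previous one has fully elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 < self.max_per_window {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}
```

In an Axum handler this would sit behind a mutex in shared state, and a `false` result would map to an HTTP 429 response before any call to Resend is made.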

Testing Scope

Unit tests: Markdown generation from synthesis data, HTML email template rendering, PDF generation (verify output is valid PDF)
Integration tests: Email endpoint (mock Resend API), export endpoints (verify correct Content-Type and Content-Disposition headers, verify file content), ownership isolation
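The header checks mentioned for the export endpoints can be made concrete with a small helper like the one below. It is a sketch under assumptions: `ExportFormat`, `download_headers`, the numeric id, and the `synthesis-<id>` filename pattern are all hypothetical, chosen only to show what the integration tests would assert on.

```rust
/// Hypothetical export formats for the two export endpoints.
pub enum ExportFormat {
    Markdown,
    Pdf,
}

/// Returns the (Content-Type, Content-Disposition) values for a file download.
/// The filename pattern is an assumption for illustration.
pub fn download_headers(format: ExportFormat, synthesis_id: u64) -> (&'static str, String) {
    let (content_type, ext) = match format {
        ExportFormat::Markdown => ("text/markdown; charset=utf-8", "md"),
        ExportFormat::Pdf => ("application/pdf", "pdf"),
    };
    // "attachment" asks the browser to download the body instead of rendering it.
    let disposition = format!(
        "attachment; filename=\"synthesis-{}.{}\"",
        synthesis_id, ext
    );
    (content_type, disposition)
}
```

Keeping the filename free of user-controlled text (id only, no synthesis title) avoids having to escape quotes or non-ASCII characters in the header value.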

Milestones

M7.1 -- Email sending: Backend endpoint working, HTML template rendered, Resend integration tested.
M7.2 -- Markdown export: Endpoint returns correctly formatted .md file.
M7.3 -- PDF export: Endpoint returns a valid PDF.
M7.4 -- Frontend integration: Email and export buttons on synthesis detail page, with loading states and feedback.

Summary Table

Phase	Name	Core Capability	Key Risk	Estimated Relative Effort
1	Foundation	Stack proof, auth, settings CRUD	Rust learning curve + hand-rolled auth	Very Large
2	Sources + Scraper	Sources CRUD, URL scraping	Scraper robustness, SSRF	Medium
3	Admin Module	Provider/model curation, rate limits	Rate limiter concurrency, RBAC	Medium
4	LLM Abstraction (Gemini)	Provider trait, encryption, Gemini impl	Structured output, API key security	Large
5	Generation Pipeline + SSE	End-to-end synthesis generation	Pipeline reliability, SSE management	Very Large
6	Multi-Provider	OpenAI + Anthropic	Provider API differences, quality	Large
7	Email + Export	Resend email, PDF/Markdown export	PDF generation in Rust	Small-Medium

Cross-Phase Concerns

These items are not isolated to a single phase but evolve incrementally:

Testing Strategy

Phase 1: Unit tests for core utilities, integration tests for auth flow. Establish CI pipeline (cargo test + clippy + cargo audit).
Phase 2-3: Integration tests grow with each CRUD module. Introduce fixture-based testing for scraper.
Phase 4: Add mock-based testing for external API calls. Provider trait contract tests.
Phase 5: First E2E tests (pipeline with mocked externals). SSE client testing.
Phase 6: Expand provider mocks. Cross-provider output comparison.
Phase 7: Template rendering tests. Full E2E manual test of the complete application.

Security Hardening

Phase 1: Auth, CSRF, CSP, security headers, rate limiting on auth endpoints, session security.
Phase 2: SSRF prevention in scraper.
Phase 3: RBAC, audit logging.
Phase 4: API key encryption at rest, secret handling (secrecy + zeroize).
Phase 5: Input sanitization for prompt injection (max lengths, delimiter patterns).
Phase 6: Per-provider error handling (avoid leaking provider API details to users).
Phase 7: Email input validation (prevent header injection).
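For the Phase 7 item, a minimal sketch of recipient validation is shown below. It rejects control characters (including CR and LF, which are what make header injection possible) before applying a deliberately loose shape check. The function name `validate_recipient` and the error messages are assumptions; this is not a full address parser, and a dedicated validation crate may be preferable in practice.

```rust
/// Validates a single recipient address before it is passed to the email service.
/// Returns the input unchanged on success.
pub fn validate_recipient(input: &str) -> Result<&str, &'static str> {
    // CR/LF and other control characters could smuggle extra headers.
    if input.chars().any(|c| c.is_control()) {
        return Err("control characters are not allowed");
    }
    // 254 is the commonly cited practical maximum length for an address.
    if input.is_empty() || input.len() > 254 {
        return Err("address length is out of range");
    }
    // Reject whitespace and list separators so one field cannot carry several recipients.
    if input.chars().any(char::is_whitespace) || input.contains(',') || input.contains(';') {
        return Err("exactly one address is expected");
    }
    match input.split_once('@') {
        Some((local, domain))
            if !local.is_empty() && !domain.is_empty() && !domain.contains('@') =>
        {
            Ok(input)
        }
        _ => Err("address must have the form local@domain"),
    }
}
```

Sending through Resend's JSON API rather than raw SMTP already reduces the injection surface, so this check mainly serves as defence in depth and as early user feedback.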

i18n Readiness

All phases: user-facing strings go through the locale file (fr.ts). No hardcoded French strings in component logic.
Phase 1 establishes the pattern. Subsequent phases follow it.

Docker and Deployment

Phase 1 establishes the Dockerfile and docker-compose.yml.
Subsequent phases only add environment variables (documented in .env.example).
No deployment changes required between phases -- the same docker compose up works throughout.
