Phased Delivery Roadmap: AI Weekly Synth Rewrite

Date: 2026-03-21
Author: Phase Roadmap Planner
Input: Team analysis (01-04) and project decisions (05)


Overview

This roadmap decomposes the AI Weekly Synth rewrite into 7 phases. Each phase produces a working, deployable application. The phases are ordered to deliver value incrementally: Phase 1 proves the stack works end-to-end, and each subsequent phase adds one major capability area.

The decisions document establishes: Rust (Axum) + Postgres + SolidJS + Tailwind CSS, all 3 LLM providers, user-provided API keys, email + magic link auth, Docker-only deployment, no data migration, and testing as part of the plan.


Dependency Graph

Phase 1: Foundation (Axum + Postgres + SolidJS + Auth + Settings CRUD)
   |
   +---> Phase 2: Sources CRUD + Scraper Service
   |        |
   |        +---> Phase 4: LLM Provider Abstraction (Gemini first)
   |                 |
   |                 +---> Phase 5: Generation Pipeline + SSE Progress
   |                 |        |
   |                 |        +---> Phase 6: Multi-Provider (OpenAI + Anthropic)
   |                 |
   +---> Phase 3: Admin Module (Provider/Model Curation, Rate Limits)
   |        |
   |        +---> Phase 4 (admin curates provider list that Phase 4 uses)
   |
   +-------------------------------+---> Phase 7: Email (Resend) + Export (PDF/Markdown)

Summary of dependencies:

  • Phase 2 depends on Phase 1 (auth, DB, frontend scaffolding)
  • Phase 3 depends on Phase 1 (auth with admin role, DB)
  • Phase 4 depends on Phase 2 (scraper) + Phase 3 (admin-curated provider/model list)
  • Phase 5 depends on Phase 4 (LLM provider working)
  • Phase 6 depends on Phase 5 (pipeline working with one provider)
  • Phase 7 has a hard dependency on Phase 1 only (the Resend email service), so it can be started any time after Phase 1; it is best done after Phase 5, when there are syntheses to email and export

Risk-Ordered Priority

If time runs out, this is the order of criticality (most critical first):

  1. Phase 1 -- Foundation: Without this, nothing works. This is the riskiest phase because it involves setting up Rust/Axum from scratch (learning curve), hand-rolling auth (magic links, sessions, captcha), and standing up the SolidJS frontend with routing and auth context. Everything depends on this.

  2. Phase 5 -- Generation Pipeline + SSE: This is the core value proposition of the application. Without synthesis generation, the app is a settings manager.

  3. Phase 4 -- LLM Provider Abstraction (Gemini): Prerequisite for Phase 5. Getting one provider working end-to-end with structured output and web search grounding proves the LLM integration works.

  4. Phase 2 -- Sources CRUD + Scraper: Sources feed the generation pipeline. The scraper is used during generation to validate URLs.

  5. Phase 3 -- Admin Module: Users cannot configure providers/models without this. However, for a single-user self-hosted scenario, environment variables or seed data could serve as a temporary workaround.

  6. Phase 6 -- Multi-Provider: Adds OpenAI and Anthropic. High value but the app is fully functional with Gemini alone.

  7. Phase 7 -- Email + Export: Nice-to-have features. The app works without them. Users can copy-paste or screenshot.


Phase 1: Foundation

Goal

Prove the entire stack works end-to-end: Rust serving a SolidJS SPA, Postgres connected, email + magic link authentication functioning, and one complete CRUD flow (user settings).

Deliverables

Backend (Rust/Axum):

  • Project scaffold: Cargo workspace, main.rs, config loading (dotenvy), tracing/logging setup
  • Postgres connection pool (sqlx) with compile-time checked queries
  • Database migrations: users, sessions, magic_link_tokens, settings tables
  • Unified error handling (AppError enum with IntoResponse)
  • Auth system:
    • POST /api/v1/auth/register -- email + Cloudflare Turnstile captcha validation, sends magic link via Resend
    • POST /api/v1/auth/login -- request magic link (same response whether email exists or not)
    • GET /api/v1/auth/verify?token=... -- verify token, create session, set cookie, redirect to app
    • POST /api/v1/auth/logout -- invalidate session, clear cookie
    • GET /api/v1/auth/me -- return current user info
  • Session middleware: cookie extraction, session lookup by the SHA-256 hash of the cookie token, expiration check (30-day), user injection into request extensions
  • CSRF protection: X-Requested-With header check on mutating requests + SameSite=Lax cookies
  • Rate limiting on auth endpoints (per-IP, per-email)
  • Settings CRUD: GET /api/v1/settings, PUT /api/v1/settings
  • CLI command: create-admin to bootstrap the first admin account
  • Static file serving: serve the SolidJS build output from the Axum binary
  • Security headers (CSP, X-Content-Type-Options, X-Frame-Options, HSTS, Referrer-Policy)
  • CORS configuration
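
The CSRF rule in the list above (X-Requested-With required on mutating requests) can be sketched as a pure function. This is a minimal illustration, not the planned middleware: the function name and the slice-of-pairs header representation are hypothetical stand-ins for whatever the Axum/tower layer will actually receive. The check relies on the fact that a cross-origin HTML form cannot attach custom headers.

```rust
/// Returns true if the request may proceed under the header-based CSRF rule.
/// `headers` is a simplified, hypothetical stand-in for a real header map.
fn passes_csrf_check(method: &str, headers: &[(&str, &str)]) -> bool {
    // Safe (non-mutating) methods are exempt from the check.
    let is_safe = matches!(
        method.to_ascii_uppercase().as_str(),
        "GET" | "HEAD" | "OPTIONS"
    );
    if is_safe {
        return true;
    }
    // Header names are case-insensitive; any non-empty value is accepted here.
    headers.iter().any(|(name, value)| {
        name.eq_ignore_ascii_case("x-requested-with") && !value.trim().is_empty()
    })
}
```

Keeping the rule as a pure function makes the "CSRF header validation" unit test listed under Testing Scope straightforward to write without spinning up a server.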

Frontend (SolidJS):

  • Vite + SolidJS + TypeScript + Tailwind CSS project scaffold
  • Auth context with signals (session check via GET /api/v1/auth/me on load)
  • Route guard (redirect to /login if unauthenticated)
  • Login page: email field, Turnstile widget, "Recevoir un lien de connexion" button, "Creer un compte" link
  • Sign-up page: email + optional display name + Turnstile + "Creer mon compte" button
  • Magic link confirmation screen ("Verifiez votre boite de reception", resend with cooldown)
  • Navbar: logo, nav links (Syntheses, Sources), user email, Settings gear, Logout
  • Mobile hamburger menu
  • Active route indicator on nav links
  • Settings page: theme, max age days, categories (dynamic list with add/remove), max items per category, search agent behavior, AI model dropdown (hardcoded placeholder for now)
  • Error boundary (top-level ErrorBoundary component)
  • Session expiry handling (401 -> redirect to login with message)
  • i18n-ready structure: all user-facing strings in a central fr.ts locale file, accessed via a helper function

Infrastructure:

  • Dockerfile (multi-stage: Rust build + SolidJS build -> minimal runtime image)
  • docker-compose.yml with Postgres service + app service
  • .env.example with all required environment variables documented
  • Resend integration for sending magic link emails

CLI:

  • ./ai-synth create-admin admin@example.com creates an admin user (no magic link needed, account is pre-verified)

Definition of Done

  • docker compose up starts the app with Postgres
  • A new user can sign up (receives magic link email via Resend), click the link, and land on the home page
  • The admin can be created via CLI
  • Authenticated user can view and update their settings
  • Unauthenticated requests return 401
  • Session persists across browser restarts (within 30-day window)
  • Logout invalidates the session server-side
  • Turnstile captcha prevents automated signups
  • All pages render correctly on desktop and mobile

Dependencies

None -- this is the first phase.

Risk Factors

  1. Hand-rolled auth is the highest-risk component. Magic link token lifecycle (generation, SHA-256 hashing, single-use enforcement, expiration, email enumeration prevention) must be implemented correctly from the start. A subtle bug here creates a security vulnerability.
  2. Rust learning curve. Async Rust with Axum, sqlx, and tower middleware is non-trivial for someone learning Rust. Expect the borrow checker, lifetime annotations, and trait bounds to slow things down significantly in this phase.
  3. Email deliverability. Magic links only work if emails arrive. Resend handles SPF/DKIM/DMARC, but initial setup, domain verification, and inbox placement testing can take time.
  4. Cloudflare Turnstile integration. Requires server-side verification of the captcha token. The API is simple, but handling failures (network issues, invalid tokens, expired tokens) needs careful error UX.
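
The single-use and expiration rules from risk 1 can be sketched with an in-memory store. This is an illustration under stated assumptions: MagicLinkStore, VerifyError, and the HashMap are hypothetical stand-ins for the magic_link_tokens table, the caller is assumed to pass the SHA-256 hash of the token (hashing is omitted to keep the sketch free of external crates), and the TTL is chosen by the caller.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum VerifyError {
    Unknown, // never issued, or already consumed
    Expired,
}

/// In-memory stand-in for the magic_link_tokens table, keyed by token hash.
struct MagicLinkStore {
    tokens: HashMap<String, (String, Instant)>, // hash -> (email, expiry)
}

impl MagicLinkStore {
    fn new() -> Self {
        Self { tokens: HashMap::new() }
    }

    /// Records a freshly issued token; only its hash is stored.
    fn issue(&mut self, token_hash: String, email: &str, now: Instant, ttl: Duration) {
        self.tokens.insert(token_hash, (email.to_string(), now + ttl));
    }

    /// Consumes the token. It is removed whether it turns out valid or
    /// expired, so a second attempt with the same token always fails.
    fn verify(&mut self, token_hash: &str, now: Instant) -> Result<String, VerifyError> {
        let (email, expires_at) = self
            .tokens
            .remove(token_hash)
            .ok_or(VerifyError::Unknown)?;
        if now >= expires_at {
            return Err(VerifyError::Expired);
        }
        Ok(email)
    }
}
```

Passing `now` in explicitly keeps the lifecycle deterministic in unit tests. In a Postgres-backed version, the remove-then-check step would need to be a single atomic statement so that two concurrent verifications cannot both succeed.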

Testing Scope

  • Unit tests: Config parsing, session token generation/hashing, magic link token lifecycle, CSRF header validation, settings validation (serde + validator), AppError response formatting
  • Integration tests: Full auth flow (register -> verify -> me -> logout), settings CRUD with auth, admin CLI command, rate limiting on auth endpoints (verify lockout after N failures), 401 on unauthenticated access
  • Frontend: Manual smoke testing of all screens and flows (automated E2E testing deferred to a later phase when there is more to test)

Milestones

  1. M1.1 -- Rust skeleton compiles and serves "Hello World": Axum router, tracing, config loading, Postgres pool connected, migrations run. Docker build works.
  2. M1.2 -- Auth flow works end-to-end: Register, magic link email sent (Resend), verify token, session cookie set, GET /me returns user, logout clears session. CLI create-admin works.
  3. M1.3 -- SolidJS shell renders: Login page, navbar, settings page (static, no API calls yet). Tailwind styling matches the current app's visual language. Mobile hamburger menu works.
  4. M1.4 -- Frontend + backend integrated: SolidJS auth context calls the API. Login/signup flow works through the UI. Settings page reads/writes via the API. Session expiry redirects to login.
  5. M1.5 -- Tests and hardening: Unit and integration tests pass. Security headers configured. CSRF protection tested. Rate limiting on auth endpoints verified.

Phase 2: Sources CRUD + Scraper Service

Goal

Add the custom sources management feature (CRUD, bulk import, CSV import/export) and build the URL scraper service that will be used during synthesis generation.

Deliverables

Backend:

  • Database migration: sources table
  • Sources API:
    • GET /api/v1/sources -- list user's sources
    • POST /api/v1/sources -- add a single source (title + URL, validated)
    • DELETE /api/v1/sources/:id -- delete a source (ownership check)
    • POST /api/v1/sources/bulk -- bulk import (JSON array)
    • POST /api/v1/sources/import-csv -- CSV import (multipart upload)
    • GET /api/v1/sources/export-csv -- CSV export download
  • Input validation: URL format validation, title length limits, max sources per user
  • Scraper service (services/scraper.rs):
    • reqwest HTTP client (shared from AppState, with timeouts: 5s connect, 15s response, 30s total)
    • SSRF prevention: DNS resolution check against private IP ranges, protocol restriction (http/https only), redirect validation
    • HTML parsing with scraper crate: soft-404 detection, publication date extraction (meta tags, JSON-LD, <time> elements), body text extraction (max 4000 chars, strip scripts/nav/footer)
    • Custom User-Agent header
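
The private-range and protocol checks in the SSRF bullet above can be sketched with the standard library alone. The function names are hypothetical, and the range list is a starting point rather than an exhaustive blocklist (carrier-grade NAT and documentation ranges, for example, are not covered). DNS resolution and redirect handling are outside this sketch.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

fn is_forbidden_v4(ip: Ipv4Addr) -> bool {
    ip.is_private()            // 10/8, 172.16/12, 192.168/16
        || ip.is_loopback()    // 127/8
        || ip.is_link_local()  // 169.254/16, including cloud metadata addresses
        || ip.is_unspecified() // 0.0.0.0
        || ip.is_broadcast()
        || ip.is_multicast()
}

fn is_forbidden_v6(ip: Ipv6Addr) -> bool {
    // An IPv4-mapped address (::ffff:a.b.c.d) is judged by its IPv4 part.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_forbidden_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || ip.is_multicast()
        || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
}

/// True if the scraper should refuse to connect to this resolved address.
fn is_forbidden_target(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_forbidden_v4(v4),
        IpAddr::V6(v6) => is_forbidden_v6(v6),
    }
}

/// Protocol restriction: only http and https URLs are fetched.
fn is_allowed_scheme(url: &str) -> bool {
    let lower = url.to_ascii_lowercase();
    lower.starts_with("http://") || lower.starts_with("https://")
}
```

The check is meant to run on the address the HTTP client will actually connect to, and again on the target of every redirect.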

Frontend:

  • Sources page:
    • List view with title and URL for each source
    • Add form: title + URL fields with validation feedback
    • Delete with standardized confirmation dialog (same pattern as settings)
    • Bulk import via textarea
    • CSV import (file picker) and export (download button)
  • Empty state with onboarding hint

Definition of Done

  • User can add, view, and delete custom sources
  • Bulk import (JSON and CSV) works correctly
  • CSV export downloads a valid file
  • URL validation rejects malformed URLs
  • Ownership isolation: user A cannot see or delete user B's sources
  • Scraper service can fetch a URL, parse HTML, detect soft-404s, extract publication dates, and extract body text
  • SSRF protection rejects requests to private/internal IPs

Dependencies

Phase 1 (auth, DB pool, frontend scaffolding, Docker setup)

Risk Factors

  1. Scraper robustness. Real-world HTML is messy. Publication date extraction from meta tags, JSON-LD, and <time> elements covers many sites but not all. Expect edge cases.
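  2. SSRF prevention correctness. DNS rebinding attacks can bypass naive IP checks: a hostname can resolve to a public IP when checked and to a private one when the connection is made. The implementation must resolve DNS once, check the resolved IP, connect to that same IP, and repeat the check on every redirect.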
  3. CSV parsing. Malformed CSV files, encoding issues (UTF-8 BOM, Windows line endings), and large files can cause problems.
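
The encoding issues named in risk 3 (UTF-8 BOM, Windows line endings) can be handled before any field splitting. The sketch below is deliberately simplified and makes assumptions the plan does not state: rows are unquoted "title,url" pairs, there is no header row, and URLs contain no raw commas. A production importer would more likely rely on a dedicated CSV parser; parse_sources_csv is a hypothetical name.

```rust
/// Parses unquoted "title,url" rows. Returns the parsed rows and the
/// 1-based line numbers of rows that were rejected.
fn parse_sources_csv(input: &str) -> (Vec<(String, String)>, Vec<usize>) {
    // Strip a UTF-8 byte-order mark if the file starts with one.
    let text = input.strip_prefix('\u{feff}').unwrap_or(input);
    let mut rows = Vec::new();
    let mut rejected = Vec::new();
    // `lines()` splits on "\n" and on "\r\n", which covers Windows files.
    for (index, line) in text.lines().enumerate() {
        let line = line.trim();
        if line.is_empty() {
            continue; // blank lines are skipped, not rejected
        }
        // Split on the last comma so that titles may contain commas.
        match line.rsplit_once(',') {
            Some((title, url)) if !title.trim().is_empty() && !url.trim().is_empty() => {
                rows.push((title.trim().to_string(), url.trim().to_string()));
            }
            _ => rejected.push(index + 1),
        }
    }
    (rows, rejected)
}
```

Reporting rejected line numbers instead of failing the whole import lets the frontend tell the user exactly which rows need fixing.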

Testing Scope

  • Unit tests: URL validation, SSRF IP range checks, HTML parsing (soft-404 detection, date extraction, body text extraction -- use fixture HTML files), CSV parsing/generation
  • Integration tests: Full CRUD lifecycle for sources (create, list, delete), bulk import, CSV import/export, ownership isolation (user A cannot access user B's sources), SSRF rejection for private IPs

Milestones

  1. M2.1 -- Sources CRUD API complete: All endpoints working with auth. Integration tests pass.
  2. M2.2 -- Scraper service complete: Fetches, parses, validates URLs. SSRF protection in place. Unit tests with fixture HTML pass.
  3. M2.3 -- Frontend sources page complete: All interactions working, CSV import/export, empty state.

Phase 3: Admin Module

Goal

Build the admin interface for curating LLM providers/models and configuring rate limits. This phase establishes the provider/model catalog that users will select from in their settings.

Deliverables

Backend:

  • Database migrations: llm_providers table (provider name, display name, models JSON array, is_enabled, created_at, updated_at), rate_limits table (per-provider limits)
    • Note: In the decisions doc, users bring their own API keys. The llm_providers table here stores the admin-curated list of available providers and models, NOT API keys. User API keys are stored separately (see Phase 4).
  • Admin API (all require admin role):
    • GET /api/v1/admin/providers -- list all provider configs
    • POST /api/v1/admin/providers -- add/update a provider config (provider name, display name, list of enabled models)
    • DELETE /api/v1/admin/providers/:id -- remove a provider
    • GET /api/v1/admin/rate-limits -- get rate limit configs
    • PUT /api/v1/admin/rate-limits/:provider_id -- update rate limit config
    • GET /api/v1/admin/users -- list all users
    • PUT /api/v1/admin/users/:id/role -- change user role
  • User-facing endpoint (any authenticated user, admin role not required):
    • GET /api/v1/config/providers -- list enabled providers and their model names (no sensitive data)
  • RequireAdmin middleware layer (checks user.role == "admin", returns 403 otherwise)
  • Rate limiter service: in-memory token-bucket per provider (using DashMap), configurable from admin, hot-reload on config change
  • Audit logging table and writes for admin actions

Frontend:

  • Admin layout at /admin (separate route prefix, sidebar navigation)
  • Admin nav: visible only to admin users (hidden from DOM for non-admins)
  • Provider configuration page (/admin/providers):
    • Card/tab per provider (Gemini, OpenAI, Anthropic)
    • Enable/disable toggle per provider
    • Model list management (checkboxes to enable/disable specific models)
    • Default model selection (dropdown)
    • Status indicators (configured/not configured)
  • Rate limit configuration page (/admin/rate-limits):
    • Per-provider rate limit fields (requests per minute)
    • Global limits
    • Save button
  • User management page (/admin/users):
    • User list with email, role, creation date
    • Role change (promote/demote admin)

Settings page update:

  • Replace the hardcoded AI model dropdown with a dynamic two-level selection:
    • Provider dropdown (populated from GET /api/v1/config/providers)
    • Model dropdown (populated based on selected provider)
  • If only one provider is configured, hide the provider dropdown

Definition of Done

  • Admin can add, configure, enable/disable providers and models
  • Admin can configure per-provider rate limits
  • Admin can view user list and change roles
  • Non-admin users cannot access admin pages (403 from API, routes hidden in UI)
  • GET /api/v1/config/providers returns the list of enabled providers and models
  • Settings page dynamically populates provider/model dropdowns from the admin config
  • Rate limiter enforces configured limits
  • Audit log records all admin actions

Dependencies

Phase 1 (auth with admin role, DB, frontend scaffolding)

Risk Factors

  1. Rate limiter complexity. In-memory state with hot-reload from DB requires careful concurrency handling (DashMap + atomic operations). Edge cases around config reload while requests are in flight.
  2. Admin UX complexity. The provider configuration page has many interacting elements (enable/disable, model list, default model). Getting the UX right takes iteration.
  3. Role-based access control. Must be watertight -- every admin endpoint must be protected both in the frontend (route guard) and backend (middleware). A missed check is a privilege escalation vulnerability.
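
The token-bucket logic behind risk 1 can be sketched without any concurrency machinery. This is a single-bucket illustration, assuming the real service would keep one such bucket per provider inside a DashMap; the struct, its fields, and the reconfigure method are hypothetical. Time is passed in as seconds so the behavior is deterministic.

```rust
/// A single token bucket. One bucket per provider is assumed.
struct TokenBucket {
    capacity: f64,
    refill_per_sec: f64,
    tokens: f64,
    last_refill_secs: f64,
}

impl TokenBucket {
    /// Starts full, with a capacity equal to the per-minute limit.
    fn per_minute(requests_per_minute: u32, now_secs: f64) -> Self {
        let capacity = f64::from(requests_per_minute);
        Self {
            capacity,
            refill_per_sec: capacity / 60.0,
            tokens: capacity,
            last_refill_secs: now_secs,
        }
    }

    /// Refills according to elapsed time, then tries to take one token.
    fn try_acquire(&mut self, now_secs: f64) -> bool {
        let elapsed = (now_secs - self.last_refill_secs).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill_secs = now_secs;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }

    /// Hot-reload: apply a new limit, keeping accumulated tokens but
    /// clamping them to the new capacity.
    fn reconfigure(&mut self, requests_per_minute: u32) {
        self.capacity = f64::from(requests_per_minute);
        self.refill_per_sec = self.capacity / 60.0;
        self.tokens = self.tokens.min(self.capacity);
    }
}
```

Mutating the bucket in place on reload, rather than replacing it, is one way to avoid handing every in-flight caller a fresh full bucket when an admin changes a limit.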

Testing Scope

  • Unit tests: Rate limiter (token bucket logic, config reload), admin role check middleware
  • Integration tests: Admin CRUD for providers, rate limits. Non-admin access rejection (403). Role change. Audit log entries created. Public config endpoint returns correct data. Settings page provider/model population.

Milestones

  1. M3.1 -- Admin API complete: All admin endpoints working with role protection. Integration tests pass.
  2. M3.2 -- Rate limiter service: In-memory rate limiter with DB-backed config. Hot-reload tested.
  3. M3.3 -- Admin frontend complete: All admin pages functional. Non-admin users see no admin UI.
  4. M3.4 -- Settings page updated: Dynamic provider/model selection working.

Phase 4: LLM Provider Abstraction (Gemini First)

Goal

Implement the LLM provider trait and the first concrete implementation (Google Gemini), including user API key management. Prove that the abstraction works for structured output and web search grounding.

Deliverables

Backend:

  • Database migration: user_api_keys table (user_id, provider, encrypted_key using AES-256-GCM, nonce, key_prefix for display, created_at, updated_at)
  • API key encryption service:
    • Master key from MASTER_KEY_SECRET environment variable
    • AES-256-GCM encryption/decryption using aes-gcm crate
    • Per-key unique nonce via OsRng
    • Keys decrypted in memory only when making LLM calls, dropped immediately after
    • secrecy + zeroize crates for sensitive value handling
  • User API key endpoints:
    • GET /api/v1/user/api-keys -- list user's keys (provider + key_prefix only, never the full key)
    • POST /api/v1/user/api-keys -- add/update an API key for a provider
    • DELETE /api/v1/user/api-keys/:provider -- remove a key
    • POST /api/v1/user/api-keys/:provider/test -- test the key with a minimal LLM call
  • LlmProvider trait:
    • provider_id() -> &str
    • generate_search_pass(model, system_prompt, user_prompt, response_schema) -> Result<Value>
    • generate_rewrite_pass(model, system_prompt, user_prompt, response_schema) -> Result<Value>
  • GeminiProvider implementation:
    • generateContent API call with googleSearch tool for Pass 1
    • Structured output via responseSchema + responseMimeType: "application/json"
    • Standard generation (no tools) for Pass 2
    • Dynamic category schema construction from user settings
  • Provider factory function: creates the correct provider implementation from config + user's decrypted API key
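
The trait and factory described above might take roughly the following shape. This is a simplified sketch, not the planned definition: it is synchronous, carries JSON as plain strings instead of a parsed Value, and uses a hypothetical LlmError type, so that it compiles with the standard library alone. The real trait would presumably be async. MockProvider mirrors the mock implementation mentioned in the testing scope.

```rust
/// Error type for this sketch only.
#[derive(Debug, PartialEq)]
enum LlmError {
    UnknownProvider(String),
    MissingApiKey,
}

/// Synchronous simplification of the LlmProvider trait; JSON travels as String.
trait LlmProvider {
    fn provider_id(&self) -> &str;
    fn generate_search_pass(
        &self,
        model: &str,
        system_prompt: &str,
        user_prompt: &str,
        response_schema: &str,
    ) -> Result<String, LlmError>;
    fn generate_rewrite_pass(
        &self,
        model: &str,
        system_prompt: &str,
        user_prompt: &str,
        response_schema: &str,
    ) -> Result<String, LlmError>;
}

/// Mock provider returning canned output, as used in contract tests.
struct MockProvider;

impl LlmProvider for MockProvider {
    fn provider_id(&self) -> &str {
        "mock"
    }
    fn generate_search_pass(&self, _m: &str, _s: &str, _u: &str, _schema: &str) -> Result<String, LlmError> {
        Ok("{\"sections\":[]}".to_string())
    }
    fn generate_rewrite_pass(&self, _m: &str, _s: &str, _u: &str, _schema: &str) -> Result<String, LlmError> {
        Ok("{\"sections\":[]}".to_string())
    }
}

/// Factory: maps a provider id plus the user's decrypted key to an implementation.
fn create_provider(provider_id: &str, api_key: &str) -> Result<Box<dyn LlmProvider>, LlmError> {
    if api_key.is_empty() {
        return Err(LlmError::MissingApiKey);
    }
    match provider_id {
        "mock" => Ok(Box::new(MockProvider)),
        other => Err(LlmError::UnknownProvider(other.to_string())),
    }
}
```

Returning a boxed trait object keeps the generation pipeline independent of which provider the user selected, which is what allows Phase 6 to add providers without touching pipeline code.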

Frontend:

  • User API key management in Settings page:
    • Per-provider section showing key status (configured/not configured, key prefix)
    • Add/update key input (masked, with show/hide toggle)
    • Test button per provider (calls test endpoint, shows success/failure)
    • Delete key button
  • Warning display when a provider does not support web search grounding

Definition of Done

  • User can add, test, and remove their Gemini API key
  • API keys are encrypted at rest (AES-256-GCM) and never returned in full via the API
  • The LlmProvider trait is defined and GeminiProvider passes a manual test:
    • Pass 1: structured search results with googleSearch grounding
    • Pass 2: rewrite with structured output
  • Test endpoint validates the key works
  • Provider factory correctly creates a GeminiProvider from config + user key

Dependencies

  • Phase 2 (scraper service -- used in the pipeline validation)
  • Phase 3 (admin-curated provider/model list -- the factory reads from this)

Risk Factors

  1. Gemini API specifics. The responseSchema for structured JSON output + googleSearch tool configuration via the REST API (not a Rust SDK) requires careful request construction. Gemini's API versions and response formats can change.
  2. Encryption correctness. AES-256-GCM with per-key nonces must be implemented correctly. A nonce reuse with the same key breaks GCM security entirely. Using OsRng for nonce generation mitigates this.
  3. Structured output parsing. The dynamic schema (generated from user categories) must produce valid JSON Schema that Gemini accepts. Edge cases in category names (special characters, very long names) can break schema generation.
  4. API key security. The full lifecycle (transmit over HTTPS, encrypt at rest, decrypt in memory, drop after use) has multiple points where a mistake could leak keys (logging, error messages, debug output).

Testing Scope

  • Unit tests: AES-256-GCM encryption round-trip, key prefix extraction, dynamic schema generation from categories, provider factory (mocked), Gemini request/response serialization
  • Integration tests: User API key CRUD (verify encryption at rest, verify key is never returned in full), test endpoint (with a mock HTTP server standing in for Gemini), provider trait contract tests (mock implementation)
  • Manual test: End-to-end Gemini call with a real API key (not in CI, developer-run)

Milestones

  1. M4.1 -- User API key management: CRUD endpoints working, encryption at rest verified, frontend key management in Settings.
  2. M4.2 -- LlmProvider trait defined: Trait, types, factory function. Mock implementation for testing.
  3. M4.3 -- GeminiProvider working: Both passes (search + rewrite) produce valid structured output. Manual test with real API key succeeds.
  4. M4.4 -- Dynamic schema generation: Category-based schema construction tested with various category configurations.

Phase 5: Generation Pipeline + SSE Progress

Goal

Wire everything together into the full synthesis generation pipeline: user triggers generation, backend runs the two-pass pipeline (search -> scrape/validate -> rewrite), sends real-time progress via SSE, and saves the result. The user can view the synthesis.

Deliverables

Backend:

  • Database migration: syntheses table (user_id, week, sections JSON, status, created_at), generation_jobs table (ephemeral, or in-memory DashMap)
  • Generation pipeline orchestration (services/synthesis.rs):
    1. Load user settings and sources
    2. Resolve provider + model (from user's settings + admin config + user's API key)
    3. Build dynamic schema from categories
    4. Rate limit check (acquire slot)
    5. Pass 1: Search (via LlmProvider::generate_search_pass)
    6. Validate and scrape URLs (via scraper service, with SSRF protection)
    7. Rate limit check (acquire slot for Pass 2)
    8. Pass 2: Rewrite (via LlmProvider::generate_rewrite_pass with scraped content)
    9. Parse and validate structured output
    10. Save synthesis to database
  • Async generation API:
    • POST /api/v1/syntheses/generate -- triggers generation, returns immediately with job_id (202 Accepted)
    • GET /api/v1/syntheses/generate/:job_id/progress -- SSE endpoint streaming progress events
  • SSE progress events:
    • { step: "search", message: "Recherche d'actualites en cours...", percent: 10 }
    • { step: "scraping", message: "Verification des sources (3/12)...", percent: 40 }
    • { step: "rewrite", message: "Redaction des resumes...", percent: 75 }
    • { step: "saving", message: "Sauvegarde...", percent: 95 }
    • complete event with synthesis_id
    • error event with message
  • Syntheses API:
    • GET /api/v1/syntheses -- list user's syntheses (paginated, sorted by created_at desc)
    • GET /api/v1/syntheses/:id -- get synthesis detail
    • DELETE /api/v1/syntheses/:id -- delete a synthesis (ownership check, confirmation handled by frontend)
  • Job state management: in-memory DashMap<String, JobStatus> with TTL cleanup (jobs expire after 1 hour)
  • Prompt construction: system prompt and user prompt templates built from user settings (theme, categories, max age, search agent behavior, custom sources)
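
The progress events listed above travel over SSE as text frames: an optional "event:" line, a "data:" line, and a blank line as terminator. The sketch below builds such a frame by hand to show the wire format. The event name "progress" and both function names are assumptions; in the actual backend the frame would more likely be produced by Axum's SSE support and a JSON serializer.

```rust
/// Escapes the characters that must be escaped inside a JSON string literal.
fn json_escape(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds one SSE frame carrying a progress event as single-line JSON.
/// The percent value is clamped to 100.
fn progress_frame(step: &str, message: &str, percent: u8) -> String {
    format!(
        "event: progress\ndata: {{\"step\":\"{}\",\"message\":\"{}\",\"percent\":{}}}\n\n",
        json_escape(step),
        json_escape(message),
        percent.min(100)
    )
}
```

Escaping newlines matters here beyond JSON validity: a raw newline inside the data line would split the payload across SSE fields.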

Frontend:

  • Home page (Dashboard):
    • Grid of synthesis cards (responsive: 1/2/3 columns)
    • Each card: week badge, creation date, preview of first section items (line-clamped)
    • Footer: "Lire la synthese" link, delete button with confirmation dialog
    • Empty state with onboarding hint
    • Banner when a generation is in progress ("Une generation est en cours...")
  • Generate page:
    • Confirmation text showing theme, age window, provider, model
    • "Lancer la generation" button
    • Progress bar with step descriptions (SSE-driven)
    • Step checklist (done/in-progress/pending)
    • "Vous pouvez quitter cette page" note
    • Error display with retry option
    • Auto-redirect to synthesis detail on completion
  • Synthesis detail page:
    • Section-by-section display: section title, then cards for each news item (title as external link, summary paragraph)
    • Back navigation
    • Delete button with confirmation dialog
  • SSE client: EventSource connection management, reconnection on disconnect, state synchronization if user navigates away and returns

Definition of Done

  • User clicks "Lancer la generation" and sees real-time progress via SSE
  • Generation runs asynchronously -- user can navigate away and return
  • Home page shows an in-progress banner during generation
  • On completion, the synthesis is saved and viewable
  • Synthesis detail shows all sections with items, titles as links, and summaries
  • User can delete syntheses
  • Ownership isolation: user A cannot view or delete user B's syntheses
  • Generation failures display an error message with context
  • Rate limiting prevents excessive generation requests

Dependencies

Phase 4 (LLM provider working with Gemini)

Risk Factors

  1. Pipeline reliability. The two-pass pipeline with scraping in between is complex. Failures at any stage (LLM timeout, scraping failure, invalid structured output) must be handled gracefully. Partial results (some URLs fail to scrape) should not abort the entire generation.
  2. SSE connection management. SSE connections can be dropped by reverse proxies, load balancers, or browser timeouts. The frontend must handle reconnection and state recovery. The backend must not leak resources (orphaned SSE connections, zombie tokio tasks).
  3. Structured output parsing. LLMs occasionally produce malformed JSON even with schema constraints. The pipeline must handle parsing failures gracefully (retry once, or fall back to best-effort extraction).
  4. Generation duration. End-to-end generation (2 LLM calls + N URL scrapes) can take 30-90+ seconds. The async model handles this, but progress reporting must be accurate (not fake percentages).
  5. Concurrent generation. What happens if a user triggers a second generation while one is running? Decision needed: reject with "already in progress" or queue.
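
Risk 1 asks that a failed scrape of some URLs not abort the whole generation. One way to express that is to collect successes and failures side by side and abort only when nothing at all could be scraped. The sketch below is sequential and takes the scraper as a closure so it can be mocked; ScrapeReport, scrape_all, and should_abort are hypothetical names, and the abort rule is an assumption rather than a decision recorded in this plan.

```rust
#[derive(Debug, PartialEq)]
struct ScrapeReport {
    pages: Vec<(String, String)>,    // (url, extracted text)
    failures: Vec<(String, String)>, // (url, error message)
}

/// Scrapes every URL, recording failures instead of stopping at the first one.
fn scrape_all<F>(urls: &[&str], mut scrape_one: F) -> ScrapeReport
where
    F: FnMut(&str) -> Result<String, String>,
{
    let mut report = ScrapeReport { pages: Vec::new(), failures: Vec::new() };
    for &url in urls {
        match scrape_one(url) {
            Ok(text) => report.pages.push((url.to_string(), text)),
            Err(reason) => report.failures.push((url.to_string(), reason)),
        }
    }
    report
}

/// Assumed rule: abandon the run only if every scrape attempt failed.
fn should_abort(report: &ScrapeReport) -> bool {
    report.pages.is_empty() && !report.failures.is_empty()
}
```

The failures list can also feed the "Verification des sources (3/12)..." progress message and the error context shown to the user.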

Testing Scope

  • Unit tests: Prompt construction (from settings + sources), structured output parsing (valid and malformed JSON), job status management, SSE event serialization
  • Integration tests: Full generation pipeline with mocked LLM provider (returns canned structured output) and mocked scraper (returns canned HTML). Verify: correct DB state after generation, SSE events sequence, error handling (LLM failure, scraper failure). Syntheses CRUD with ownership isolation.
  • E2E test: Manual test with real Gemini API key. Full flow: configure settings, add sources, generate, view result.

Milestones

  1. M5.1 -- Syntheses CRUD: List, get, delete endpoints and frontend pages. Works with manually inserted test data.
  2. M5.2 -- Pipeline orchestration: Full two-pass pipeline runs synchronously (no SSE yet) with mocked LLM. Saves result to DB.
  3. M5.3 -- SSE progress: Async generation with SSE streaming. Frontend displays progress bar and step checklist.
  4. M5.4 -- Home page integration: In-progress banner, auto-refresh on completion, empty state.
  5. M5.5 -- End-to-end with real LLM: Manual test with Gemini. Prompt tuning. Error handling hardened.

Phase 6: Multi-Provider (OpenAI + Anthropic)

Goal

Add OpenAI and Anthropic as LLM providers, implementing the LlmProvider trait for each with their respective web search and structured output capabilities. The generation pipeline adapts per provider.

Deliverables

Backend:

  • OpenAiProvider implementation:
    • Pass 1: Uses OpenAI Responses API with web_search tool for grounded search results. Structured output via text.format: { type: "json_schema", ... } (the Responses API equivalent of Chat Completions' response_format).
    • Pass 2: Standard chat completion with structured JSON output.
    • Model mapping: validate user-selected model against admin-enabled models.
  • AnthropicProvider implementation:
    • Pass 1: Uses Claude's web_search tool for grounded results. Structured output via tool-use pattern (define a tool whose input schema matches the desired output, instruct Claude to call it).
    • Pass 2: Standard message with JSON output instructions. Server-side parsing and validation (Anthropic does not have native JSON schema enforcement as robust as Gemini/OpenAI).
  • Pipeline adaptation per provider:
    • Decision logic: if native web search grounding produces high-quality results (detected by checking citation count, URL validity), skip the scrape/rewrite pass.
    • If not, fall back to the full two-pass pipeline.
    • Provider-specific prompt adjustments (each provider responds differently to the same prompt structure).
  • Error handling per provider: different error codes, rate limit headers, and retry semantics for each provider's API.

Frontend:

  • Settings page: provider dropdown now populated with all admin-enabled providers (Gemini, OpenAI, Anthropic)
  • Generate page: warning when selected provider has limited web search capabilities
  • Provider-specific info text in settings ("La recherche web en temps reel est disponible avec ce fournisseur" vs "Les resultats seront bases sur les connaissances du modele")

Definition of Done

  • User can select OpenAI or Anthropic as their provider, add their API key, and generate a synthesis
  • Structured output is correctly parsed for all three providers
  • Web search grounding works for all three providers (using their respective tools)
  • Pipeline adapts per provider (skip scrape pass when native grounding is sufficient)
  • Error handling is provider-specific (correct error messages for quota exceeded, invalid key, model not available, etc.)
  • All existing Gemini functionality continues to work unchanged

Dependencies

Phase 5 (generation pipeline working end-to-end with Gemini)

Risk Factors

  1. Provider API differences are deeper than they appear. Each provider's web search tool returns results in different formats with different citation structures. The abstraction must handle this without becoming a leaky mess.
  2. Anthropic structured output. Claude does not have Gemini/OpenAI's level of JSON schema enforcement. The backend must handle parsing failures and potentially retry with clearer instructions.
  3. Testing across providers. Each provider requires a real API key for meaningful testing. Mocks can only go so far -- real provider behavior (latency, rate limits, output quality) varies.
  4. Pipeline adaptation heuristic. Deciding when to skip the scrape/rewrite pass is a quality judgment. Too aggressive skipping produces lower-quality summaries. Too conservative means the pipeline is always slow.
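
The skip decision in risk 4 can be made explicit as a small predicate over the signals named in the deliverables (citation count and URL validity). The thresholds below are illustrative assumptions, not values from this plan, and would need tuning against real provider output; GroundingStats and can_skip_scrape_pass are hypothetical names.

```rust
/// Signals gathered from a provider's natively grounded Pass 1 response.
struct GroundingStats {
    citation_count: usize,
    valid_url_count: usize,
    total_url_count: usize,
}

// Illustrative thresholds only; to be tuned per provider.
const MIN_CITATIONS: usize = 5;
const MIN_VALID_URL_RATIO: f64 = 0.8;

/// True if native grounding looks good enough to skip the scrape/rewrite pass.
fn can_skip_scrape_pass(stats: &GroundingStats) -> bool {
    // With no URLs at all there is nothing to trust, so run the full pipeline.
    if stats.total_url_count == 0 {
        return false;
    }
    let ratio = stats.valid_url_count as f64 / stats.total_url_count as f64;
    stats.citation_count >= MIN_CITATIONS && ratio >= MIN_VALID_URL_RATIO
}
```

Keeping the thresholds as named constants (or admin-configurable values) makes the trade-off between speed and summary quality visible and easy to adjust.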

Testing Scope

  • Unit tests: OpenAI and Anthropic request/response serialization, provider-specific error mapping, pipeline adaptation logic
  • Integration tests: Full pipeline with mocked OpenAI and Anthropic HTTP responses. Verify structured output parsing, error handling, pipeline adaptation.
  • Manual tests: End-to-end generation with real OpenAI and Anthropic API keys. Quality comparison across providers.
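
As an illustration of the "provider-specific error mapping" unit under test, the sketch below maps an HTTP status to a provider-neutral error that is safe to show to users. All names (`Provider`, `GenerationError`, `map_http_status`) are hypothetical, and the status-code assignments are assumptions to verify against each provider's API documentation. Real mapping will likely also need to inspect the response body.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Gemini,
    OpenAi,
    Anthropic,
}

/// Provider-neutral errors shown to the user; no raw provider payload leaks out.
#[derive(Debug, PartialEq, Eq)]
pub enum GenerationError {
    InvalidApiKey,
    QuotaExceeded,
    ModelNotAvailable,
    ProviderUnavailable,
    Unexpected,
}

/// Maps a provider HTTP status to a user-facing error.
/// The status codes below are assumptions, not confirmed provider behavior.
pub fn map_http_status(provider: Provider, status: u16) -> GenerationError {
    match (provider, status) {
        (_, 401) | (_, 403) => GenerationError::InvalidApiKey,
        (_, 404) => GenerationError::ModelNotAvailable,
        (_, 429) => GenerationError::QuotaExceeded,
        // Assumption: Anthropic signals overload with 529; kept explicit
        // so the provider-specific case is visible in tests.
        (Provider::Anthropic, 529) => GenerationError::ProviderUnavailable,
        (_, 500..=599) => GenerationError::ProviderUnavailable,
        _ => GenerationError::Unexpected,
    }
}
```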

Milestones

  1. M6.1 -- OpenAiProvider working: Both passes produce valid structured output. Manual test with real key.
  2. M6.2 -- AnthropicProvider working: Both passes produce valid structured output. Structured output parsing handles edge cases. Manual test with real key.
  3. M6.3 -- Pipeline adaptation: Provider-specific behavior (skip scrape when appropriate) implemented and tested.
  4. M6.4 -- Frontend updated: Provider selection, warnings, and info text. All three providers selectable.

Phase 7: Email (Resend) + Export (PDF/Markdown)

Goal

Add the ability to send a synthesis by email (via Resend) and export it as PDF or Markdown.

Deliverables

Backend:

  • Email sending service (services/email.rs -- extends the existing magic link email service):
    • POST /api/v1/syntheses/:id/send-email -- send synthesis to a specified email address
    • HTML email template: renders the synthesis (sections, items, links) as a formatted email
    • Plain-text fallback
    • Sender address configured via environment variable
    • Default recipient: the user's own email
  • Export service:
    • GET /api/v1/syntheses/:id/export/markdown -- returns the synthesis as a Markdown file download
    • GET /api/v1/syntheses/:id/export/pdf -- returns the synthesis as a PDF file download
    • Markdown generation: convert sections/items to Markdown format (headers, bullet points, links)
    • PDF generation: use a Rust PDF library (e.g., printpdf or genpdf, or convert Markdown to PDF via pulldown-cmark + a PDF renderer)
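
The Markdown generation step (sections and items to headers, bullet points, and links) could look roughly like the sketch below. The `Synthesis`, `Section`, and `Item` structs are hypothetical stand-ins for whatever the data model ends up being, and the sketch does not escape Markdown special characters in titles or summaries.

```rust
/// Hypothetical data model: a synthesis made of sections, each with items.
pub struct Item {
    pub title: String,
    pub summary: String,
    pub url: Option<String>,
}

pub struct Section {
    pub title: String,
    pub items: Vec<Item>,
}

pub struct Synthesis {
    pub title: String,
    pub sections: Vec<Section>,
}

/// Renders a synthesis as Markdown: one H1 for the title, one H2 per
/// section, one bullet per item (as a link when the item has a URL).
pub fn to_markdown(s: &Synthesis) -> String {
    let mut out = format!("# {}\n", s.title);
    for section in &s.sections {
        out.push_str(&format!("\n## {}\n\n", section.title));
        for item in &section.items {
            match &item.url {
                Some(url) => {
                    out.push_str(&format!("- [{}]({}): {}\n", item.title, url, item.summary))
                }
                None => out.push_str(&format!("- **{}**: {}\n", item.title, item.summary)),
            }
        }
    }
    out
}
```

The same function could feed the plain-text email fallback or a Markdown-to-PDF conversion, if that route is chosen.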

Frontend:

  • Synthesis detail page additions:
    • "Envoyer par email" ("Send by email") button with email input (pre-filled with user's email)
    • "S'envoyer a soi-meme" ("Send to myself") quick button
    • Export dropdown: "Exporter en Markdown" ("Export as Markdown") and "Exporter en PDF" ("Export as PDF")
    • Loading states and success/error feedback for email and export actions

Definition of Done

  • User can send a synthesis by email to any address (via Resend)
  • Email is well-formatted HTML with plain-text fallback
  • User can export a synthesis as Markdown (downloads a .md file)
  • User can export a synthesis as PDF (downloads a .pdf file)
  • Default email recipient is the user's own email
  • Ownership check: user can only email/export their own syntheses
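
The ownership check can be kept as a small function shared by the email and export handlers. In this sketch, `SynthesisRow`, `AccessError`, and `authorize_owner` are hypothetical names. Answering "not found" for another user's synthesis, rather than "forbidden", is one possible design choice that avoids revealing which IDs exist.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum AccessError {
    /// Used both for missing rows and for rows owned by someone else.
    NotFound,
}

/// Hypothetical minimal view of a stored synthesis.
#[derive(Debug)]
pub struct SynthesisRow {
    pub id: i64,
    pub owner_id: i64,
}

/// Returns the row only if it exists and belongs to `user_id`.
pub fn authorize_owner(
    row: Option<&SynthesisRow>,
    user_id: i64,
) -> Result<&SynthesisRow, AccessError> {
    match row {
        Some(r) if r.owner_id == user_id => Ok(r),
        _ => Err(AccessError::NotFound),
    }
}
```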

Dependencies

Phase 1 (auth, plus the Resend integration that already exists for magic links). Best done after Phase 5, so there is actual content to email/export.

Risk Factors

  1. PDF generation in Rust. The Rust PDF ecosystem is less mature than in other languages. genpdf is the most ergonomic option but has limited styling control. printpdf is low-level. Consider generating HTML and using a headless browser or wkhtmltopdf as a last resort (adds a Docker dependency).
  2. Email formatting. HTML emails are notoriously difficult to render consistently across email clients. Keep the template simple (tables-based layout, inline CSS, no external resources).
  3. Resend rate limits. The free tier caps sending volume, and bulk sending (e.g., a user sending to a mailing list) could hit that cap. Rate limit email sends per user.
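
A per-user send limit can be sketched as a sliding window over recent send times. This is an in-memory illustration only: `EmailRateLimiter` is a hypothetical name, the limits are placeholders, and a deployment with several backend instances would need the counters in shared storage such as Postgres.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Sliding-window limiter: at most `max_sends` emails per user per `window`.
pub struct EmailRateLimiter {
    max_sends: usize,
    window: Duration,
    sent: HashMap<i64, VecDeque<Instant>>,
}

impl EmailRateLimiter {
    pub fn new(max_sends: usize, window: Duration) -> Self {
        Self { max_sends, window, sent: HashMap::new() }
    }

    /// Returns true and records the send if `user_id` is under the limit
    /// at `now`; returns false otherwise. `now` is a parameter so tests
    /// can control time.
    pub fn try_send(&mut self, user_id: i64, now: Instant) -> bool {
        let history = self.sent.entry(user_id).or_default();
        // Drop timestamps that have fallen out of the window.
        while let Some(&oldest) = history.front() {
            if now.duration_since(oldest) >= self.window {
                history.pop_front();
            } else {
                break;
            }
        }
        if history.len() < self.max_sends {
            history.push_back(now);
            true
        } else {
            false
        }
    }
}
```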

Testing Scope

  • Unit tests: Markdown generation from synthesis data, HTML email template rendering, PDF generation (verify output is valid PDF)
  • Integration tests: Email endpoint (mock Resend API), export endpoints (verify correct Content-Type and Content-Disposition headers, verify file content), ownership isolation
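
The header checks in the export integration tests are easier to write if the header values come from one pure function. In this sketch, `ExportFormat`, `download_headers`, and the `synthesis-<id>` file name pattern are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy)]
pub enum ExportFormat {
    Markdown,
    Pdf,
}

/// Returns the (Content-Type, Content-Disposition) header values for a
/// file download. The file name pattern is a hypothetical choice.
pub fn download_headers(format: ExportFormat, synthesis_id: i64) -> (String, String) {
    let (content_type, extension) = match format {
        ExportFormat::Markdown => ("text/markdown; charset=utf-8", "md"),
        ExportFormat::Pdf => ("application/pdf", "pdf"),
    };
    // `attachment` asks the browser to download instead of rendering inline.
    let disposition = format!(
        "attachment; filename=\"synthesis-{}.{}\"",
        synthesis_id, extension
    );
    (content_type.to_string(), disposition)
}
```

Building the file name from the numeric ID rather than the synthesis title avoids having to escape user-controlled text inside the header.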

Milestones

  1. M7.1 -- Email sending: Backend endpoint working, HTML template rendered, Resend integration tested.
  2. M7.2 -- Markdown export: Endpoint returns correctly formatted .md file.
  3. M7.3 -- PDF export: Endpoint returns a valid PDF.
  4. M7.4 -- Frontend integration: Email and export buttons on synthesis detail page, with loading states and feedback.

Summary Table

| Phase | Name | Core Capability | Key Risk | Estimated Relative Effort |
|---|---|---|---|---|
| 1 | Foundation | Stack proof, auth, settings CRUD | Rust learning curve + hand-rolled auth | Very Large |
| 2 | Sources + Scraper | Sources CRUD, URL scraping | Scraper robustness, SSRF | Medium |
| 3 | Admin Module | Provider/model curation, rate limits | Rate limiter concurrency, RBAC | Medium |
| 4 | LLM Abstraction (Gemini) | Provider trait, encryption, Gemini impl | Structured output, API key security | Large |
| 5 | Generation Pipeline + SSE | End-to-end synthesis generation | Pipeline reliability, SSE management | Very Large |
| 6 | Multi-Provider | OpenAI + Anthropic | Provider API differences, quality | Large |
| 7 | Email + Export | Resend email, PDF/Markdown export | PDF generation in Rust | Small-Medium |

Cross-Phase Concerns

These items are not isolated to a single phase but evolve incrementally:

Testing Strategy

  • Phase 1: Unit tests for core utilities, integration tests for auth flow. Establish CI pipeline (cargo test + clippy + cargo audit).
  • Phase 2-3: Integration tests grow with each CRUD module. Introduce fixture-based testing for scraper.
  • Phase 4: Add mock-based testing for external API calls. Provider trait contract tests.
  • Phase 5: First E2E tests (pipeline with mocked externals). SSE client testing.
  • Phase 6: Expand provider mocks. Cross-provider output comparison.
  • Phase 7: Template rendering tests. Full E2E manual test of the complete application.

Security Hardening

  • Phase 1: Auth, CSRF, CSP, security headers, rate limiting on auth endpoints, session security.
  • Phase 2: SSRF prevention in scraper.
  • Phase 3: RBAC, audit logging.
  • Phase 4: API key encryption at rest, secret handling (secrecy + zeroize).
  • Phase 5: Input sanitization for prompt injection (max lengths, delimiter patterns).
  • Phase 6: Per-provider error handling (avoid leaking provider API details to users).
  • Phase 7: Email input validation (prevent header injection).
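
For the Phase 7 item, a conservative recipient check might look like the sketch below. `validate_recipient` is a hypothetical helper, not a full RFC 5322 validator. It rejects control characters (including CR/LF, the usual header-injection vector), whitespace, and list separators before the address reaches the email API.

```rust
/// Validates a single recipient address supplied by the user.
/// Returns the trimmed address, or a short reason for the rejection.
pub fn validate_recipient(input: &str) -> Result<&str, &'static str> {
    let email = input.trim();
    // 254 is the commonly cited upper bound for an address; treated
    // here as an assumption rather than a hard rule.
    if email.is_empty() || email.len() > 254 {
        return Err("invalid length");
    }
    // CR/LF and other control characters could inject extra headers.
    if email.chars().any(|c| c.is_control() || c.is_whitespace()) {
        return Err("control or whitespace character");
    }
    // One recipient per request: no comma- or semicolon-separated lists.
    if email.contains(',') || email.contains(';') {
        return Err("multiple recipients");
    }
    let mut parts = email.split('@');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(local), Some(domain), None) if !local.is_empty() && domain.contains('.') => {
            Ok(email)
        }
        _ => Err("expected exactly one '@' and a dotted domain"),
    }
}
```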

i18n Readiness

  • All phases: user-facing strings go through the locale file (fr.ts). No hardcoded French strings in component logic.
  • Phase 1 establishes the pattern. Subsequent phases follow it.

Docker and Deployment

  • Phase 1 establishes the Dockerfile and docker-compose.yml.
  • Subsequent phases only add environment variables (documented in .env.example).
  • No deployment changes required between phases -- the same docker compose up works throughout.