ai_synth/docs/team-analysis/02-architecture-analysis.md

# Technical Architecture Analysis: AI Weekly Synth Refactoring

## Open Questions and Clarifications Needed

Before implementation, the following points require decisions from stakeholders:

1. **Admin scope**: Is the "admin" a single super-user defined by config, or a full role-based system with multiple admins? This analysis assumes a simple role flag on users plus a single bootstrap admin defined via environment variable.

2. **Google OAuth retention**: The requirements specify email+captcha and magic link auth. Should Google SSO be dropped entirely, or kept as an additional option? This analysis assumes Google SSO is dropped to remove all Google dependencies.

3. **Email sending for syntheses**: The current app sends syntheses via Gmail API with OAuth popup. With Google dependencies removed, should SMTP-based email sending replace this? This analysis assumes yes, using the same SMTP configuration as magic link delivery.

4. **Data migration volume**: How many existing users and syntheses need migrating? This impacts whether a one-shot script suffices or whether incremental migration tooling is needed.

5. **Concurrent users target**: Rate limiter design and session store choice depend on expected load. This analysis assumes a small-to-medium deployment (1-100 concurrent users).

6. **Legacy data**: The current `SynthesisData` has legacy fields (`majorAnnouncements`, `financialSector`, etc.). The requirements say "remove legacy data/formats/code." This analysis assumes legacy fields are dropped during migration; only the `sections[]` format is carried forward.

---

## 1. Rust Backend Architecture

### 1.1 Framework Choice: Axum

**Recommendation: Axum** over Actix-web.

**Justification:**

| Criterion | Axum | Actix-web |
|---|---|---|
| Ecosystem alignment | Built on `tokio` + `tower` + `hyper` -- the de-facto Rust async stack | Has its own runtime layer (though uses tokio underneath) |
| Middleware model | Tower `Layer`/`Service` -- composable, reusable, testable | Own `Transform`/`Service` traits -- powerful, but middleware is not reusable outside Actix |
| Extractors | Type-safe, ergonomic, uses `FromRequest` traits | Similar, but with `web::Data`, `web::Json` wrappers |
| Community trajectory | Growing faster, backed by the tokio team | Mature, stable, but slower growth |
| Learning curve | Lower for developers already using tokio ecosystem | Slightly higher due to Actix-specific abstractions |
| Compile-time type safety | Strong -- handler function signatures are validated at compile time | Strong, but less ergonomic error messages |

Axum's tower-based middleware model is a decisive advantage for this project: the auth middleware, rate limiter, and CORS layer compose naturally as tower `Layer`s. Axum also has first-class support for shared state via the `State` extractor, which maps well to a shared database pool and configuration.

### 1.2 Project Structure

```
ai-synth-backend/
├── Cargo.toml
├── Cargo.lock
├── .env.example
├── migrations/                    # sqlx migrations
│   ├── 001_create_users.sql
│   ├── 002_create_sessions.sql
│   ├── 003_create_settings.sql
│   ├── 004_create_sources.sql
│   ├── 005_create_syntheses.sql
│   ├── 006_create_admin_config.sql
│   └── 007_create_rate_limits.sql
├── src/
│   ├── main.rs                    # Entry point: init tracing, DB, run server
│   ├── config.rs                  # Env-based configuration (envy / dotenvy)
│   ├── app_state.rs               # AppState struct (pool, config, http client)
│   ├── error.rs                   # AppError enum, IntoResponse impl
│   ├── router.rs                  # All route definitions, middleware wiring
│   ├── middleware/
│   │   ├── mod.rs
│   │   ├── auth.rs                # Session cookie extraction, user injection
│   │   ├── csrf.rs                # Custom-header CSRF check (X-Requested-With)
│   │   └── rate_limit.rs          # Per-provider, configurable rate limiter
│   ├── models/
│   │   ├── mod.rs
│   │   ├── user.rs                # User, NewUser, UserRole
│   │   ├── session.rs             # Session
│   │   ├── settings.rs            # UserSettings
│   │   ├── source.rs              # Source
│   │   ├── synthesis.rs           # Synthesis, NewsSection, NewsItem
│   │   └── admin.rs               # LlmProviderConfig, RateLimitConfig
│   ├── handlers/
│   │   ├── mod.rs
│   │   ├── auth.rs                # register, login (magic link), verify, logout
│   │   ├── syntheses.rs           # list, get, create (trigger generation), delete
│   │   ├── sources.rs             # CRUD, bulk import, CSV export
│   │   ├── settings.rs            # get, update, export, import
│   │   ├── admin.rs               # LLM config CRUD, rate limit config, user list
│   │   └── email.rs               # Send synthesis by email
│   ├── services/
│   │   ├── mod.rs
│   │   ├── llm/
│   │   │   ├── mod.rs             # LlmProvider trait, factory function
│   │   │   ├── gemini.rs          # Google Gemini implementation
│   │   │   ├── openai.rs          # OpenAI implementation
│   │   │   ├── anthropic.rs       # Anthropic implementation
│   │   │   └── types.rs           # Shared request/response types
│   │   ├── synthesis.rs           # 2-pass generation pipeline orchestration
│   │   ├── scraper.rs             # URL validation, HTML scraping, date extraction
│   │   ├── email.rs               # SMTP email sending (magic links + syntheses)
│   │   └── captcha.rs             # Captcha verification
│   └── db/
│       ├── mod.rs
│       ├── users.rs               # User queries
│       ├── sessions.rs            # Session queries
│       ├── settings.rs            # Settings queries
│       ├── sources.rs             # Source queries
│       ├── syntheses.rs           # Synthesis queries
│       └── admin.rs               # Admin config queries
└── tests/
    ├── api/                       # Integration tests
    └── services/                  # Unit tests for services
```

### 1.3 Layered Architecture

The application follows a clean 3-layer architecture:

- **Handlers** (HTTP layer): Extract request data, call services, return responses. No business logic.
- **Services** (Business layer): Orchestrate operations, enforce business rules, call DB and external APIs.
- **DB** (Persistence layer): Raw sqlx queries, mapping to/from model structs.
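
The separation of the three layers can be sketched without any framework. This is a minimal illustration, not the project's code: the `SourceRepo` trait, the in-memory repository standing in for sqlx, the duplicate-URL rule, and the numeric status codes are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical model; the real `Source` would live in `models/source.rs`.
#[derive(Debug, Clone, PartialEq)]
pub struct Source {
    pub id: String,
    pub user_id: String,
    pub title: String,
    pub url: String,
}

#[derive(Debug, PartialEq)]
pub enum SourceError {
    InvalidUrl,
    Duplicate,
}

/// DB layer: persistence only. An in-memory map stands in for sqlx queries.
pub trait SourceRepo {
    fn find_by_url(&self, user_id: &str, url: &str) -> Option<Source>;
    fn insert(&mut self, source: Source);
}

#[derive(Default)]
pub struct InMemoryRepo {
    rows: HashMap<String, Source>,
}

impl SourceRepo for InMemoryRepo {
    fn find_by_url(&self, user_id: &str, url: &str) -> Option<Source> {
        self.rows
            .values()
            .find(|s| s.user_id == user_id && s.url == url)
            .cloned()
    }

    fn insert(&mut self, source: Source) {
        self.rows.insert(source.id.clone(), source);
    }
}

/// Service layer: business rules only, no HTTP types.
/// `id` would be a backend-generated UUID in the real code.
pub fn add_source(
    repo: &mut dyn SourceRepo,
    user_id: &str,
    id: &str,
    title: &str,
    url: &str,
) -> Result<Source, SourceError> {
    if !(url.starts_with("http://") || url.starts_with("https://")) {
        return Err(SourceError::InvalidUrl);
    }
    if repo.find_by_url(user_id, url).is_some() {
        return Err(SourceError::Duplicate);
    }
    let source = Source {
        id: id.to_string(),
        user_id: user_id.to_string(),
        title: title.to_string(),
        url: url.to_string(),
    };
    repo.insert(source.clone());
    Ok(source)
}

/// Handler layer: translate the service result into an HTTP status code.
pub fn handle_add_source(
    repo: &mut dyn SourceRepo,
    user_id: &str,
    id: &str,
    title: &str,
    url: &str,
) -> u16 {
    match add_source(repo, user_id, id, title, url) {
        Ok(_) => 201,
        Err(SourceError::InvalidUrl) => 400,
        Err(SourceError::Duplicate) => 409,
    }
}
```

Because the service depends only on the `SourceRepo` trait, it can be unit-tested without a database, which is the main payoff of keeping business logic out of handlers.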

### 1.4 Error Handling

A unified `AppError` enum implements `IntoResponse`:

```rust
#[derive(Debug)]
pub enum AppError {
    // Client errors
    BadRequest(String),
    Unauthorized(String),
    Forbidden(String),
    NotFound(String),
    Conflict(String),
    TooManyRequests { retry_after_secs: u64 },
    ValidationError(Vec<FieldError>),

    // Server errors
    Internal(anyhow::Error),
    LlmError(String),
    SmtpError(String),
    ScrapingError(String),
}

impl IntoResponse for AppError {
    fn into_response(self) -> axum::response::Response {
        let (status, message) = match &self {
            AppError::BadRequest(msg) => (StatusCode::BAD_REQUEST, msg.clone()),
            AppError::Unauthorized(_) => (StatusCode::UNAUTHORIZED, "Unauthorized".into()),
            AppError::Forbidden(_) => (StatusCode::FORBIDDEN, "Forbidden".into()),
            AppError::NotFound(msg) => (StatusCode::NOT_FOUND, msg.clone()),
            AppError::TooManyRequests { retry_after_secs } => {
                // The Retry-After header itself is attached below.
                (StatusCode::TOO_MANY_REQUESTS, format!("Retry after {retry_after_secs}s"))
            }
            AppError::Internal(e) => {
                // Log the full error chain; never leak internals to the client.
                tracing::error!("Internal error: {e:#}");
                (StatusCode::INTERNAL_SERVER_ERROR, "Internal server error".into())
            }
            // ... (remaining variants elided)
        };
        let mut response = (status, Json(json!({ "error": message }))).into_response();
        if let AppError::TooManyRequests { retry_after_secs } = &self {
            response.headers_mut().insert(
                axum::http::header::RETRY_AFTER,
                axum::http::HeaderValue::from(*retry_after_secs),
            );
        }
        response
    }
}
```

All handlers return `Result<impl IntoResponse, AppError>`. The `?` operator propagates errors naturally. `From` implementations convert `sqlx::Error`, `reqwest::Error`, etc. into `AppError`.
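
The `From` conversions are what make `?` work across error types. The following is a minimal std-only sketch: the two-variant `AppError` is a simplified stand-in for the enum above (here `Internal` holds a `String` rather than `anyhow::Error`), and `parse_page` is a hypothetical helper used only to show propagation.

```rust
use std::num::ParseIntError;

/// Simplified stand-in for the full `AppError` enum (two variants only).
#[derive(Debug, PartialEq)]
pub enum AppError {
    BadRequest(String),
    Internal(String),
}

/// A malformed number from the client is a client error.
impl From<ParseIntError> for AppError {
    fn from(err: ParseIntError) -> Self {
        AppError::BadRequest(format!("invalid integer: {err}"))
    }
}

/// An I/O failure is a server error; the detail stays server-side.
impl From<std::io::Error> for AppError {
    fn from(err: std::io::Error) -> Self {
        AppError::Internal(err.to_string())
    }
}

/// Hypothetical helper: parse a 1-based `page` query parameter.
pub fn parse_page(raw: &str) -> Result<u32, AppError> {
    // `?` converts ParseIntError into AppError through the From impl above.
    let page: u32 = raw.trim().parse()?;
    if page == 0 {
        return Err(AppError::BadRequest("page must be >= 1".into()));
    }
    Ok(page)
}
```

The same pattern would apply to `sqlx::Error` and `reqwest::Error`: one `From` impl per source error decides whether it surfaces as a client or a server error.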

### 1.5 SQLite with sqlx: Schema Design

All tables use TEXT primary keys (UUIDs generated by the backend) for portability. Timestamps are stored as `TEXT` in ISO 8601 format (SQLite has no native timestamp; this also works on Postgres via `TIMESTAMPTZ` cast).

#### Migration 001: Users

```sql
CREATE TABLE users (
    id          TEXT PRIMARY KEY,           -- UUID
    email       TEXT NOT NULL UNIQUE,
    display_name TEXT,
    role        TEXT NOT NULL DEFAULT 'user', -- 'user' | 'admin'
    created_at  TEXT NOT NULL,              -- ISO 8601
    updated_at  TEXT NOT NULL
);
-- No separate index on email: the UNIQUE constraint already creates one.
```

#### Migration 002: Sessions

```sql
CREATE TABLE sessions (
    id          TEXT PRIMARY KEY,           -- Secure random token (32 bytes, base64url)
    user_id     TEXT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    created_at  TEXT NOT NULL,
    expires_at  TEXT NOT NULL,
    ip_address  TEXT,
    user_agent  TEXT
);
CREATE INDEX idx_sessions_user_id ON sessions(user_id);
CREATE INDEX idx_sessions_expires_at ON sessions(expires_at);
```

#### Migration 003: Settings

```sql
CREATE TABLE settings (
    user_id              TEXT PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
    theme                TEXT NOT NULL DEFAULT 'Intelligence Artificielle',
    max_age_days         INTEGER NOT NULL DEFAULT 7,
    categories           TEXT NOT NULL,    -- JSON array stored as TEXT
    max_items_per_category INTEGER NOT NULL DEFAULT 4,
    search_agent_behavior TEXT NOT NULL DEFAULT '',
    ai_model             TEXT NOT NULL DEFAULT 'gemini-3.1-pro-preview',
    updated_at           TEXT NOT NULL
);
```

#### Migration 004: Sources

```sql
CREATE TABLE sources (
    id          TEXT PRIMARY KEY,
    user_id     TEXT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    title       TEXT NOT NULL,
    url         TEXT NOT NULL,
    created_at  TEXT NOT NULL
);
CREATE INDEX idx_sources_user_id ON sources(user_id);
```

#### Migration 005: Syntheses

```sql
CREATE TABLE syntheses (
    id          TEXT PRIMARY KEY,
    user_id     TEXT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    week        TEXT NOT NULL,             -- e.g. "2026-W12"
    sections    TEXT NOT NULL,             -- JSON: [{ title, items: [{ title, url, summary }] }]
    created_at  TEXT NOT NULL
);
CREATE INDEX idx_syntheses_user_id ON syntheses(user_id);
CREATE INDEX idx_syntheses_created_at ON syntheses(created_at);
```

#### Migration 006: Admin Config (LLM Providers)

```sql
CREATE TABLE llm_providers (
    id           TEXT PRIMARY KEY,
    provider     TEXT NOT NULL,            -- 'gemini' | 'openai' | 'anthropic'
    display_name TEXT NOT NULL,
    api_key      TEXT NOT NULL,            -- Encrypted at rest (AES-256-GCM)
    base_url     TEXT,                     -- Optional override for self-hosted/proxy
    models       TEXT NOT NULL,            -- JSON array of available model identifiers
    is_enabled   BOOLEAN NOT NULL DEFAULT 1,
    created_at   TEXT NOT NULL,
    updated_at   TEXT NOT NULL,
    UNIQUE(provider)
);
```

#### Migration 007: Rate Limit Configuration

```sql
CREATE TABLE rate_limits (
    id              TEXT PRIMARY KEY,
    provider_id     TEXT NOT NULL REFERENCES llm_providers(id) ON DELETE CASCADE,
    max_requests    INTEGER NOT NULL DEFAULT 29,
    time_window_ms  INTEGER NOT NULL DEFAULT 60000,
    updated_at      TEXT NOT NULL,
    UNIQUE(provider_id)
);

-- Magic link tokens (stored hashed; the rows per email also drive magic link rate limiting)
CREATE TABLE magic_link_tokens (
    id          TEXT PRIMARY KEY,
    email       TEXT NOT NULL,
    token_hash  TEXT NOT NULL,             -- SHA-256 of the token
    created_at  TEXT NOT NULL,
    expires_at  TEXT NOT NULL,
    used        BOOLEAN NOT NULL DEFAULT 0
);
CREATE INDEX idx_magic_link_email ON magic_link_tokens(email);
```

### 1.6 SQLite/Postgres Dual Compatibility Strategy

**Option considered: runtime database selection via `sqlx::AnyPool`.** `AnyPool` rules out compile-time query checking, so it is not recommended here.

**Recommended strategy: feature-flag based conditional compilation.**

```toml
# Cargo.toml
[features]
default = ["sqlite"]
sqlite = ["sqlx/sqlite"]
postgres = ["sqlx/postgres"]
```

For this project, the SQL differences between SQLite and Postgres are minimal:

| Concern | SQLite | Postgres | Resolution |
|---|---|---|---|
| Auto-increment PK | `INTEGER PRIMARY KEY` | `SERIAL` | Use UUID TEXT PKs -- identical on both |
| Timestamps | `TEXT` (ISO 8601) | `TIMESTAMPTZ` | Store as TEXT on both; parse in application layer |
| JSON columns | `TEXT` + app-side JSON parse | `JSONB` | Store as TEXT on both; Postgres can migrate to JSONB later |
| Boolean | Declared `BOOLEAN`, stored as `INTEGER` (0/1) | `BOOLEAN` | Declare `BOOLEAN` in both schemas; sqlx maps it to `bool` |
| RETURNING clause | Supported since SQLite 3.35 | Supported | Use `RETURNING` on both |

**Practical approach for v1**: Target SQLite only. Write SQL that is Postgres-compatible by design (UUID text PKs, ISO timestamps, no SQLite-specific functions). When the Postgres upgrade happens, create a parallel `migrations_pg/` folder and swap the connection pool. The query layer (`db/`) should then need only mechanical changes (the pool type and, where they differ, bind-parameter placeholders), because the queries themselves stick to standard SQL.

Compile-time checking is preserved by using `sqlx::query!` and `sqlx::query_as!` macros with the `DATABASE_URL` environment variable pointing to an SQLite file during development.

---

## 2. API Design

### 2.1 REST API Endpoints

All endpoints are prefixed with `/api/v1`. Request and response bodies are JSON unless stated otherwise.

#### Authentication

| Method | Path | Auth | Description |
|---|---|---|---|
| `POST` | `/auth/register` | No | Create account (email + captcha) |
| `POST` | `/auth/login` | No | Request magic link (email + captcha) |
| `GET`  | `/auth/verify?token=...` | No | Verify magic link token, create session |
| `POST` | `/auth/logout` | Yes | Invalidate session |
| `GET`  | `/auth/me` | Yes | Get current user info |

#### Syntheses

| Method | Path | Auth | Description |
|---|---|---|---|
| `GET`    | `/syntheses` | Yes | List user's syntheses (paginated) |
| `GET`    | `/syntheses/:id` | Yes | Get synthesis detail |
| `POST`   | `/syntheses/generate` | Yes | Trigger generation (async, returns job ID) |
| `GET`    | `/syntheses/generate/:job_id/status` | Yes | Poll generation status |
| `DELETE` | `/syntheses/:id` | Yes | Delete a synthesis |
| `POST`   | `/syntheses/:id/email` | Yes | Send synthesis by email |

#### Sources

| Method | Path | Auth | Description |
|---|---|---|---|
| `GET`    | `/sources` | Yes | List user's sources |
| `POST`   | `/sources` | Yes | Add a source |
| `DELETE` | `/sources/:id` | Yes | Delete a source |
| `POST`   | `/sources/bulk` | Yes | Bulk import (JSON array) |
| `POST`   | `/sources/import-csv` | Yes | Import from CSV (multipart upload) |
| `GET`    | `/sources/export-csv` | Yes | Export as CSV download |

#### Settings

| Method | Path | Auth | Description |
|---|---|---|---|
| `GET`  | `/settings` | Yes | Get user's settings |
| `PUT`  | `/settings` | Yes | Update settings |
| `GET`  | `/settings/export` | Yes | Export as JSON download |
| `POST` | `/settings/import` | Yes | Import from JSON |

#### Admin

| Method | Path | Auth | Description |
|---|---|---|---|
| `GET`    | `/admin/providers` | Admin | List LLM provider configs |
| `POST`   | `/admin/providers` | Admin | Add/update provider config |
| `DELETE` | `/admin/providers/:id` | Admin | Remove provider |
| `GET`    | `/admin/rate-limits` | Admin | Get rate limit configs |
| `PUT`    | `/admin/rate-limits/:provider_id` | Admin | Update rate limit config |
| `GET`    | `/admin/users` | Admin | List all users |
| `PUT`    | `/admin/users/:id/role` | Admin | Change user role |

#### Frontend Configuration (authenticated, non-admin)

| Method | Path | Auth | Description |
|---|---|---|---|
| `GET` | `/config/providers` | Yes | List enabled providers + their model names (no API keys) |

### 2.2 Request/Response Shapes

**POST /auth/register**
```json
// Request
{
  "email": "user@example.com",
  "display_name": "Jane Doe",
  "captcha_token": "hcaptcha-response-token"
}
// Response 200
{
  "message": "A verification link has been sent to your email."
}
```

**POST /syntheses/generate**
```json
// Request (empty body -- uses user's saved settings and sources)
{}
// Response 202
{
  "job_id": "uuid-of-generation-job",
  "status": "pending"
}
```

**GET /syntheses/:id**
```json
// Response 200
{
  "id": "uuid",
  "week": "2026-W12",
  "created_at": "2026-03-21T10:30:00Z",
  "sections": [
    {
      "title": "Annonces majeures",
      "items": [
        {
          "title": "Article title",
          "url": "https://example.com/article",
          "summary": "4-5 line summary..."
        }
      ]
    }
  ]
}
```

**PUT /settings**
```json
// Request
{
  "theme": "Intelligence Artificielle",
  "max_age_days": 7,
  "categories": ["Annonces majeures", "Secteur financier"],
  "max_items_per_category": 4,
  "search_agent_behavior": "Custom instructions...",
  "ai_model": "gemini-3.1-pro-preview"
}
// Response 200
{
  "message": "Settings updated successfully."
}
```

**POST /admin/providers**
```json
// Request
{
  "provider": "openai",
  "display_name": "OpenAI GPT-4o",
  "api_key": "sk-...",
  "base_url": null,
  "models": ["gpt-4o", "gpt-4o-mini"],
  "is_enabled": true
}
```

### 2.3 Authentication Middleware

The auth middleware is a tower `Layer` that:

1. Extracts the session cookie (`ai_synth_session`) from the request.
2. Looks up the session ID in the `sessions` table.
3. Checks that `expires_at` has not passed.
4. Loads the `User` from the `users` table.
5. Injects the `User` into request extensions (`request.extensions_mut().insert(user)`).

Handlers then read the user via `Extension<User>` or a custom `AuthUser` extractor.

For admin routes, an additional `RequireAdmin` layer checks `user.role == "admin"`.
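
The decision logic of both layers can be sketched independently of axum. This is an illustrative sketch only: the `HashMap`s stand in for the `sessions` and `users` tables, `AuthError` and the function names are hypothetical, and the expiry check compares ISO 8601 strings directly, which is valid only if every timestamp is written in the same fixed-width UTC format.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct User {
    pub id: String,
    pub role: String,
}

pub struct Session {
    pub user_id: String,
    pub expires_at: String, // ISO 8601, UTC, e.g. "2026-04-20T00:00:00Z"
}

#[derive(Debug, PartialEq)]
pub enum AuthError {
    MissingCookie,
    UnknownSession,
    Expired,
    UserNotFound,
    NotAdmin,
}

/// Step 1: pull `ai_synth_session` out of a raw `Cookie` header value.
pub fn session_cookie(cookie_header: &str) -> Option<&str> {
    cookie_header
        .split(';')
        .filter_map(|pair| pair.trim().split_once('='))
        .find(|(name, _)| *name == "ai_synth_session")
        .map(|(_, value)| value)
}

/// Steps 2-4: resolve the cookie to a live session, then to a user.
pub fn authenticate(
    cookie_header: Option<&str>,
    sessions: &HashMap<String, Session>,
    users: &HashMap<String, User>,
    now: &str,
) -> Result<User, AuthError> {
    let session_id = cookie_header
        .and_then(session_cookie)
        .ok_or(AuthError::MissingCookie)?;
    let session = sessions.get(session_id).ok_or(AuthError::UnknownSession)?;
    // Lexicographic comparison: assumes one fixed-width UTC timestamp format.
    if session.expires_at.as_str() <= now {
        return Err(AuthError::Expired);
    }
    users
        .get(&session.user_id)
        .cloned()
        .ok_or(AuthError::UserNotFound)
}

/// The extra check applied by the `RequireAdmin` layer.
pub fn require_admin(user: &User) -> Result<(), AuthError> {
    if user.role == "admin" {
        Ok(())
    } else {
        Err(AuthError::NotAdmin)
    }
}
```

In the real middleware, the first four error cases would presumably map to `AppError::Unauthorized` and `NotAdmin` to `AppError::Forbidden`.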

**Session cookies configuration:**

```rust
Cookie::build(("ai_synth_session", session_id))
    .http_only(true)
    .secure(true)           // HTTPS only
    .same_site(SameSite::Lax)
    .path("/")
    .max_age(Duration::days(30))
```

**CSRF Protection:**

Since this is an API consumed by a SPA on the same origin (or proxied), the combination of `SameSite=Lax` cookies and requiring a custom header (`X-Requested-With: XMLHttpRequest`) on mutating requests provides sufficient CSRF protection. This is the "custom header" pattern -- browsers will not send custom headers on cross-origin requests without CORS preflight approval.

For the SPA, every `fetch` call to the API includes:
```javascript
headers: { "X-Requested-With": "XMLHttpRequest" }
```

The CSRF middleware rejects `POST/PUT/DELETE` requests missing this header.
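
The check itself is small enough to express as a pure function. A minimal sketch, assuming headers arrive as name/value pairs: `passes_csrf_check` is a hypothetical name, and `PATCH` is included as a precaution even though the API above defines no `PATCH` routes.

```rust
/// Returns true when the request may proceed past the CSRF check.
/// Safe methods (GET, HEAD, OPTIONS, ...) are never blocked.
pub fn passes_csrf_check(method: &str, headers: &[(&str, &str)]) -> bool {
    let mutating = matches!(
        method.to_ascii_uppercase().as_str(),
        "POST" | "PUT" | "PATCH" | "DELETE"
    );
    if !mutating {
        return true;
    }
    // Header names are case-insensitive; the expected value is fixed.
    headers.iter().any(|(name, value)| {
        name.eq_ignore_ascii_case("x-requested-with") && *value == "XMLHttpRequest"
    })
}
```

A tower layer would call this with the request's method and headers and short-circuit with a 403 response when it returns `false`.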

---

## 3. LLM Provider Abstraction

### 3.1 Trait Design

```rust
#[async_trait]
pub trait LlmProvider: Send + Sync {
    /// Returns the provider identifier (e.g., "gemini", "openai", "anthropic").
    fn provider_id(&self) -> &str;

    /// Pass 1: Search the web and generate structured news items.
    /// Returns raw JSON matching the category schema.
    async fn generate_search_pass(
        &self,
        model: &str,
        system_prompt: &str,
        user_prompt: &str,
        response_schema: &serde_json::Value,
    ) -> Result<serde_json::Value, AppError>;

    /// Pass 2: Rewrite titles and summaries based on scraped content.
    /// No web search tool needed.
    async fn generate_rewrite_pass(
        &self,
        model: &str,
        system_prompt: &str,
        user_prompt: &str,
        response_schema: &serde_json::Value,
    ) -> Result<serde_json::Value, AppError>;

    /// Lists available models for this provider.
    fn available_models(&self) -> &[String];
}
```

### 3.2 Provider-Specific Web Search Handling

Each provider handles web grounding differently. The trait design abstracts this:

| Provider | Pass 1 (Search) | Pass 2 (Rewrite) |
|---|---|---|
| **Gemini** | Uses `googleSearch` tool in config. Structured output via `responseSchema`. | Standard generation, no tools. `responseSchema` for structured output. |
| **OpenAI** | Uses the `web_search` tool (Responses API). If search cannot be combined with structured output in one call, fall back to a two-step approach: a search call, then a structured-output call. | Standard chat completion with `response_format: { type: "json_schema", ... }`. |
| **Anthropic** | Uses `web_search` tool (available on Claude models). Structured output via tool-use pattern or explicit JSON instructions. | Standard message with JSON output instructions. Anthropic does not have native JSON schema enforcement, so the prompt includes the schema and parsing is done server-side with validation. |

**Implementation details for each provider:**

```rust
// Gemini implementation
pub struct GeminiProvider {
    client: reqwest::Client,
    api_key: String,
    base_url: String,
    models: Vec<String>,
}

impl GeminiProvider {
    async fn generate_search_pass(&self, model: &str, ...) -> Result<serde_json::Value, AppError> {
        // POST to /v1beta/models/{model}:generateContent
        // Config includes: tools: [{ googleSearch: {} }]
        //                  responseMimeType: "application/json"
        //                  responseSchema: <schema>
    }
}

// OpenAI implementation
pub struct OpenAiProvider {
    client: reqwest::Client,
    api_key: String,
    base_url: String,  // default: https://api.openai.com/v1
    models: Vec<String>,
}

// Anthropic implementation
pub struct AnthropicProvider {
    client: reqwest::Client,
    api_key: String,
    base_url: String,  // default: https://api.anthropic.com
    models: Vec<String>,
}
```

### 3.3 Provider Factory

```rust
pub fn create_provider(config: &LlmProviderConfig) -> Result<Box<dyn LlmProvider>, AppError> {
    match config.provider.as_str() {
        "gemini" => Ok(Box::new(GeminiProvider::new(
            config.api_key.clone(),
            config.base_url.clone(),
            config.models.clone(),
        ))),
        "openai" => Ok(Box::new(OpenAiProvider::new(...))),
        "anthropic" => Ok(Box::new(AnthropicProvider::new(...))),
        _ => Err(AppError::BadRequest(format!("Unknown provider: {}", config.provider))),
    }
}
```

### 3.4 Rate Limiter Design

The rate limiter is a server-side, per-provider, in-memory sliding-window limiter (a log of recent request timestamps, not a token bucket), with its configuration stored in the database.

```rust
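use std::{collections::VecDeque, sync::Arc, time::{Duration, Instant}};

use dashmap::DashMap;
use sqlx::SqlitePool;

use crate::error::AppError;

pub struct RateLimiter {
    state: Arc<DashMap<String, ProviderBucket>>,
}

struct ProviderBucket {
    timestamps: VecDeque<Instant>,
    max_requests: u32,
    time_window: Duration,
}

impl RateLimiter {
    /// Fallback bucket mirroring the `rate_limits` column defaults (29 requests / 60 s).
    fn default_bucket(&self) -> ProviderBucket {
        ProviderBucket {
            timestamps: VecDeque::new(),
            max_requests: 29,
            time_window: Duration::from_millis(60_000),
        }
    }

    /// Waits until a slot is available for the given provider.
    pub async fn acquire(&self, provider_id: &str) -> Result<(), AppError> {
        loop {
            // Scope the DashMap guard so it is released before the `.await` below.
            let wait_time = {
                let mut entry = self.state
                    .entry(provider_id.to_string())
                    .or_insert_with(|| self.default_bucket());
                let bucket = &mut *entry;

                // Copy the window first so the closure does not borrow `bucket`.
                let window = bucket.time_window;
                bucket.timestamps.retain(|t| t.elapsed() < window);

                if bucket.timestamps.len() < bucket.max_requests as usize {
                    bucket.timestamps.push_back(Instant::now());
                    return Ok(());
                }

                match bucket.timestamps.front() {
                    // saturating_sub avoids a panic if the slot expired in the meantime.
                    Some(oldest) => window.saturating_sub(oldest.elapsed()),
                    None => window, // Only reachable when max_requests == 0
                }
            };
            tokio::time::sleep(wait_time).await;
        }
    }

    /// Reload configuration from DB (called by admin update endpoint).
    pub async fn reload_config(&self, pool: &SqlitePool) -> Result<(), AppError> {
        // Fetch rate_limits table, update each ProviderBucket (elided in this sketch)
        todo!()
    }
}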
```

The rate limiter lives in `AppState` and is shared across all requests. When an admin updates rate limit configuration, `reload_config` is called to hot-reload without restart.
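
The windowing logic inside `acquire` can be isolated from the async wrapper so that it is unit-testable with an injected clock. The sketch below is one possible factoring, not the project's code: `SlidingWindow` and `try_acquire` are hypothetical names, and the caller is expected to sleep for the returned duration before retrying.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Sliding-window log: remembers when each recent request was admitted.
pub struct SlidingWindow {
    timestamps: VecDeque<Instant>,
    max_requests: usize,
    window: Duration,
}

impl SlidingWindow {
    pub fn new(max_requests: usize, window: Duration) -> Self {
        Self {
            timestamps: VecDeque::new(),
            max_requests,
            window,
        }
    }

    /// Returns `Ok(())` if a slot is granted at `now`, otherwise `Err(wait)`
    /// with the time until the oldest recorded request leaves the window.
    pub fn try_acquire(&mut self, now: Instant) -> Result<(), Duration> {
        // Evict timestamps that have aged out of the window.
        while let Some(&oldest) = self.timestamps.front() {
            if now.duration_since(oldest) >= self.window {
                self.timestamps.pop_front();
            } else {
                break;
            }
        }

        if self.timestamps.len() < self.max_requests {
            self.timestamps.push_back(now);
            return Ok(());
        }

        match self.timestamps.front() {
            Some(&oldest) => Err(self.window.saturating_sub(now.duration_since(oldest))),
            None => Err(self.window), // Only reachable when max_requests == 0
        }
    }
}
```

Passing `now` as a parameter keeps the logic deterministic in tests; the async `acquire` would call `try_acquire(Instant::now())` in a loop and sleep on `Err(wait)`.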

### 3.5 Two-Pass Generation Pipeline

The `SynthesisService` orchestrates the full pipeline:

```rust
pub struct SynthesisService;

impl SynthesisService {
    pub async fn generate(
        state: &AppState,
        user_id: &str,
    ) -> Result<Synthesis, AppError> {
        // 1. Load user settings
        let settings = db::settings::get(pool, user_id).await?;

        // 2. Load user sources
        let sources = db::sources::list(pool, user_id).await?;

        // 3. Resolve LLM provider + model
        let (provider, model) = resolve_provider(state, &settings.ai_model).await?;

        // 4. Build dynamic schema from categories
        let schema = build_category_schema(&settings.categories);

        // 5. Rate limit: acquire slot
        state.rate_limiter.acquire(provider.provider_id()).await?;

        // 6. Pass 1: Search
        let raw_results = provider.generate_search_pass(
            &model, &system_prompt, &user_prompt, &schema
        ).await?;

        // 7. Validate & scrape URLs (server-side, no CORS issues)
        let scraped = scraper::validate_and_scrape(
            &state.http_client,
            raw_results,
            settings.max_age_days,
        ).await;

        // 8. Rate limit: acquire slot for pass 2
        state.rate_limiter.acquire(provider.provider_id()).await?;

        // 9. Pass 2: Rewrite with scraped content
        let final_results = provider.generate_rewrite_pass(
            &model, &rewrite_system_prompt, &rewrite_prompt, &schema
        ).await?;

        // 10. Persist
        let synthesis = db::syntheses::create(
            pool, user_id, &week_string, &final_results
        ).await?;

        Ok(synthesis)
    }
}
```

### 3.6 Asynchronous Generation

Synthesis generation can take 30-90 seconds. Two options:

**Option A: Synchronous with long timeout.** Simple, but ties up a connection. Acceptable for low-traffic deployments.

**Option B (Recommended): Background task with polling.** The `POST /syntheses/generate` endpoint spawns a tokio task and returns a job ID. The frontend polls `GET /syntheses/generate/:job_id/status`. Job state is kept in an in-memory `DashMap<String, JobStatus>` (not in DB, since jobs are ephemeral).

```rust
enum JobStatus {
    Pending,
    InProgress { step: String },  // "search", "scraping", "rewriting"
    Completed { synthesis_id: String },
    Failed { error: String },
}
```

The frontend polls every 3-5 seconds with the same loading UX as the current React app.
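
The job store behind the polling endpoint can be sketched with the standard library alone. This is an assumption-laden illustration: `JobRegistry` is a hypothetical name, and a `Mutex<HashMap>` stands in for the `DashMap` mentioned above so that the example has no external dependencies.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, PartialEq)]
pub enum JobStatus {
    Pending,
    InProgress { step: String },
    Completed { synthesis_id: String },
    Failed { error: String },
}

/// Cheaply cloneable handle shared by the HTTP handlers and the background task.
#[derive(Clone, Default)]
pub struct JobRegistry {
    jobs: Arc<Mutex<HashMap<String, JobStatus>>>,
}

impl JobRegistry {
    /// Called by `POST /syntheses/generate` before spawning the task.
    pub fn register(&self, job_id: &str) {
        self.set(job_id, JobStatus::Pending);
    }

    /// Called by the background task as the pipeline advances.
    pub fn set(&self, job_id: &str, status: JobStatus) {
        self.jobs
            .lock()
            .expect("job registry mutex poisoned")
            .insert(job_id.to_string(), status);
    }

    /// Called by the status endpoint; `None` would map to a 404.
    pub fn status(&self, job_id: &str) -> Option<JobStatus> {
        self.jobs
            .lock()
            .expect("job registry mutex poisoned")
            .get(job_id)
            .cloned()
    }
}
```

Because the registry is in memory only, job state is lost on restart; a client polling an unknown job ID should treat the resulting 404 as a failed generation.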

---

## 4. URL Scraping / Validation

### 4.1 CORS Elimination

Moving scraping to the backend **completely eliminates CORS issues**. The Rust backend makes direct HTTP requests to target URLs -- no proxies needed. This is the single biggest reliability improvement in the refactoring.

### 4.2 reqwest-Based HTTP Client

```rust
let client = reqwest::Client::builder()
    .user_agent("Mozilla/5.0 (compatible; AISynthBot/1.0; +https://your-domain.com/bot)")
    .timeout(Duration::from_secs(15))
    .redirect(reqwest::redirect::Policy::limited(5))
    .connect_timeout(Duration::from_secs(5))
    .danger_accept_invalid_certs(false)
    .build()?;
```

The HTTP client is created once in `AppState` and reused across all requests (connection pooling).

### 4.3 HTML Parsing with `scraper` Crate

The current app uses the browser's `DOMParser`. The Rust equivalent uses the `scraper` crate (built on `html5ever`):

```rust
use scraper::{Html, Selector};

pub async fn validate_and_scrape(
    client: &reqwest::Client,
    items: Vec<RawNewsItem>,
    max_age_days: i64,
) -> Vec<ScrapedNewsItem> {
    let futures = items.into_iter().map(|item| {
        let client = client.clone();
        async move { scrape_single(&client, item, max_age_days).await }
    });

    let results = futures::future::join_all(futures).await;
    results.into_iter().flatten().collect()
}

async fn scrape_single(
    client: &reqwest::Client,
    item: RawNewsItem,
    max_age_days: i64,
) -> Option<ScrapedNewsItem> {
    // 1. Validate URL format
    let url = Url::parse(&item.url).ok()?;

    // 2. Fetch
    let resp = client.get(url).send().await.ok()?;
    if !resp.status().is_success() { return None; }
    let html_text = resp.text().await.ok()?;

    // 3. Parse HTML
    let document = Html::parse_document(&html_text);

    // 4. Soft-404 detection
    let title_sel = Selector::parse("title").unwrap();
    let h1_sel = Selector::parse("h1").unwrap();
    let title_text = document.select(&title_sel).next()
        .map(|el| el.text().collect::<String>().to_lowercase())
        .unwrap_or_default();
    let h1_text = document.select(&h1_sel).next()
        .map(|el| el.text().collect::<String>().to_lowercase())
        .unwrap_or_default();

    let error_keywords = [
        "page not found", "404", "403", "access denied",
        "forbidden", "not found", "introuvable",
    ];
    if error_keywords.iter().any(|kw| title_text.contains(kw) || h1_text.contains(kw)) {
        return None;
    }

    // 5. Date extraction (meta tags, JSON-LD, <time>)
    if let Some(pub_date) = extract_publication_date(&document) {
        let age = Utc::now() - pub_date;
        if age.num_days() > max_age_days {
            return None;
        }
    }

    // 6. Extract body text (remove script, style, nav, etc.)
    let content = extract_body_text(&document, 4000);

    Some(ScrapedNewsItem {
        title: item.title,
        url: item.url,
        summary: item.summary,
        scraped_content: content,
    })
}
```

**Date extraction** mirrors the current logic: check `meta[property="article:published_time"]`, `meta[itemprop="datePublished"]`, `<time datetime>`, and JSON-LD `datePublished`. The `chrono` crate handles date parsing with multiple format attempts.

### 4.4 Concurrency Control

To avoid overwhelming target sites, scraping runs with bounded concurrency:

```rust
use futures::stream::{self, StreamExt};

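stream::iter(items)
    .map(|item| scrape_single(&client, item, max_age_days))
    .buffer_unordered(10)  // Max 10 concurrent scrapes; results arrive in completion order
    .filter_map(|result| async move { result })  // Drop items that failed validation
    .collect::<Vec<_>>()
    .await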
```

---

## 5. SolidJS Frontend

### 5.1 Build Tooling

SolidJS uses Vite natively. The migration is straightforward:

```ts
// vite.config.ts
import { defineConfig } from 'vite';
import solidPlugin from 'vite-plugin-solid';
import tailwindcss from '@tailwindcss/vite';

export default defineConfig({
  plugins: [solidPlugin(), tailwindcss()],
  server: {
    port: 3000,
    proxy: {
      '/api': 'http://localhost:8080',  // Proxy to Rust backend during dev
    },
  },
  build: {
    target: 'esnext',
  },
});
```

**package.json dependencies:**
```json
{
  "dependencies": {
    "solid-js": "^1.9",
    "@solidjs/router": "^0.15",
    "lucide-solid": "^0.450",
    "date-fns": "^4.1"
  },
  "devDependencies": {
    "vite": "^6.2",
    "vite-plugin-solid": "^2.11",
    "@tailwindcss/vite": "^4.1",
    "tailwindcss": "^4.1",
    "typescript": "^5.8"
  }
}
```

### 5.2 State Management: React to SolidJS Mapping

| React Pattern | SolidJS Equivalent | Notes |
|---|---|---|
| `useState(value)` | `createSignal(value)` | Returns `[getter, setter]` -- getter is a function call: `count()` |
| `useEffect(() => {}, [deps])` | `createEffect(() => {})` | Auto-tracks dependencies, no dep array needed |
| `useContext(Ctx)` | `useContext(Ctx)` | Nearly identical API |
| `createContext()` | `createContext()` | Same concept |
| `React.FC<Props>` | `Component<Props>` | `import { Component } from 'solid-js'` |
| `{items.map(i => ...)}` | `<For each={items()}>{(item) => ...}</For>` | SolidJS uses `<For>` for efficient list rendering |
| `{condition && <X/>}` | `<Show when={condition()}><X/></Show>` | `<Show>` avoids unnecessary DOM creation |
| `useNavigate()` | `useNavigate()` | Same API from `@solidjs/router` |
| `useParams()` | `useParams()` | Same API |
| `onSnapshot` (Firestore realtime) | `createResource` + polling or SSE | `onSnapshot` is a Firestore API rather than a React one; with the REST backend, fetch via `createResource` and refresh by polling or SSE |

### 5.3 Authentication Context Port

```tsx
// src/context/AuthContext.tsx
import { createContext, useContext, createSignal, createResource, ParentComponent } from 'solid-js';

interface User {
  id: string;
  email: string;
  display_name: string | null;
  role: string;
}

interface AuthContextType {
  user: () => User | null | undefined;
  loading: () => boolean;
  logout: () => Promise<void>;
}

const AuthContext = createContext<AuthContextType>();

async function fetchCurrentUser(): Promise<User | null> {
  const resp = await fetch('/api/v1/auth/me', {
    headers: { 'X-Requested-With': 'XMLHttpRequest' },
    credentials: 'include',
  });
  if (resp.status === 401) return null;
  if (!resp.ok) throw new Error('Failed to fetch user');
  return resp.json();
}

export const AuthProvider: ParentComponent = (props) => {
  const [user, { refetch }] = createResource(fetchCurrentUser);

  const logout = async () => {
    await fetch('/api/v1/auth/logout', {
      method: 'POST',
      headers: { 'X-Requested-With': 'XMLHttpRequest' },
      credentials: 'include',
    });
    refetch();
  };

  return (
    <AuthContext.Provider value={{
      user: () => user(),
      loading: () => user.loading,
      logout,
    }}>
      {props.children}
    </AuthContext.Provider>
  );
};

export const useAuth = () => {
  const ctx = useContext(AuthContext);
  if (!ctx) throw new Error('useAuth must be used within AuthProvider');
  return ctx;
};
```

### 5.4 Data Fetching Pattern

The current React app uses Firestore's `onSnapshot` for real-time updates. With the REST API backend, data fetching uses `createResource`:

```tsx
// src/pages/Home.tsx
import { createResource, For, Show } from 'solid-js';
import { A } from '@solidjs/router';
import { fetchApi } from '../lib/api';

async function fetchSyntheses() {
  return fetchApi<SynthesisDocument[]>('/api/v1/syntheses');
}

export default function Home() {
  const [syntheses, { refetch }] = createResource(fetchSyntheses);

  return (
    <Show when={!syntheses.loading} fallback={<Spinner />}>
      <For each={syntheses()}>
        {(synth) => (
          <A href={`/synthesis/${synth.id}`}>
            {/* card content */}
          </A>
        )}
      </For>
    </Show>
  );
}
```

### 5.5 Tailwind CSS Compatibility

Tailwind CSS v4 is framework-agnostic: the `@tailwindcss/vite` plugin scans `.tsx` files for class names regardless of framework, so existing utility classes carry over unchanged. The one mechanical change is the JSX attribute, which SolidJS idiomatically spells `class` rather than React's `className`. The `lucide-solid` package exposes the same icon set as `lucide-react`, so icon usage should port with little more than a package rename.

### 5.6 Routing

```tsx
// src/App.tsx
import { Router, Route } from '@solidjs/router';
import { AuthProvider } from './context/AuthContext';

function App() {
  return (
    <AuthProvider>
      <Router>
        <Route path="/login" component={Login} />
        <Route path="/" component={ProtectedLayout}>
          <Route path="/" component={Home} />
          <Route path="/sources" component={Sources} />
          <Route path="/settings" component={Settings} />
          <Route path="/generate" component={GenerateSynthesis} />
          <Route path="/synthesis/:id" component={SynthesisDetail} />
        </Route>
      </Router>
    </AuthProvider>
  );
}
```

The `ProtectedLayout` component checks auth and renders `<Navigate>` if not logged in -- same pattern as the current React `ProtectedRoute` but using SolidJS's `<Navigate>`.

---

## 6. Authentication System

### 6.1 Magic Link Flow

```
User                    Frontend           Backend            SMTP Server
 |                        |                   |                    |
 |-- Enter email -------->|                   |                    |
 |                        |-- POST /auth/login -->                |
 |                        |   { email, captcha_token }            |
 |                        |                   |-- verify captcha ->|
 |                        |                   |-- generate token   |
 |                        |                   |-- store hash in DB |
 |                        |                   |-- send email ------+-->
 |                        |<-- 200 "Check email" |                |
 |                        |                   |                    |
 |<---- Email arrives (link: /auth/verify?token=xxx) -------------|
 |                        |                   |                    |
 |-- Click link --------->|                   |                    |
 |                        |-- GET /auth/verify?token=xxx -->      |
 |                        |                   |-- hash token       |
 |                        |                   |-- lookup in DB     |
 |                        |                   |-- verify not expired|
 |                        |                   |-- mark as used     |
 |                        |                   |-- create/get user  |
 |                        |                   |-- create session   |
 |                        |<-- 302 redirect + Set-Cookie          |
 |<-- Redirect to / ------|                   |                    |
```

**Token generation:**
- 32 bytes of cryptographically secure random data (`rand::rngs::OsRng`)
- Base64url encoded for URL safety
- SHA-256 hash stored in DB (never store raw token)
- 15-minute expiry
- Single use (marked `used = true` after verification)
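
The expiry and single-use rules above reduce to a small check at verification time. The following is a minimal sketch, not the actual implementation: `MagicLinkRecord`, `TokenError`, and `consume_token` are hypothetical names, and timestamps are assumed to be Unix seconds.

```rust
/// Hypothetical shape of a stored magic-link row (field names are assumptions).
/// The row is looked up by the SHA-256 hash of the presented token.
#[derive(Debug, Clone, PartialEq)]
pub struct MagicLinkRecord {
    pub expires_at: u64, // Unix seconds
    pub used: bool,
}

#[derive(Debug, PartialEq)]
pub enum TokenError {
    Expired,
    AlreadyUsed,
}

/// 15-minute lifetime, as listed above.
pub const TOKEN_TTL_SECS: u64 = 15 * 60;

impl MagicLinkRecord {
    /// Creates the record stored when a login link is requested at `now`.
    pub fn issue(now: u64) -> Self {
        Self { expires_at: now + TOKEN_TTL_SECS, used: false }
    }
}

/// Accepts a token at most once and only before it expires.
/// On success the record is marked used, so a second click is rejected.
pub fn consume_token(record: &mut MagicLinkRecord, now: u64) -> Result<(), TokenError> {
    if record.used {
        return Err(TokenError::AlreadyUsed);
    }
    if now >= record.expires_at {
        return Err(TokenError::Expired);
    }
    record.used = true;
    Ok(())
}
```

Against the database, the used-check and the update would need to happen atomically, for example as a single `UPDATE ... WHERE used = 0` whose affected-row count decides the outcome, so that two concurrent clicks cannot both succeed.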

**Rate limiting on magic link requests:**
- Max 3 requests per email per 15 minutes
- Max 10 requests per IP per hour
- Prevents email bombing
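
One way to implement these limits is a sliding window per key. The sketch below is illustrative only: `RateLimiter` is a hypothetical type, it keeps state in a plain `HashMap` behind `&mut self`, and a real deployment would need shared, thread-safe state (the crate table lists `dashmap` for that purpose).

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// In-memory sliding-window limiter keyed by an arbitrary string (email or IP).
pub struct RateLimiter {
    max_requests: usize,
    window: Duration,
    hits: HashMap<String, VecDeque<Instant>>,
}

impl RateLimiter {
    pub fn new(max_requests: usize, window: Duration) -> Self {
        Self { max_requests, window, hits: HashMap::new() }
    }

    /// Returns whether a request at `now` is allowed; allowed requests are recorded.
    pub fn check(&mut self, key: &str, now: Instant) -> bool {
        let queue = self.hits.entry(key.to_owned()).or_default();
        // Drop timestamps that have fallen out of the window.
        while let Some(&oldest) = queue.front() {
            if now.duration_since(oldest) >= self.window {
                queue.pop_front();
            } else {
                break;
            }
        }
        if queue.len() >= self.max_requests {
            return false;
        }
        queue.push_back(now);
        true
    }
}
```

The two limits above would be two instances, `RateLimiter::new(3, Duration::from_secs(15 * 60))` keyed by email and `RateLimiter::new(10, Duration::from_secs(60 * 60))` keyed by IP, with a request proceeding only if both allow it.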

### 6.2 Account Registration Flow

1. User submits email + display name + captcha token.
2. Backend verifies captcha with provider.
3. Backend checks email uniqueness.
4. Backend creates user with `role = 'user'` and default settings.
5. Backend sends magic link email for initial verification.
6. User clicks link, session is created.

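An admin account can be bootstrapped via environment variable:
```
ADMIN_EMAIL=admin@example.com
```
On startup, if a user with this email exists, their role is set to `admin`. If that account is registered only after startup, the promotion takes effect at the next restart unless the registration handler applies the same check.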

### 6.3 Session Management

Sessions are stored in the `sessions` table. The session ID is a 32-byte random token (base64url-encoded, 43 characters). Session lookup is a single indexed read on the primary key.

**Session lifecycle:**
- Created on magic link verification
- Expires after 30 days (configurable)
- Refreshed (expiry extended) on each authenticated request
- Deleted on logout
- Periodic cleanup job (tokio interval) removes expired sessions
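
The refresh and cleanup rules can be sketched as two small functions. This is an assumption-laden illustration rather than the actual schema: `Session`, `touch_session`, and `purge_expired` are hypothetical names, timestamps are Unix seconds, and an in-memory `Vec` stands in for the `sessions` table.

```rust
/// Hypothetical session row; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Session {
    pub user_id: String,
    pub expires_at: u64, // Unix seconds
}

/// 30 days; the lifecycle above describes this as configurable.
pub const SESSION_TTL_SECS: u64 = 30 * 24 * 60 * 60;

/// Returns the user ID for a live session and extends its expiry.
/// Returns `None` for an expired session, which the caller treats as 401.
pub fn touch_session(session: &mut Session, now: u64) -> Option<&str> {
    if now >= session.expires_at {
        return None;
    }
    session.expires_at = now + SESSION_TTL_SECS;
    Some(session.user_id.as_str())
}

/// What the periodic cleanup job does, shown against an in-memory list.
/// Returns the number of sessions removed.
pub fn purge_expired(sessions: &mut Vec<Session>, now: u64) -> usize {
    let before = sessions.len();
    sessions.retain(|s| s.expires_at > now);
    before - sessions.len()
}
```

Extending the expiry on every request means one write per authenticated request. If that becomes a concern, a common variation is to refresh only when the remaining lifetime drops below some threshold.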

### 6.4 Captcha Integration

**Recommendation: Cloudflare Turnstile.**

| Option | Self-hostable | Privacy | Free tier |
|---|---|---|---|
| hCaptcha | No (SaaS) | Better than reCAPTCHA | Yes (unlimited) |
| Cloudflare Turnstile | No (SaaS) | Excellent (often invisible) | Yes (unlimited) |
| mCaptcha | Yes (open source) | Full control | N/A (self-hosted) |

None of the mainstream captcha services are fully self-hostable. **Cloudflare Turnstile** is recommended for its invisible challenge mode (better UX) and generous free tier. If strict self-hosting is required, **mCaptcha** (Rust-based, open source) is the only viable option, though it requires running a separate service.

Backend verification is simple:
```rust
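// Only the `success` field is needed; serde ignores the other response fields.
#[derive(serde::Deserialize)]
struct TurnstileResponse {
    success: bool,
}

// Assumes `AppError: From<reqwest::Error>` so `?` can convert transport and decode errors.
pub async fn verify_captcha(client: &reqwest::Client, token: &str, secret: &str) -> Result<bool, AppError> {
    let resp = client
        .post("https://challenges.cloudflare.com/turnstile/v0/siteverify")
        .form(&[("secret", secret), ("response", token)])
        .send()
        .await?;
    let result: TurnstileResponse = resp.json().await?;
    Ok(result.success)
}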
```

---

## 7. Docker Deployment

### 7.1 Multi-Stage Dockerfile

```dockerfile
# ===== Stage 1: Build Rust backend =====
FROM rust:1.85-bookworm AS backend-builder

WORKDIR /app
COPY Cargo.toml Cargo.lock ./
COPY src/ src/
COPY migrations/ migrations/

# Create a dummy SQLite DB for sqlx compile-time checks
ENV DATABASE_URL="sqlite:///tmp/build.db"
RUN cargo install sqlx-cli --no-default-features --features sqlite \
    && sqlx database create \
    && sqlx migrate run

RUN cargo build --release

# ===== Stage 2: Build SolidJS frontend =====
FROM node:22-alpine AS frontend-builder

WORKDIR /app/frontend
COPY frontend/package.json frontend/package-lock.json ./
RUN npm ci

COPY frontend/ ./
RUN npm run build

# ===== Stage 3: Minimal runtime =====
FROM debian:bookworm-slim AS runtime

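# curl is needed by the docker-compose healthcheck; it is not part of bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    curl \
    libssl3 \
    && rm -rf /var/lib/apt/lists/*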

RUN useradd -ms /bin/bash appuser

WORKDIR /app

# Copy backend binary
COPY --from=backend-builder /app/target/release/ai-synth-backend .
# Copy migrations for runtime migration
COPY --from=backend-builder /app/migrations/ migrations/
# Copy frontend static files
COPY --from=frontend-builder /app/frontend/dist/ static/

# Create data directory for SQLite
RUN mkdir -p /app/data && chown appuser:appuser /app/data

USER appuser

ENV DATABASE_URL="sqlite:///app/data/ai_synth.db"
ENV STATIC_DIR="/app/static"
ENV PORT=8080

EXPOSE 8080

# The binary applies pending migrations at startup, then starts the server
CMD ["./ai-synth-backend"]
```

The Rust backend serves the static SolidJS files directly (via `tower_http::services::ServeDir`), eliminating the need for a separate nginx container. `/api/*` routes go to handlers; any other path is served from the static directory if a matching file exists and otherwise falls back to `index.html` (SPA fallback).
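
The routing order can be made explicit with a small decision function. This is a sketch for illustration only: `Target` and `resolve` are hypothetical names, and `static_files` stands in for the contents of `STATIC_DIR`.

```rust
/// Where a request ends up. Variant names are illustrative.
#[derive(Debug, PartialEq)]
pub enum Target<'a> {
    ApiRouter,
    StaticFile(&'a str),
    SpaIndex,
}

/// Mirrors the routing order described above for a given request path.
pub fn resolve<'a>(path: &'a str, static_files: &[&str]) -> Target<'a> {
    if path == "/api" || path.starts_with("/api/") {
        // Unknown API paths should 404 inside the API router,
        // not fall through to the SPA.
        Target::ApiRouter
    } else if static_files.contains(&path) {
        Target::StaticFile(path)
    } else {
        // Client-side routes such as /synthesis/42 have no file on disk.
        Target::SpaIndex
    }
}
```

In the real server this logic would not be written by hand. It is typically expressed by nesting the API router under `/api` and using `ServeDir` with a fallback to a `ServeFile` for `index.html` as the router's fallback service.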

### 7.2 docker-compose.yml

```yaml
version: "3.9"

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: ai-synth
    restart: unless-stopped
    ports:
      - "${PORT:-8080}:8080"
    volumes:
      - ai_synth_data:/app/data        # SQLite persistence
    environment:
      - DATABASE_URL=sqlite:///app/data/ai_synth.db
      - PORT=8080
      - ADMIN_EMAIL=${ADMIN_EMAIL}
      - SESSION_SECRET=${SESSION_SECRET}     # 64 random bytes, hex-encoded (128 chars), for cookie signing
      - SMTP_HOST=${SMTP_HOST}
      - SMTP_PORT=${SMTP_PORT:-587}
      - SMTP_USER=${SMTP_USER}
      - SMTP_PASSWORD=${SMTP_PASSWORD}
      - SMTP_FROM=${SMTP_FROM}
      - CAPTCHA_SECRET=${CAPTCHA_SECRET}
      - CAPTCHA_SITE_KEY=${CAPTCHA_SITE_KEY}
      - ENCRYPTION_KEY=${ENCRYPTION_KEY}     # 32 random bytes, hex-encoded (64 chars), for API key encryption
      - RUST_LOG=info
      - BASE_URL=${BASE_URL}                 # Public origin used in magic link URLs
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/api/v1/health"]
      interval: 30s
      timeout: 5s
      retries: 3

  # Optional: Mailpit for local development (SMTP catch-all)
  mailpit:
    image: axllent/mailpit
    container_name: ai-synth-mail
    restart: unless-stopped
    ports:
      - "8025:8025"   # Web UI
      - "1025:1025"   # SMTP
    profiles:
      - dev

volumes:
  ai_synth_data:
    driver: local
```

### 7.3 Volume Mounts for SQLite

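The SQLite database file is stored in a Docker named volume (`ai_synth_data`). This means:
- Data persists across container restarts and rebuilds
- The volume can be backed up via `docker cp` or volume backup tools. Stop the container first, or use SQLite's online backup (for example `VACUUM INTO`), because a live WAL-mode database keeps recent writes in a separate `-wal` file and a plain file copy can be inconsistent

The database runs in WAL mode for concurrent read/write performance.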

**Important SQLite configuration for production:**
```rust
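use std::{str::FromStr, time::Duration};
use sqlx::sqlite::{SqliteConnectOptions, SqliteJournalMode, SqlitePoolOptions, SqliteSynchronous};

// Connect options are applied to every connection the pool opens.
let options = SqliteConnectOptions::from_str(&database_url)?
    .create_if_missing(true)                 // the volume is empty on first start
    .journal_mode(SqliteJournalMode::Wal)
    .synchronous(SqliteSynchronous::Normal)
    .foreign_keys(true)
    .busy_timeout(Duration::from_secs(5));

let pool = SqlitePoolOptions::new()
    .max_connections(5)         // SQLite handles limited concurrency
    .connect_with(options)
    .await?;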
```

### 7.4 Environment Variable Configuration

A `.env.example` file documents all required and optional variables:

```env
# === Required ===
DATABASE_URL=sqlite:///app/data/ai_synth.db
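# 64 random bytes, hex-encoded (128 characters), e.g. `openssl rand -hex 64`
SESSION_SECRET=<64-byte-hex-string>
# 32 random bytes, hex-encoded (64 characters), e.g. `openssl rand -hex 32`
ENCRYPTION_KEY=<32-byte-hex-string>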
ADMIN_EMAIL=admin@example.com

# === SMTP (required for magic link auth) ===
SMTP_HOST=smtp.example.com
SMTP_PORT=587
SMTP_USER=user@example.com
SMTP_PASSWORD=password
SMTP_FROM=noreply@example.com

# === Captcha ===
CAPTCHA_SECRET=<turnstile-secret-key>
CAPTCHA_SITE_KEY=<turnstile-site-key>

# === Optional ===
PORT=8080
RUST_LOG=info
# Public origin used when building magic link URLs
BASE_URL=https://your-domain.com
```
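
Because both secrets are hex-encoded byte strings of a fixed length, it helps to validate them at startup rather than at first use. A minimal sketch using only the standard library follows; `parse_hex_key` is a hypothetical helper, not part of any crate listed in this document.

```rust
/// Decodes a hex string into exactly `N` bytes
/// (N = 32 for ENCRYPTION_KEY, N = 64 for SESSION_SECRET).
pub fn parse_hex_key<const N: usize>(hex: &str) -> Result<[u8; N], String> {
    let hex = hex.trim();
    if hex.len() != N * 2 {
        return Err(format!("expected {} hex characters, got {}", N * 2, hex.len()));
    }
    // Reject anything that is not 0-9, a-f, or A-F before decoding.
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("key contains non-hex characters".to_string());
    }
    let mut out = [0u8; N];
    for (i, pair) in hex.as_bytes().chunks(2).enumerate() {
        // Safe to treat as UTF-8: every byte was checked to be an ASCII hex digit.
        let text = std::str::from_utf8(pair).map_err(|e| e.to_string())?;
        out[i] = u8::from_str_radix(text, 16).map_err(|e| e.to_string())?;
    }
    Ok(out)
}
```

At startup this could be called as `parse_hex_key::<32>(&std::env::var("ENCRYPTION_KEY")?)`, failing fast with a readable message when the variable is missing or has the wrong length.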

---

## 8. Migration from Firebase

### 8.1 Data Migration Strategy

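The migration runs in three steps: a one-shot export script using the `firebase-admin` SDK, a standalone Rust CLI tool that transforms the export and imports it into SQLite, and a notification to existing users: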

**Step 1: Export Firestore data**

Use `firebase-admin` SDK (Python or Node.js is simplest for this one-shot task):

```python
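# migrate_export.py
import json

import firebase_admin
from firebase_admin import auth, credentials, firestore

cred = credentials.Certificate("service-account.json")
firebase_admin.initialize_app(cred)
db = firestore.client()

# Users live in Firebase Auth, not in Firestore.
data = {
    "users": [
        {"uid": u.uid, "email": u.email, "display_name": u.display_name}
        for u in auth.list_users().iterate_all()
    ],
}

# Export each Firestore collection, keeping the document ID next to the fields.
for name in ("syntheses", "sources", "settings"):
    data[name] = []
    for doc in db.collection(name).stream():
        d = doc.to_dict()
        d["_id"] = doc.id
        data[name].append(d)


def to_json(value):
    # Firestore timestamps are datetime subclasses; emit them as ISO 8601.
    return value.isoformat() if hasattr(value, "isoformat") else str(value)


with open("firebase_export.json", "w") as f:
    json.dump(data, f, default=to_json)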
```

**Step 2: Transform and import into SQLite**

A Rust CLI tool reads the JSON export and inserts into SQLite:

```
cargo run --bin migrate -- --input firebase_export.json --db ai_synth.db
```

Key transformations:
- `authorUid` / `userId` from Firebase Auth UID -> new UUID in `users` table (mapping table maintained during migration)
- Firebase `Timestamp` -> ISO 8601 string
- Legacy `SynthesisData` fields (`majorAnnouncements`, `financialSector`, etc.) -> normalized `sections[]` JSON
- Settings doc ID (was `{userId}` in Firestore) -> `user_id` foreign key
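
The UID mapping in the first transformation must be stable across collections, so that a synthesis, a source, and a settings document owned by the same Firebase user all end up with the same new `user_id`. A minimal sketch of such a mapping table follows; `UidMap` is a hypothetical type, and ID generation is injected as a closure so the example needs only the standard library.

```rust
use std::collections::HashMap;

/// Maps Firebase Auth UIDs to newly generated user IDs, assigning each UID once.
pub struct UidMap<F: FnMut() -> String> {
    new_id: F,
    by_firebase_uid: HashMap<String, String>,
}

impl<F: FnMut() -> String> UidMap<F> {
    pub fn new(new_id: F) -> Self {
        Self { new_id, by_firebase_uid: HashMap::new() }
    }

    /// Returns the new ID for `firebase_uid`, generating one on first sight.
    pub fn resolve(&mut self, firebase_uid: &str) -> &str {
        if !self.by_firebase_uid.contains_key(firebase_uid) {
            let id = (self.new_id)();
            self.by_firebase_uid.insert(firebase_uid.to_owned(), id);
        }
        &self.by_firebase_uid[firebase_uid]
    }

    /// Number of distinct users seen so far.
    pub fn len(&self) -> usize {
        self.by_firebase_uid.len()
    }
}
```

In the migration tool the closure would presumably be `|| uuid::Uuid::new_v4().to_string()`, using the `uuid` crate from the dependency table. A UID that appears in a document but not in the exported users list is worth logging, since that row would otherwise be imported with no email to log in with.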

**Step 3: User notification**

Since authentication changes from Google SSO to email+magic link, existing users need to be notified that they must use the magic link flow. Their email addresses (from Firebase Auth) are imported into the `users` table. On first magic link login, the backend matches the imported `users` row by email, so the user's migrated data is available immediately.

### 8.2 Mapping Firestore Security Rules to Rust

The Firestore rules enforce three categories of protection (authentication, ownership, and validation) that map to backend patterns:

| Firestore Rule | Rust Equivalent |
|---|---|
| `isAuthenticated()` | Auth middleware layer (rejects with 401 if there is no valid session) |
| `isDocOwner()` / `request.auth.uid == resource.data.authorUid` | Query-level filtering: `WHERE user_id = $1` with the authenticated user's ID |
| `isValidSynthesis()` / `isValidSettings()` / `isValidSource()` | Request validation using `validator` crate or manual checks in handlers |
| `uidUnchanged()` / `uidNotModified()` | Not applicable -- `user_id` is never in the request body; it is injected server-side from the session |
| `request.resource.data.createdAt == resource.data.createdAt` | `created_at` is set server-side and never updatable via API |
| Field type checks (string, number, timestamp) | Serde deserialization + custom validators |
| Size limits (e.g., `title.size() < 500`) | Validator annotations: `#[validate(length(max = 500))]` |

**Example validation in Rust:**

```rust
use serde::Deserialize;
use validator::Validate;

#[derive(Deserialize, Validate)]
pub struct CreateSourceRequest {
    #[validate(length(min = 1, max = 200))]
    pub title: String,

    #[validate(url, length(max = 1000))]
    pub url: String,
}
```

The key architectural difference: in Firestore, rules are the *only* security layer (the client has direct DB access). In the Rust backend, security is enforced at the handler level (authentication middleware + query scoping + input validation). The database is never directly accessible from the client.

**Ownership enforcement pattern:**

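Every query that reads or mutates user data includes `WHERE user_id = ?` with the authenticated user's ID. The user ID always comes from the session, never from the request, so a handler that follows this pattern cannot reach another user's data. The guarantee holds only as long as every user-data query carries the filter, which is why all access should go through the `db/` functions rather than ad hoc queries in handlers.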

```rust
// db/syntheses.rs
pub async fn get_by_id(pool: &SqlitePool, user_id: &str, synthesis_id: &str) -> Result<Option<Synthesis>, sqlx::Error> {
    sqlx::query_as!(
        Synthesis,
        "SELECT * FROM syntheses WHERE id = ? AND user_id = ?",
        synthesis_id,
        user_id
    )
    .fetch_optional(pool)
    .await
}
```

If the synthesis belongs to another user, this returns `None`, and the handler returns 404 rather than 403, so the response does not reveal whether the ID exists. Update and delete queries are scoped the same way.

---

## Summary of Key Crate Dependencies

| Purpose | Crate | Version Guidance |
|---|---|---|
| Web framework | `axum` | ^0.8 |
| Async runtime | `tokio` | ^1 (full features) |
| Database | `sqlx` | ^0.8 (features: sqlite, runtime-tokio) |
| HTTP client | `reqwest` | ^0.12 (features: json, cookies) |
| HTML parsing | `scraper` | ^0.22 |
| Serialization | `serde`, `serde_json` | ^1 |
| Date/time | `chrono` | ^0.4 |
| Token hashing | `sha2` | ^0.10 |
| Random tokens | `rand` | ^0.8 |
| SMTP | `lettre` | ^0.11 |
| Logging | `tracing`, `tracing-subscriber` | ^0.1 / ^0.3 |
| Config | `dotenvy` | ^0.15 |
| Validation | `validator` | ^0.19 |
| Concurrent map | `dashmap` | ^6 |
| Static file serving | `tower-http` | ^0.6 (features: fs, cors, trace) |
| Cookie handling | `axum-extra` | ^0.10 (features: cookie) |
| Encryption (API keys) | `aes-gcm` | ^0.10 |
| Base64 | `base64` | ^0.22 |
| UUID | `uuid` | ^1 (features: v4) |
| Error handling | `anyhow`, `thiserror` | ^1 |

---

## Architecture Diagram (Text)

```
                                   ┌─────────────────────┐
                                   │   Docker Container   │
                                   │                     │
  Browser ◄──── HTTPS ────►  ┌─────┴─────────────────┐   │
  (SolidJS SPA)               │    Axum Web Server     │   │
                              │                       │   │
                              │  /static/* ──► ServeDir│   │
                              │  /api/v1/* ──► Router  │   │
                              │                       │   │
                              │  ┌─ Auth Middleware ─┐ │   │
                              │  │  Session Cookie   │ │   │
                              │  │  CSRF Check       │ │   │
                              │  └───────────────────┘ │   │
                              │                       │   │
                              │  ┌─ Handlers ────────┐ │   │
                              │  │ auth, syntheses,  │ │   │
                              │  │ sources, settings,│ │   │
                              │  │ admin, email      │ │   │
                              │  └────────┬──────────┘ │   │
                              │           │            │   │
                              │  ┌─ Services ────────┐ │   │
                              │  │ LLM providers     │─┼───┼──► Gemini API
                              │  │ (trait-based)     │─┼───┼──► OpenAI API
                              │  │                   │─┼───┼──► Anthropic API
                              │  │ Scraper (reqwest) │─┼───┼──► Target URLs
                              │  │ Email (lettre)    │─┼───┼──► SMTP Server
                              │  │ Captcha           │─┼───┼──► Turnstile API
                              │  └────────┬──────────┘ │   │
                              │           │            │   │
                              │  ┌─ DB Layer (sqlx) ─┐ │   │
                              │  │  SQLite (WAL)     │ │   │
                              │  └───────────────────┘ │   │
                              └───────────┬────────────┘   │
                                          │                │
                              ┌───────────▼────────────┐   │
                              │   /app/data/            │   │
                              │   ai_synth.db           │   │
                              │   (Docker volume)       │   │
                              └─────────────────────────┘   │
                                   └─────────────────────┘
```