# Infrastructure & Testing Plan: AI Weekly Synth Rewrite

**Role**: Infrastructure & Testing Planner

**Date**: 2026-03-21

**Inputs**: Team analysis (00-04), project decisions (05), CLAUDE.md

---

## Table of Contents

1. [Docker Setup](#1-docker-setup)
2. [Environment Configuration](#2-environment-configuration)
3. [Database Management](#3-database-management)
4. [CLI Commands](#4-cli-commands)
5. [Reverse Proxy & TLS](#5-reverse-proxy--tls)
6. [Testing Strategy](#6-testing-strategy)
7. [CI/CD Pipeline](#7-cicd-pipeline)
8. [Monitoring & Observability](#8-monitoring--observability)
9. [Documentation](#9-documentation)
10. [Security Hardening Checklist](#10-security-hardening-checklist)

---

## 1. Docker Setup

### 1.1 Dockerfile (Production Multi-Stage Build)

The build uses three stages: Rust backend compilation with `cargo-chef` for layer caching, SolidJS frontend build, and a minimal Debian slim runtime.

```dockerfile
# ===================================================================
# Stage 0: cargo-chef planner (captures dependency info for caching)
# ===================================================================
FROM rust:1.85-bookworm AS chef
RUN cargo install cargo-chef --locked
WORKDIR /app

# ===================================================================
# Stage 1a: Prepare the recipe (dependency snapshot)
# ===================================================================
FROM chef AS planner
COPY Cargo.toml Cargo.lock ./
COPY src/ src/
COPY migrations/ migrations/
RUN cargo chef prepare --recipe-path recipe.json

# ===================================================================
# Stage 1b: Build Rust backend dependencies (cached layer)
# ===================================================================
FROM chef AS backend-deps
COPY --from=planner /app/recipe.json recipe.json
# Build only dependencies - this layer is cached unless Cargo.toml/lock changes
RUN cargo chef cook --release --recipe-path recipe.json

# ===================================================================
# Stage 1c: Build the actual Rust backend
# ===================================================================
FROM backend-deps AS backend-builder
COPY Cargo.toml Cargo.lock ./
COPY src/ src/
COPY migrations/ migrations/
# sqlx offline mode: use pre-generated query metadata (no live DB needed)
ENV SQLX_OFFLINE=true
COPY sqlx-data.json ./
RUN cargo build --release

# ===================================================================
# Stage 2: Build SolidJS frontend
# ===================================================================
FROM node:22-alpine AS frontend-builder
WORKDIR /app/frontend
COPY frontend/package.json frontend/package-lock.json ./
RUN npm ci --ignore-scripts
COPY frontend/ ./
RUN npm run build

# ===================================================================
# Stage 3: Minimal runtime image
# ===================================================================
FROM debian:bookworm-slim AS runtime

# Install only what the binary needs at runtime
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        ca-certificates \
        libssl3 \
        curl \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN groupadd --system appuser \
    && useradd --system --gid appuser --home-dir /app --no-create-home --shell /usr/sbin/nologin appuser

WORKDIR /app

# Copy backend binary
COPY --from=backend-builder /app/target/release/ai-weekly-synth ./ai-weekly-synth

# Copy migrations (run at startup)
COPY --from=backend-builder /app/migrations/ ./migrations/

# Copy frontend static files
COPY --from=frontend-builder /app/frontend/dist/ ./static/

# Set ownership
RUN chown -R appuser:appuser /app

USER appuser

ENV PORT=8080
EXPOSE 8080

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD curl -f http://localhost:8080/api/v1/health || exit 1

ENTRYPOINT ["./ai-weekly-synth"]
CMD ["serve"]
```

**Key design decisions:**

- **cargo-chef**: Separates dependency compilation from
source compilation. Changing a `.rs` file re-runs only the final `cargo build` step; the dependency layer stays cached. This typically saves 3-8 minutes per build.
- **SQLX_OFFLINE=true**: Uses pre-generated `sqlx-data.json` so the build does not need a live database connection. Developers generate this file locally with `cargo sqlx prepare`.
- **curl in runtime**: Required for the `HEALTHCHECK` instruction. Adds approximately 3 MB to the image.
- **Non-root user**: `appuser` has no shell, no home directory, and no login capability.
- **ENTRYPOINT + CMD split**: Allows running CLI subcommands (e.g., `docker run ai-weekly-synth create-admin admin@example.com`) by overriding CMD.

### 1.2 .dockerignore

```
target/
node_modules/
.git/
.env
.env.*
!.env.example
*.md
docs/
tests/
frontend/node_modules/
frontend/dist/
**/*.swp
**/*.swo
.DS_Store
```

### 1.3 docker-compose.yml (Production)

```yaml
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: ai-synth
    restart: unless-stopped
    env_file: .env
    environment:
      - DATABASE_URL=postgres://ai_synth:${POSTGRES_PASSWORD}@db:5432/ai_synth
    depends_on:
      db:
        condition: service_healthy
    networks:
      - internal
    # Port is NOT exposed to host; Caddy handles external traffic
    expose:
      - "8080"
    read_only: true
    tmpfs:
      - /tmp:noexec,nosuid,size=64M
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/api/v1/health"]
      interval: 30s
      timeout: 5s
      start_period: 15s
      retries: 3

  db:
    image: postgres:17-alpine
    container_name: ai-synth-db
    restart: unless-stopped
    environment:
      POSTGRES_USER: ai_synth
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ai_synth
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - internal
    # NOT exposed to host in production
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ai_synth -d ai_synth"]
      interval: 10s
      timeout: 5s
      start_period: 10s
      retries: 5
    shm_size: 128mb

  caddy:
    image: caddy:2-alpine
    container_name: ai-synth-caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp" # HTTP/3
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config
    depends_on:
      app:
        condition: service_healthy
    networks:
      - internal

networks:
  internal:
    driver: bridge

volumes:
  postgres_data:
    driver: local
  caddy_data:
    driver: local
  caddy_config:
    driver: local
```

**Key design decisions:**

- **Internal network only**: The app and database communicate on an internal Docker bridge network. Only Caddy exposes ports 80/443 to the host.
- **read_only + tmpfs**: The app container's filesystem is read-only. `/tmp` is mounted as tmpfs for any transient file needs.
- **Postgres health check**: The app service waits for Postgres to be ready before starting (`condition: service_healthy`).
- **shm_size**: Postgres uses shared memory; 128 MB is sufficient for this workload.
- **No `version` key**: Docker Compose v2+ no longer requires the `version` field.

### 1.4 docker-compose.dev.yml (Development Override)

Run with: `docker compose -f docker-compose.yml -f docker-compose.dev.yml up`

```yaml
services:
  app:
    build:
      target: backend-deps # Stop at the deps stage; source is mounted
    # Assumes cargo-watch is available in the image. The production Dockerfile does
    # not install it, so it would need to be added to the chef stage
    # (e.g. `cargo install cargo-watch --locked`).
    command: ["cargo", "watch", "-x", "run -- serve"]
    volumes:
      - ./Cargo.toml:/app/Cargo.toml
      - ./Cargo.lock:/app/Cargo.lock
      - ./src:/app/src
      - ./migrations:/app/migrations
      - cargo_cache:/usr/local/cargo/registry
      - target_cache:/app/target
    environment:
      - DATABASE_URL=postgres://ai_synth:devpassword@db:5432/ai_synth
      - RUST_LOG=debug,ai_weekly_synth=trace
      - RUST_BACKTRACE=1
      - APP_URL=http://localhost:3000
    read_only: false
    # Compose appends lists when merging files, so a plain empty list would not
    # clear the production values; the `!reset` tag (recent Compose v2 releases) does.
    security_opt: !reset []
    cap_drop: !reset []
    ports:
      - "8080:8080" # Expose API directly for debugging

  frontend:
    image: node:22-alpine
    container_name: ai-synth-frontend-dev
    working_dir: /app/frontend
    command: ["npm", "run", "dev", "--", "--host", "0.0.0.0"]
    volumes:
      - ./frontend:/app/frontend
      - frontend_node_modules:/app/frontend/node_modules
    ports:
      - "3000:3000"
    environment:
      - VITE_API_URL=http://localhost:8080
    networks:
      - internal

  db:
    ports:
      - "5432:5432" # Expose Postgres for direct access with psql/pgAdmin
    environment:
      POSTGRES_PASSWORD: devpassword

  mailpit:
    image: axllent/mailpit:latest
    container_name: ai-synth-mailpit
    restart: unless-stopped
    ports:
      - "8025:8025" # Mailpit web UI
      - "1025:1025" # SMTP
    networks:
      - internal

  caddy:
    # Disable Caddy in dev; frontend dev server handles its own port
    profiles:
      - production-only

volumes:
  cargo_cache:
    driver: local
  target_cache:
    driver: local
  frontend_node_modules:
    driver: local
```

**Development workflow:**

- **Backend**: `cargo-watch` recompiles and restarts the Rust server on any file change in `src/`. The cargo registry and target directory are cached in named volumes so rebuilds are fast.
- **Frontend**: Vite dev server runs separately on port 3000 with HMR. API requests are proxied to `localhost:8080` via the Vite config's `proxy` setting.
- **Database**: Postgres port 5432 is exposed to the host so developers can connect with `psql`, pgAdmin, or DBeaver.
- **Email**: Mailpit catches all outgoing emails on SMTP port 1025 and provides a web UI at `http://localhost:8025` for inspecting magic link emails.
- **Caddy**: Disabled in development (no TLS needed locally).

### 1.5 Simplified Development Without Docker

For developers who prefer running outside Docker, a minimal setup is also possible:

```bash
# Terminal 1: Postgres (still in Docker for convenience)
docker run -d --name ai-synth-pg \
  -e POSTGRES_USER=ai_synth \
  -e POSTGRES_PASSWORD=devpassword \
  -e POSTGRES_DB=ai_synth \
  -p 5432:5432 \
  postgres:17-alpine

# Terminal 2: Rust backend with cargo-watch
cargo install cargo-watch sqlx-cli
export DATABASE_URL="postgres://ai_synth:devpassword@localhost:5432/ai_synth"
sqlx database create && sqlx migrate run
cargo watch -x 'run -- serve'

# Terminal 3: Frontend
cd frontend && npm install && npm run dev

# Terminal 4: Mailpit
docker run -d --name ai-synth-mailpit -p 8025:8025 -p 1025:1025 axllent/mailpit
```

---

## 2. Environment Configuration

### 2.1 Complete Environment Variable Reference

| Variable | Description | Required | Default | Example |
|---|---|---|---|---|
| **Database** | | | | |
| `DATABASE_URL` | Postgres connection string | Yes | -- | `postgres://ai_synth:s3cret@db:5432/ai_synth` |
| **Security** | | | | |
| `MASTER_ENCRYPTION_KEY` | 256-bit hex key for encrypting user API keys at rest (AES-256-GCM) | Yes | -- | `a1b2c3...` (64 hex chars) |
| `SESSION_SECRET` | Secret for signing session cookies (HMAC) | Yes | -- | `e4f5a6...` (64 hex chars) |
| **Application** | | | | |
| `APP_URL` | Public URL of the app (used in magic link emails, CORS origin) | Yes | -- | `https://synth.example.com` |
| `PORT` | HTTP port the Rust server listens on | No | `8080` | `8080` |
| `RUST_LOG` | Logging level filter (tracing-subscriber) | No | `info` | `info,ai_weekly_synth=debug` |
| `STATIC_DIR` | Path to the built SolidJS frontend files | No | `./static` | `/app/static` |
| **Email (Resend)** | | | | |
| `RESEND_API_KEY` | Resend API key for transactional email (magic links, synthesis delivery) | Yes | -- | `re_abc123...` |
| `EMAIL_FROM` | Sender address for outgoing emails | Yes | -- | `AI Weekly Synth ` |
| **Captcha (Cloudflare Turnstile)** | | | | |
| `TURNSTILE_SECRET_KEY` | Turnstile server-side secret key | Yes | -- | `0x4AAAAAAA...` |
| `TURNSTILE_SITE_KEY` | Turnstile client-side site key (passed to frontend) | Yes | -- | `0x4BBBBBB...` |
| **Postgres (docker-compose only)** | | | | |
| `POSTGRES_PASSWORD` | Password for the Postgres `ai_synth` user | Yes | -- | `strong-random-password` |

### 2.2 .env.example

```env
# ==============================================================================
# AI Weekly Synth - Environment Configuration
# ==============================================================================
# Copy this file to .env and fill in the values.
# NEVER commit .env to version control.
# ==============================================================================

# --- Database ---
# Connection string for Postgres. In docker-compose, the hostname is "db".
DATABASE_URL=postgres://ai_synth:CHANGE_ME@db:5432/ai_synth
POSTGRES_PASSWORD=CHANGE_ME

# --- Security ---
# 256-bit key for encrypting user LLM API keys at rest (64 hex characters).
# Generate with: openssl rand -hex 32
MASTER_ENCRYPTION_KEY=

# Secret for signing session cookies (64 hex characters).
# Generate with: openssl rand -hex 32
SESSION_SECRET=

# --- Application ---
# Public URL where the app is accessible (no trailing slash).
# Used for magic link URLs, CORS origin, and cookie domain.
APP_URL=https://synth.example.com

# Port for the backend HTTP server (inside the container).
PORT=8080

# Logging level. Options: error, warn, info, debug, trace.
# Format: "level" or "level,crate=level"
RUST_LOG=info,ai_weekly_synth=debug

# --- Email (Resend) ---
# Sign up at https://resend.com and create an API key.
RESEND_API_KEY=re_CHANGE_ME

# Sender address. Must be a verified domain in Resend.
EMAIL_FROM=AI Weekly Synth 

# --- Captcha (Cloudflare Turnstile) ---
# Sign up at https://dash.cloudflare.com/turnstile
TURNSTILE_SECRET_KEY=0x4AAAAAAA_CHANGE_ME
TURNSTILE_SITE_KEY=0x4BBBBBB_CHANGE_ME
```

### 2.3 Configuration Loading in Rust

Use a typed configuration struct with `dotenvy` for `.env` file loading and `serde` for deserialization from environment variables via the `envy` crate.
```rust
// src/config.rs
// Note: deriving `Deserialize` on `SecretString` fields requires the `serde`
// feature of the `secrecy` crate.
use secrecy::{ExposeSecret, SecretString};
use serde::Deserialize;

#[derive(Debug, Deserialize, Clone)]
pub struct AppConfig {
    // Database
    pub database_url: SecretString,

    // Security
    pub master_encryption_key: SecretString,
    pub session_secret: SecretString,

    // Application
    pub app_url: String,
    #[serde(default = "default_port")]
    pub port: u16,
    #[serde(default = "default_static_dir")]
    pub static_dir: String,

    // Email (Resend)
    pub resend_api_key: SecretString,
    pub email_from: String,

    // Captcha (Turnstile)
    pub turnstile_secret_key: SecretString,
    pub turnstile_site_key: String,
}

fn default_port() -> u16 {
    8080
}

fn default_static_dir() -> String {
    "./static".to_string()
}

impl AppConfig {
    pub fn from_env() -> Result<Self, envy::Error> {
        dotenvy::dotenv().ok(); // Load .env file if present (not an error if missing)
        envy::from_env::<Self>()
    }

    /// Validate that secrets meet minimum requirements
    pub fn validate(&self) -> Result<(), String> {
        let key = self.master_encryption_key.expose_secret();
        if key.len() != 64 || !key.chars().all(|c| c.is_ascii_hexdigit()) {
            return Err("MASTER_ENCRYPTION_KEY must be exactly 64 hex characters".into());
        }

        let secret = self.session_secret.expose_secret();
        if secret.len() < 64 {
            return Err("SESSION_SECRET must be at least 64 characters".into());
        }

        if !self.app_url.starts_with("http://") && !self.app_url.starts_with("https://") {
            return Err("APP_URL must start with http:// or https://".into());
        }

        Ok(())
    }
}
```

**Design notes:**

- `SecretString` from the `secrecy` crate redacts its `Debug` output and does not implement `Display`, which prevents accidental logging of secrets. The secret is only accessible via `.expose_secret()`.
- `dotenvy::dotenv().ok()` loads `.env` if present but does not fail if the file is missing (environment variables can be injected by Docker).
- `validate()` is called at startup. The app refuses to start with invalid configuration rather than failing at runtime.

---

## 3. Database Management

### 3.1 Migration Tooling

**Tool**: `sqlx-cli` (the official SQLx command-line tool).

```bash
# Install sqlx-cli with Postgres support
cargo install sqlx-cli --no-default-features --features postgres

# Create the database (if it does not exist)
sqlx database create

# Run all pending migrations
sqlx migrate run

# Revert the last migration
sqlx migrate revert

# Check migration status
sqlx migrate info

# Prepare offline query data (for Docker builds without a live DB)
cargo sqlx prepare
```

### 3.2 Migration Execution Strategy

**Migrations run at application startup** before the HTTP server begins accepting requests.

```rust
// In main.rs, during initialization
use sqlx::PgPool;

async fn run_migrations(pool: &PgPool) -> Result<(), sqlx::Error> {
    tracing::info!("Running database migrations...");
    sqlx::migrate!("./migrations")
        .run(pool)
        .await?;
    tracing::info!("Migrations complete.");
    Ok(())
}
```

**Rationale**: For a single-instance self-hosted application, running migrations at startup is simple and reliable. There is no risk of multiple instances racing to apply migrations. If a migration fails, the application exits with an error before serving traffic.

**Safeguard**: The `sqlx::migrate!` macro embeds migration files at compile time. If migrations are missing or corrupted, the build fails.

### 3.3 Migration Files (Postgres)

All migrations live in `migrations/` with the naming convention `YYYYMMDDHHMMSS_description.sql`.
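The naming convention above can be checked mechanically, for example in a CI step or a unit test. The following is a minimal sketch using only the standard library; `is_valid_migration_name` is a hypothetical helper, not part of `sqlx`, and it deliberately enforces this document's convention (14-digit timestamp, lowercase snake_case description), which is an assumption that may be stricter than what the migration tool itself accepts.

```rust
/// Returns true when `name` looks like `YYYYMMDDHHMMSS_description.sql`.
/// Hypothetical helper for illustration; not part of sqlx.
pub fn is_valid_migration_name(name: &str) -> bool {
    // The file must carry the `.sql` extension.
    let Some(stem) = name.strip_suffix(".sql") else {
        return false;
    };
    // Split the timestamp from the description at the first underscore.
    let Some((version, description)) = stem.split_once('_') else {
        return false;
    };
    // Timestamp: exactly 14 ASCII digits. Description: non-empty snake_case.
    version.len() == 14
        && version.bytes().all(|b| b.is_ascii_digit())
        && !description.is_empty()
        && description
            .bytes()
            .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'_')
}
```

The check only looks at the shape of the name; it does not verify that the 14 digits form a real calendar date.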
```sql
-- migrations/20260321000001_create_users.sql
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email TEXT NOT NULL UNIQUE,
    display_name TEXT,
    role TEXT NOT NULL DEFAULT 'user' CHECK (role IN ('user', 'admin')),
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_users_email ON users(email);
```

```sql
-- migrations/20260321000002_create_sessions.sql
CREATE TABLE sessions (
    session_hash TEXT PRIMARY KEY, -- SHA-256(session_id)
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    expires_at TIMESTAMPTZ NOT NULL,
    last_active_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    ip_address TEXT,
    user_agent TEXT
);

CREATE INDEX idx_sessions_user_id ON sessions(user_id);
CREATE INDEX idx_sessions_expires_at ON sessions(expires_at);
```

```sql
-- migrations/20260321000003_create_magic_tokens.sql
CREATE TABLE magic_tokens (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email TEXT NOT NULL,
    token_hash TEXT NOT NULL UNIQUE, -- SHA-256(token)
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    expires_at TIMESTAMPTZ NOT NULL,
    used BOOLEAN NOT NULL DEFAULT false
);

CREATE INDEX idx_magic_tokens_email ON magic_tokens(email);
CREATE INDEX idx_magic_tokens_expires ON magic_tokens(expires_at);
```

```sql
-- migrations/20260321000004_create_settings.sql
CREATE TABLE settings (
    user_id UUID PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
    theme TEXT NOT NULL DEFAULT 'Intelligence Artificielle',
    max_age_days INTEGER NOT NULL DEFAULT 7 CHECK (max_age_days BETWEEN 1 AND 365),
    categories JSONB NOT NULL DEFAULT '["Annonces majeures", "Recherche et innovation", "Industrie et entreprises", "Secteur public", "Opinions et analyses"]'::jsonb,
    max_items_per_category INTEGER NOT NULL DEFAULT 4 CHECK (max_items_per_category BETWEEN 1 AND 20),
    search_agent_behavior TEXT NOT NULL DEFAULT '',
    ai_provider TEXT NOT NULL DEFAULT 'gemini',
    ai_model TEXT NOT NULL DEFAULT 'gemini-2.5-flash',
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

```sql
-- migrations/20260321000005_create_sources.sql
CREATE TABLE sources (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    title TEXT NOT NULL,
    url TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_sources_user_id ON sources(user_id);
CREATE INDEX idx_sources_user_created ON sources(user_id, created_at DESC);
```

```sql
-- migrations/20260321000006_create_syntheses.sql
CREATE TABLE syntheses (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    week TEXT NOT NULL, -- e.g. "2026-W12"
    sections JSONB NOT NULL, -- [{ title, items: [{ title, url, summary }] }]
    status TEXT NOT NULL DEFAULT 'completed' CHECK (status IN ('pending', 'in_progress', 'completed', 'failed')),
    error_message TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_syntheses_user_id ON syntheses(user_id);
CREATE INDEX idx_syntheses_user_created ON syntheses(user_id, created_at DESC);
```

```sql
-- migrations/20260321000007_create_llm_providers.sql
-- Admin-configured LLM provider catalog (provider + available models)
CREATE TABLE llm_providers (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    provider TEXT NOT NULL UNIQUE CHECK (provider IN ('gemini', 'openai', 'anthropic')),
    display_name TEXT NOT NULL,
    models JSONB NOT NULL, -- ["gemini-2.5-flash", "gemini-2.5-pro"]
    is_enabled BOOLEAN NOT NULL DEFAULT true,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

```sql
-- migrations/20260321000008_create_user_api_keys.sql
-- Users bring their own API keys (encrypted at rest)
CREATE TABLE user_api_keys (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    provider TEXT NOT NULL, -- 'gemini', 'openai', 'anthropic'
    encrypted_key BYTEA NOT NULL, -- AES-256-GCM
    -- ciphertext
    nonce BYTEA NOT NULL, -- 12-byte GCM nonce
    key_prefix TEXT NOT NULL, -- First 6 chars for display ("sk-pr...")
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE(user_id, provider)
);

CREATE INDEX idx_user_api_keys_user ON user_api_keys(user_id);
```

```sql
-- migrations/20260321000009_create_rate_limit_config.sql
CREATE TABLE rate_limit_config (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    provider TEXT NOT NULL UNIQUE REFERENCES llm_providers(provider),
    max_requests INTEGER NOT NULL DEFAULT 29,
    time_window_secs INTEGER NOT NULL DEFAULT 60,
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

```sql
-- migrations/20260321000010_create_audit_log.sql
CREATE TABLE audit_log (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL DEFAULT now(),
    user_id UUID NOT NULL REFERENCES users(id),
    action TEXT NOT NULL,
    target_type TEXT NOT NULL,
    target_id UUID,
    details JSONB,
    ip_address TEXT,
    user_agent TEXT
);

CREATE INDEX idx_audit_timestamp ON audit_log(timestamp);
CREATE INDEX idx_audit_user ON audit_log(user_id);
```

### 3.4 Backup Strategy

This strategy targets a single-tenant, self-hosted Postgres running in Docker Compose.

**Automated pg_dump via cron on the host:**

```bash
#!/usr/bin/env bash
# /opt/ai-synth/backup.sh
# Run via cron: 0 3 * * * /opt/ai-synth/backup.sh
set -euo pipefail

BACKUP_DIR="/opt/ai-synth/backups"
RETENTION_DAYS=30
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# pg_dump's custom format is a compressed archive for pg_restore, not gzipped SQL,
# so the file is named .dump rather than .sql.gz
BACKUP_FILE="${BACKUP_DIR}/ai_synth_${TIMESTAMP}.dump"

mkdir -p "${BACKUP_DIR}"

# Dump in custom format with built-in compression
docker exec ai-synth-db pg_dump -U ai_synth -d ai_synth \
  --format=custom \
  --compress=6 \
  > "${BACKUP_FILE}"

# Verify backup is not empty
if [ ! -s "${BACKUP_FILE}" ]; then
  echo "ERROR: Backup file is empty!" >&2
  exit 1
fi

# Remove backups older than RETENTION_DAYS
find "${BACKUP_DIR}" -name "ai_synth_*.dump" -mtime +${RETENTION_DAYS} -delete

echo "Backup completed: ${BACKUP_FILE} ($(du -h "${BACKUP_FILE}" | cut -f1))"
```

**Crontab entry:**

```
0 3 * * * /opt/ai-synth/backup.sh >> /var/log/ai-synth-backup.log 2>&1
```

**Restore procedure:**

```bash
# Stop the app (to prevent writes during restore)
docker compose stop app

# Restore from backup
docker exec -i ai-synth-db pg_restore \
  -U ai_synth -d ai_synth \
  --clean --if-exists \
  < /opt/ai-synth/backups/ai_synth_20260321_030000.dump

# Restart the app
docker compose start app
```

**Docker volume backup (alternative):**

```bash
# Full volume backup (filesystem level)
# Stop the db container first so the data directory is in a consistent state
docker run --rm \
  -v ai-synth_postgres_data:/data:ro \
  -v /opt/ai-synth/backups:/backup \
  alpine tar czf /backup/pg_volume_$(date +%Y%m%d).tar.gz -C /data .
```

### 3.5 Seed Data

After the first deployment and migration, the admin creates the provider catalog. However, a seed command is useful for initial setup.

```rust
// CLI: ai-weekly-synth seed-providers
// Inserts default provider entries (without API keys -- users provide their own)
async fn seed_providers(pool: &PgPool) -> Result<()> {
    let providers = vec![
        ("gemini", "Google Gemini", serde_json::json!([
            "gemini-2.5-pro", "gemini-2.5-flash", "gemini-2.0-flash"
        ])),
        ("openai", "OpenAI", serde_json::json!([
            "gpt-4o", "gpt-4o-mini", "o3-mini"
        ])),
        ("anthropic", "Anthropic", serde_json::json!([
            "claude-sonnet-4-20250514", "claude-3-5-haiku-20241022"
        ])),
    ];

    for (provider, display_name, models) in providers {
        sqlx::query!(
            r#"INSERT INTO llm_providers (provider, display_name, models)
               VALUES ($1, $2, $3)
               ON CONFLICT (provider) DO UPDATE
               SET display_name = EXCLUDED.display_name,
                   models = EXCLUDED.models,
                   updated_at = now()"#,
            provider,
            display_name,
            models
        )
        .execute(pool)
        .await?;
    }

    Ok(())
}
```

---

## 4. CLI Commands

The binary uses `clap` for subcommand parsing. The default subcommand is `serve`.
### 4.1 CLI Structure

```rust
// src/main.rs
use anyhow::Result;
use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(name = "ai-weekly-synth", version, about = "AI Weekly Synth server")]
struct Cli {
    // `None` means no subcommand was given; main() falls back to `serve`
    #[command(subcommand)]
    command: Option<Commands>,
}

#[derive(Subcommand)]
enum Commands {
    /// Start the web server (default)
    Serve,

    /// Create an admin user account
    CreateAdmin {
        /// Email address for the admin account
        email: String,
    },

    /// Run database migrations
    Migrate,

    /// Seed the LLM provider catalog with defaults
    SeedProviders,

    /// Check application health (DB connectivity, config validity)
    HealthCheck,

    /// Re-encrypt all user API keys with a new master key
    RotateEncryptionKey {
        /// The new master encryption key (64 hex characters)
        new_key: String,
    },
}

#[tokio::main]
async fn main() -> Result<()> {
    let cli = Cli::parse();

    match cli.command.unwrap_or(Commands::Serve) {
        Commands::Serve => run_server().await,
        Commands::CreateAdmin { email } => create_admin(&email).await,
        Commands::Migrate => run_migrations_cli().await,
        Commands::SeedProviders => seed_providers_cli().await,
        Commands::HealthCheck => health_check_cli().await,
        Commands::RotateEncryptionKey { new_key } => rotate_key(&new_key).await,
    }
}
```

### 4.2 create-admin Command

Creates an admin user directly in the database. The user can then request a magic link to log in.
```rust
async fn create_admin(email: &str) -> Result<()> {
    let config = AppConfig::from_env()?;
    let pool = PgPool::connect(config.database_url.expose_secret()).await?;

    // Validate email format
    if !email.contains('@') || email.len() > 255 {
        anyhow::bail!("Invalid email address: {}", email);
    }

    // Check if user already exists
    let existing = sqlx::query!("SELECT id, role FROM users WHERE email = $1", email)
        .fetch_optional(&pool)
        .await?;

    match existing {
        Some(user) if user.role == "admin" => {
            println!("User {} is already an admin.", email);
        }
        Some(user) => {
            // Promote existing user to admin
            sqlx::query!("UPDATE users SET role = 'admin', updated_at = now() WHERE id = $1", user.id)
                .execute(&pool)
                .await?;
            println!("User {} promoted to admin.", email);
        }
        None => {
            // Create new admin user
            sqlx::query!(
                r#"INSERT INTO users (email, role) VALUES ($1, 'admin')"#,
                email
            )
            .execute(&pool)
            .await?;
            println!("Admin user created: {}", email);
            println!("The user can now log in by requesting a magic link at the login page.");
        }
    }

    Ok(())
}
```

**Usage:**

```bash
# In Docker
docker exec ai-synth ./ai-weekly-synth create-admin admin@example.com

# Or using docker compose run (for first-time setup before the app is running)
docker compose run --rm app ./ai-weekly-synth create-admin admin@example.com
```

### 4.3 Other CLI Commands

**migrate**: Runs pending migrations. Useful for manual control or CI pipelines.

```bash
docker compose run --rm app ./ai-weekly-synth migrate
```

**seed-providers**: Populates the LLM provider catalog with sensible defaults.

```bash
docker compose run --rm app ./ai-weekly-synth seed-providers
```

**health-check**: Verifies DB connectivity, config validity, and Resend API key. Returns exit code 0 on success, 1 on failure.

```bash
docker compose run --rm app ./ai-weekly-synth health-check
```

**rotate-encryption-key**: Re-encrypts all user API keys with a new master key. Used for key rotation.
```bash
docker compose run --rm app ./ai-weekly-synth rotate-encryption-key NEW_HEX_KEY
```

---

## 5. Reverse Proxy & TLS

### 5.1 Recommendation: Caddy

**Caddy** is recommended over nginx for the following reasons:

- Automatic HTTPS with Let's Encrypt (zero configuration for certificate issuance and renewal).
- Automatic HTTP-to-HTTPS redirect.
- HTTP/3 support out of the box.
- Simpler configuration syntax.
- Runs as a single static binary in an Alpine container (minimal attack surface).

### 5.2 Caddyfile

```
# Caddyfile
{
    email {$ACME_EMAIL:admin@example.com}
}

{$APP_DOMAIN:synth.example.com} {
    # Reverse proxy to the Rust app (internal Docker network)
    reverse_proxy app:8080 {
        # Health checks so Caddy removes unhealthy backends
        health_uri /api/v1/health
        health_interval 30s
        health_timeout 5s
    }

    # Compression
    encode zstd gzip

    # Security headers
    header {
        # HSTS (1 year, include subdomains)
        Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
        # Prevent MIME sniffing
        X-Content-Type-Options "nosniff"
        # Prevent clickjacking
        X-Frame-Options "DENY"
        # Referrer policy
        Referrer-Policy "strict-origin-when-cross-origin"
        # Permissions policy (disable unused APIs)
        Permissions-Policy "camera=(), microphone=(), geolocation=(), payment=()"
        # Content Security Policy
        Content-Security-Policy "default-src 'none'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; font-src 'self'; connect-src 'self'; frame-src https://challenges.cloudflare.com; frame-ancestors 'none'; base-uri 'self'; form-action 'self'"
        # Remove server header
        -Server
    }

    # Rate limiting at the reverse proxy level (defense in depth)
    # Caddy does not have built-in rate limiting; use the app's rate limiter.

    # Logging
    log {
        output stdout
        format json
    }
}
```

**Notes:**

- `{$APP_DOMAIN}` and `{$ACME_EMAIL}` are environment variables. In docker-compose, set these in the `.env` file or the caddy service's environment.
- The `frame-src https://challenges.cloudflare.com` directive allows the Turnstile captcha widget to load in an iframe.
- Caddy automatically provisions a TLS certificate from Let's Encrypt for `APP_DOMAIN`. The domain's DNS must point to the server's public IP before starting Caddy.
- Caddy stores certificates in the `caddy_data` volume. They persist across container restarts and are auto-renewed.

### 5.3 Alternative: nginx Configuration

If nginx is preferred for familiarity or existing infrastructure:

```nginx
# /etc/nginx/conf.d/ai-synth.conf
server {
    listen 80;
    server_name synth.example.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name synth.example.com;

    # TLS (managed by certbot)
    ssl_certificate /etc/letsencrypt/live/synth.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/synth.example.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;

    # Security headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "DENY" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;

    # Proxy to app
    location / {
        proxy_pass http://app:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # SSE support (generation progress)
        proxy_buffering off;
        proxy_cache off;
        proxy_read_timeout 120s;
    }

    # Gzip
    gzip on;
    gzip_types text/plain text/css application/json application/javascript text/xml;
}
```

With nginx, TLS certificate management requires certbot:

```bash
# Install certbot and obtain initial certificate
docker run --rm -v certbot_data:/etc/letsencrypt -v certbot_www:/var/www/certbot \
  certbot/certbot certonly --webroot -w /var/www/certbot \
  -d synth.example.com --email admin@example.com --agree-tos
```
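The nginx block above forwards the client address in `X-Forwarded-For` using `$proxy_add_x_forwarded_for`, which appends the connecting peer's address to whatever value the client sent. Only the last entry is therefore written by the proxy; earlier entries can be forged. The following is a minimal sketch of how the backend might read that header when filling `ip_address` columns, assuming exactly one trusted proxy sits in front of the app. The function name is hypothetical.

```rust
use std::net::IpAddr;

/// Hypothetical helper: returns the address appended by the trusted proxy,
/// i.e. the last comma-separated entry of an X-Forwarded-For header value.
/// Assumes a single reverse proxy in front of the application.
pub fn client_ip_from_forwarded_for(header_value: &str) -> Option<IpAddr> {
    header_value
        .rsplit(',') // iterate entries from the right
        .next() // the rightmost entry was added by the proxy
        .map(str::trim)
        .and_then(|last| last.parse::<IpAddr>().ok())
}
```

If more proxies are added later (for example a CDN in front of nginx), the number of trusted hops changes and this logic would need to skip one entry per trusted hop.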
**Recommendation**: Use Caddy unless there is a specific reason to use nginx. Caddy eliminates certificate management entirely.

### 5.4 SSE Considerations

The reverse proxy must support Server-Sent Events for the synthesis generation progress stream. Key settings:

- **Caddy**: Works out of the box (no buffering by default for streaming responses).
- **nginx**: Requires `proxy_buffering off;` and `proxy_cache off;` on the SSE endpoint.

The Rust backend should set `Cache-Control: no-cache`, `Content-Type: text/event-stream`, and `Connection: keep-alive` on SSE responses.

---

## 6. Testing Strategy

### 6.1 Backend Unit Tests

**What to test:**

| Module | What to test | Mocking needed |
|---|---|---|
| `config.rs` | Validation logic (key length, URL format, required fields) | No |
| `services/encryption.rs` | AES-256-GCM encrypt/decrypt roundtrip, key derivation, nonce uniqueness | No |
| `services/captcha.rs` | Turnstile verification parsing (success, failure, invalid response) | Mock HTTP client |
| `middleware/rate_limit.rs` | Token bucket logic: allows up to N requests, blocks after limit, resets after window | No (test in-memory state) |
| `middleware/auth.rs` | Session extraction, hash verification, expiry checks | Mock DB pool |
| `services/llm/types.rs` | Schema building from user categories | No |
| `services/scraper.rs` | HTML parsing, date extraction, soft-404 detection, SSRF IP validation | No (test with static HTML) |
| `models/*` | Serialization/deserialization, validation constraints | No |
| `services/email.rs` | Email template rendering (magic link, synthesis) | Mock SMTP |

**Test module structure:**

Tests live alongside the code they test (Rust convention):

```rust
// src/services/encryption.rs

// Returns (ciphertext, nonce)
pub fn encrypt_api_key(key: &str, master_key: &[u8; 32]) -> Result<(Vec<u8>, Vec<u8>)> {
    // ... implementation
}

pub fn decrypt_api_key(ciphertext: &[u8], nonce: &[u8], master_key: &[u8; 32]) -> Result<String> {
    // ... implementation
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn encrypt_decrypt_roundtrip() {
        let master_key = [0xab; 32];
        let api_key = "sk-proj-abc123def456";
        let (ciphertext, nonce) = encrypt_api_key(api_key, &master_key).unwrap();
        let decrypted = decrypt_api_key(&ciphertext, &nonce, &master_key).unwrap();
        assert_eq!(decrypted, api_key);
    }

    #[test]
    fn decrypt_with_wrong_key_fails() {
        let master_key = [0xab; 32];
        let wrong_key = [0xcd; 32];
        let api_key = "sk-proj-abc123def456";
        let (ciphertext, nonce) = encrypt_api_key(api_key, &master_key).unwrap();
        let result = decrypt_api_key(&ciphertext, &nonce, &wrong_key);
        assert!(result.is_err());
    }

    #[test]
    fn each_encryption_produces_unique_nonce() {
        let master_key = [0xab; 32];
        let api_key = "sk-proj-abc123def456";
        let (_, nonce1) = encrypt_api_key(api_key, &master_key).unwrap();
        let (_, nonce2) = encrypt_api_key(api_key, &master_key).unwrap();
        assert_ne!(nonce1, nonce2);
    }
}
```

**Mocking strategy:**

Use trait objects and the `mockall` crate for mocking external dependencies.

```rust
// Define traits for mockable services
#[cfg_attr(test, mockall::automock)]
#[async_trait]
pub trait EmailService: Send + Sync {
    async fn send_magic_link(&self, to: &str, token: &str) -> Result<()>;
    async fn send_synthesis(&self, to: &str, synthesis: &Synthesis) -> Result<()>;
}

#[cfg_attr(test, mockall::automock)]
#[async_trait]
pub trait CaptchaVerifier: Send + Sync {
    // Ok(true) when the Turnstile token is valid
    async fn verify(&self, token: &str) -> Result<bool>;
}
```

In production, concrete implementations (ResendEmailService, TurnstileCaptchaVerifier) are injected into AppState. In tests, `MockEmailService` and `MockCaptchaVerifier` are used.

### 6.2 Backend Integration Tests

**Test database setup:**

Use `testcontainers-rs` to spin up a temporary Postgres container per test suite.
```rust
// tests/common/mod.rs
use sqlx::PgPool;
use testcontainers::{clients::Cli, images::postgres::Postgres};

pub struct TestDb {
    _container: testcontainers::Container<'static, Postgres>,
    pub pool: PgPool,
}

impl TestDb {
    pub async fn new() -> Self {
        // The container borrows the Docker client, so the client must outlive it.
        // Leaking one `Cli` per test database gives it the `'static` lifetime
        // that the struct field requires.
        let docker: &'static Cli = Box::leak(Box::new(Cli::default()));
        let container = docker.run(Postgres::default());
        let port = container.get_host_port_ipv4(5432);
        let url = format!("postgres://postgres:postgres@localhost:{}/postgres", port);

        let pool = PgPool::connect(&url).await.unwrap();
        sqlx::migrate!("./migrations").run(&pool).await.unwrap();

        TestDb {
            _container: container,
            pool,
        }
    }
}
```

**Alternative (faster, less isolation):** Use a single shared Postgres container across tests with per-test schema isolation:

```rust
pub async fn setup_test_schema(pool: &PgPool) -> String {
    let schema = format!("test_{}", uuid::Uuid::new_v4().to_string().replace('-', ""));
    sqlx::query(&format!("CREATE SCHEMA {}", schema)).execute(pool).await.unwrap();
    // Caveat: `SET search_path` only affects the connection that runs it. With a
    // multi-connection pool, apply it per connection (for example in
    // `PgPoolOptions::after_connect`) or cap the pool at one connection.
    sqlx::query(&format!("SET search_path TO {}", schema)).execute(pool).await.unwrap();
    // Run migrations within the schema...
    schema
}
```

**API endpoint tests:**

Use the `axum-test` crate's `TestServer`, or `reqwest` against a running test server:

```rust
// tests/api/auth_test.rs
use axum::http::StatusCode;
use axum_test::TestServer;
use serde_json::json;

#[tokio::test]
async fn register_and_login_flow() {
    let db = TestDb::new().await;
    let app = build_test_app(db.pool.clone()).await;
    let server = TestServer::new(app).unwrap();

    // 1. Register
    let resp = server
        .post("/api/v1/auth/register")
        .json(&json!({
            "email": "test@example.com",
            "display_name": "Test User",
            "captcha_token": "test-token" // Mock captcha always succeeds in test
        }))
        .await;
    assert_eq!(resp.status_code(), StatusCode::OK);

    // 2. 
Extract magic token from DB (in production this would be sent via email) let token_row = sqlx::query!("SELECT token_hash FROM magic_tokens WHERE email = 'test@example.com'") .fetch_one(&db.pool) .await .unwrap(); // In tests, we also store the raw token for verification let raw_token = get_test_raw_token(&db.pool, "test@example.com").await; // 3. Verify magic link let resp = server .get(&format!("/api/v1/auth/verify?token={}", raw_token)) .await; assert_eq!(resp.status_code(), StatusCode::FOUND); // 302 redirect let session_cookie = resp.header("set-cookie"); assert!(session_cookie.contains("ai_synth_session")); // 4. Access protected endpoint let resp = server .get("/api/v1/auth/me") .add_header("cookie", session_cookie.clone()) .add_header("x-requested-with", "XMLHttpRequest") .await; assert_eq!(resp.status_code(), StatusCode::OK); let body: serde_json::Value = resp.json(); assert_eq!(body["email"], "test@example.com"); // 5. Logout let resp = server .post("/api/v1/auth/logout") .add_header("cookie", session_cookie) .add_header("x-requested-with", "XMLHttpRequest") .await; assert_eq!(resp.status_code(), StatusCode::OK); } ``` **Generation pipeline tests with mocked LLM:** ```rust #[tokio::test] async fn generate_synthesis_with_mocked_llm() { let db = TestDb::new().await; let mut mock_provider = MockLlmProvider::new(); // Mock Pass 1: return structured search results mock_provider .expect_generate_search_pass() .returning(|_, _, _, _| Ok(json!({ "category_0": [ { "title": "Test Article", "url": "https://example.com/article", "summary": "Test summary" } ] }))); // Mock Pass 2: return rewritten results mock_provider .expect_generate_rewrite_pass() .returning(|_, _, _, _| Ok(json!({ "category_0": [ { "title": "Rewritten Title", "url": "https://example.com/article", "summary": "Better summary" } ] }))); let app = build_test_app_with_provider(db.pool.clone(), Box::new(mock_provider)).await; // ... 
trigger generation and verify results } ``` **Key integration test scenarios:** 1. Full auth flow: register -> magic link -> verify -> session -> logout 2. CRUD operations: create/read/update/delete for sources, settings, syntheses 3. Ownership isolation: user A cannot access user B's data 4. Admin operations: create admin -> configure providers -> manage rate limits 5. Rate limiting: verify requests are throttled after exceeding limits 6. CSRF protection: verify requests without `X-Requested-With` header are rejected 7. Generation pipeline: full async generation with SSE progress events (mocked LLM) 8. Email delivery: magic link email rendering (mocked SMTP) 9. API key encryption: store and retrieve encrypted keys through the API **Test fixtures:** ```rust // tests/fixtures/mod.rs pub async fn create_test_user(pool: &PgPool, email: &str) -> User { sqlx::query_as!( User, r#"INSERT INTO users (email, display_name, role) VALUES ($1, 'Test User', 'user') RETURNING *"#, email ) .fetch_one(pool) .await .unwrap() } pub async fn create_test_admin(pool: &PgPool, email: &str) -> User { sqlx::query_as!( User, r#"INSERT INTO users (email, display_name, role) VALUES ($1, 'Admin User', 'admin') RETURNING *"#, email ) .fetch_one(pool) .await .unwrap() } pub async fn create_test_session(pool: &PgPool, user_id: &Uuid) -> String { let raw_token = generate_session_token(); let hash = sha256_hex(&raw_token); sqlx::query!( r#"INSERT INTO sessions (session_hash, user_id, expires_at) VALUES ($1, $2, now() + interval '30 days')"#, hash, user_id ) .execute(pool) .await .unwrap(); raw_token } pub async fn create_test_synthesis(pool: &PgPool, user_id: &Uuid) -> Synthesis { sqlx::query_as!( Synthesis, r#"INSERT INTO syntheses (user_id, week, sections, status) VALUES ($1, '2026-W12', $2, 'completed') RETURNING *"#, user_id, json!([{ "title": "Test Section", "items": [{ "title": "Article", "url": "https://example.com", "summary": "Summary" }] }]) ) .fetch_one(pool) .await .unwrap() } ``` ### 
6.3 Frontend Unit Tests

**Tooling:** Vitest + `@solidjs/testing-library` + `jsdom`

```json
// frontend/package.json (devDependencies to add)
{
  "devDependencies": {
    "vitest": "^3.0",
    "@solidjs/testing-library": "^0.8",
    "@testing-library/jest-dom": "^6.6",
    "jsdom": "^26.0",
    "msw": "^2.7"
  }
}
```

```typescript
// frontend/vitest.config.ts
import { defineConfig } from 'vitest/config';
import solidPlugin from 'vite-plugin-solid';

export default defineConfig({
  plugins: [solidPlugin()],
  test: {
    environment: 'jsdom',
    globals: true,
    setupFiles: ['./src/test/setup.ts'],
    transformMode: {
      web: [/\.tsx?$/],
    },
  },
});
```

**Component rendering tests:**

```typescript
// frontend/src/components/__tests__/SourceList.test.tsx
import { render, screen, fireEvent } from '@solidjs/testing-library';
import { describe, it, expect, vi } from 'vitest';
import SourceList from '../SourceList';

describe('SourceList', () => {
  it('renders a list of sources', () => {
    const sources = [
      { id: '1', title: 'TechCrunch', url: 'https://techcrunch.com', created_at: '2026-03-21' },
      { id: '2', title: 'Ars Technica', url: 'https://arstechnica.com', created_at: '2026-03-21' },
    ];

    // Prop names (`sources`, `onDelete`) are assumed from the test's intent.
    render(() => <SourceList sources={sources} />);

    expect(screen.getByText('TechCrunch')).toBeInTheDocument();
    expect(screen.getByText('Ars Technica')).toBeInTheDocument();
  });

  it('calls onDelete when delete button is clicked', async () => {
    const onDelete = vi.fn();
    const sources = [
      { id: '1', title: 'TechCrunch', url: 'https://techcrunch.com', created_at: '2026-03-21' },
    ];

    render(() => <SourceList sources={sources} onDelete={onDelete} />);

    const deleteButton = screen.getByRole('button', { name: /supprimer/i });
    await fireEvent.click(deleteButton);

    expect(onDelete).toHaveBeenCalledWith('1');
  });
});
```

**Signal/store logic tests:**

```typescript
// frontend/src/stores/__tests__/settings.test.ts
import { describe, it, expect } from 'vitest';
import { createRoot } from 'solid-js';
import { createSettingsStore } from '../settings';

describe('Settings Store', () => {
  it('initializes with default categories', () => {
    createRoot((dispose) => {
      const [settings] = createSettingsStore();
      expect(settings.categories.length).toBe(5);
      expect(settings.categories[0]).toBe('Annonces majeures');
      dispose();
    });
  });

  it('validates max categories limit', () => {
    createRoot((dispose) => {
      const [settings, { addCategory }] = createSettingsStore();
      // The store already holds the default categories, so add only the
      // remainder needed to reach the limit of 20.
      for (let i = settings.categories.length; i < 20; i++) {
        addCategory(`Category ${i}`);
      }
      expect(() => addCategory('Category 21')).toThrow();
      dispose();
    });
  });
});
```

**API client tests (MSW for mocking):**

```typescript
// frontend/src/lib/__tests__/api.test.ts
import { describe, it, expect, beforeAll, afterAll, afterEach } from 'vitest';
import { setupServer } from 'msw/node';
import { http, HttpResponse } from 'msw';
import { fetchApi } from '../api';

const server = setupServer(
  http.get('/api/v1/syntheses', () => {
    return HttpResponse.json([
      { id: '1', week: '2026-W12', created_at: '2026-03-21T10:00:00Z', sections: [] },
    ]);
  }),
  http.get('/api/v1/auth/me', () => {
    return HttpResponse.json({
      id: '1',
      email: 'test@example.com',
      display_name: 'Test User',
      role: 'user',
    });
  }),
  http.post('/api/v1/auth/logout', () => {
    return HttpResponse.json({ message: 'Logged out' });
  }),
);

beforeAll(() => server.listen());
afterEach(() => server.resetHandlers());
afterAll(() => server.close());

describe('fetchApi', () => {
  it('fetches syntheses list', async () => {
    const data = await fetchApi('/api/v1/syntheses');
    expect(data).toHaveLength(1);
    expect(data[0].week).toBe('2026-W12');
  });

  it('throws on 401 unauthorized', async () => {
    // Override the default handler for this test only.
    server.use(
      http.get('/api/v1/syntheses', () => {
        return new HttpResponse(null, { status: 401 });
      }),
    );
    await expect(fetchApi('/api/v1/syntheses')).rejects.toThrow();
  });
});
```

**SSE handling tests:**

```typescript
// frontend/src/lib/__tests__/sse.test.ts
import { describe, it, expect, vi } from 'vitest';
import { createRoot } from 'solid-js';
import { createGenerationStream } from '../sse';

describe('SSE Generation Stream', () => 
{ it('parses progress events correctly', () => { createRoot((dispose) => { const mockEventSource = { addEventListener: vi.fn(), close: vi.fn(), }; vi.stubGlobal('EventSource', vi.fn(() => mockEventSource)); const [status] = createGenerationStream('job-123'); // Simulate SSE events const onMessage = mockEventSource.addEventListener.mock.calls.find( ([event]: [string]) => event === 'progress' )?.[1]; onMessage?.({ data: JSON.stringify({ step: 'search', progress: 0.5 }) }); expect(status().step).toBe('search'); expect(status().progress).toBe(0.5); dispose(); }); }); }); ``` ### 6.4 End-to-End Tests **Tool: Playwright** Playwright is recommended over Cypress for its multi-browser support, better TypeScript integration, and built-in auto-waiting. ```typescript // e2e/playwright.config.ts import { defineConfig, devices } from '@playwright/test'; export default defineConfig({ testDir: './e2e/tests', fullyParallel: false, // Tests share state (auth), run sequentially forbidOnly: !!process.env.CI, retries: process.env.CI ? 2 : 0, workers: 1, reporter: process.env.CI ? 'github' : 'html', use: { baseURL: process.env.E2E_BASE_URL || 'http://localhost:3000', trace: 'on-first-retry', screenshot: 'only-on-failure', }, projects: [ { name: 'chromium', use: { ...devices['Desktop Chrome'] } }, ], webServer: process.env.CI ? 
undefined : { command: 'docker compose -f docker-compose.yml -f docker-compose.e2e.yml up', url: 'http://localhost:3000', reuseExistingServer: true, timeout: 120_000, }, }); ``` **Key E2E scenarios:** ```typescript // e2e/tests/auth.spec.ts import { test, expect } from '@playwright/test'; test.describe('Authentication Flow', () => { test('register new account and login via magic link', async ({ page }) => { await page.goto('/login'); // Fill registration form await page.fill('[name="email"]', 'e2e-test@example.com'); await page.fill('[name="displayName"]', 'E2E Test User'); // Turnstile is configured to auto-pass in test mode await page.click('button:has-text("Créer un compte")'); // Verify success message await expect(page.locator('text=Un lien de connexion a été envoyé')).toBeVisible(); // Extract magic link from Mailpit API const magicLink = await getMagicLinkFromMailpit('e2e-test@example.com'); // Click the magic link await page.goto(magicLink); // Should be redirected to dashboard await expect(page).toHaveURL('/'); await expect(page.locator('text=E2E Test User')).toBeVisible(); }); test('logout clears session', async ({ page }) => { // Login first (helper) await loginAsTestUser(page); // Click logout await page.click('button:has-text("Déconnexion")'); // Should be redirected to login await expect(page).toHaveURL('/login'); }); }); ``` ```typescript // e2e/tests/synthesis.spec.ts import { test, expect } from '@playwright/test'; test.describe('Synthesis Generation', () => { test.beforeEach(async ({ page }) => { await loginAsTestUser(page); }); test('create source and generate synthesis', async ({ page }) => { // Add a source await page.goto('/sources'); await page.fill('[name="title"]', 'TechCrunch'); await page.fill('[name="url"]', 'https://techcrunch.com'); await page.click('button:has-text("Ajouter")'); await expect(page.locator('text=TechCrunch')).toBeVisible(); // Trigger generation await page.goto('/generate'); await page.click('button:has-text("Nouvelle 
synthèse")'); // Wait for SSE progress (generation is async) await expect(page.locator('[data-testid="generation-progress"]')).toBeVisible({ timeout: 10_000 }); // Wait for completion (in E2E with mocked LLM, this is fast) await expect(page.locator('text=Synthèse terminée')).toBeVisible({ timeout: 30_000 }); // Verify synthesis appears in list await page.goto('/'); await expect(page.locator('[data-testid="synthesis-card"]')).toHaveCount(1); }); }); ``` ```typescript // e2e/tests/admin.spec.ts import { test, expect } from '@playwright/test'; test.describe('Admin Configuration', () => { test.beforeEach(async ({ page }) => { await loginAsAdmin(page); }); test('admin configures LLM providers', async ({ page }) => { await page.goto('/admin/providers'); // Verify seeded providers appear await expect(page.locator('text=Google Gemini')).toBeVisible(); await expect(page.locator('text=OpenAI')).toBeVisible(); await expect(page.locator('text=Anthropic')).toBeVisible(); // Toggle a provider off const openaiToggle = page.locator('[data-provider="openai"] input[type="checkbox"]'); await openaiToggle.uncheck(); // Verify it saves await page.reload(); await expect(openaiToggle).not.toBeChecked(); }); }); ``` **Running E2E tests in Docker:** ```yaml # docker-compose.e2e.yml services: app: environment: # Use test Turnstile keys (always pass) - TURNSTILE_SECRET_KEY=1x0000000000000000000000000000000AA - TURNSTILE_SITE_KEY=1x00000000000000000000AA # Point to Mailpit for email - RESEND_API_KEY=test-key - EMAIL_SMTP_HOST=mailpit - EMAIL_SMTP_PORT=1025 # Use a mocked LLM provider for fast, deterministic tests - LLM_MOCK_MODE=true playwright: image: mcr.microsoft.com/playwright:v1.50.0-noble container_name: ai-synth-e2e working_dir: /app command: ["npx", "playwright", "test"] volumes: - ./e2e:/app/e2e - ./frontend:/app/frontend - playwright_results:/app/test-results environment: - E2E_BASE_URL=http://caddy:443 - MAILPIT_API_URL=http://mailpit:8025 depends_on: app: condition: 
service_healthy networks: - internal volumes: playwright_results: driver: local ``` ### 6.5 Test Summary Table | Layer | Tool | What | Run Command | |---|---|---|---| | Backend Unit | `cargo test` | Encryption, parsing, validation, rate limiting | `cargo test --lib` | | Backend Integration | `cargo test` + testcontainers | API endpoints, auth flows, DB operations | `cargo test --test '*'` | | Frontend Unit | Vitest | Components, signals, stores | `cd frontend && npm test` | | Frontend Integration | Vitest + MSW | API client, SSE, auth context | `cd frontend && npm test` | | E2E | Playwright | Full user flows | `npx playwright test` | --- ## 7. CI/CD Pipeline ### 7.1 GitHub Actions Workflow ```yaml # .github/workflows/ci.yml name: CI on: push: branches: [main] pull_request: branches: [main] env: CARGO_TERM_COLOR: always SQLX_OFFLINE: true jobs: # ============================================= # Backend: Lint + Test # ============================================= backend-check: name: Backend Lint & Test runs-on: ubuntu-latest services: postgres: image: postgres:17-alpine env: POSTGRES_USER: test POSTGRES_PASSWORD: test POSTGRES_DB: test ports: - 5432:5432 options: >- --health-cmd "pg_isready -U test" --health-interval 10s --health-timeout 5s --health-retries 5 steps: - uses: actions/checkout@v4 - name: Install Rust toolchain uses: dtolnay/rust-toolchain@stable with: components: clippy, rustfmt - name: Cache cargo registry & build uses: actions/cache@v4 with: path: | ~/.cargo/registry ~/.cargo/git target key: cargo-${{ runner.os }}-${{ hashFiles('**/Cargo.lock') }} restore-keys: cargo-${{ runner.os }}- - name: Check formatting run: cargo fmt --all -- --check - name: Run clippy run: cargo clippy --all-targets --all-features -- -D warnings - name: Run unit tests run: cargo test --lib - name: Run integration tests env: DATABASE_URL: postgres://test:test@localhost:5432/test MASTER_ENCRYPTION_KEY: "0000000000000000000000000000000000000000000000000000000000000000" 
SESSION_SECRET: "0000000000000000000000000000000000000000000000000000000000000000" APP_URL: "http://localhost:8080" RESEND_API_KEY: "re_test" EMAIL_FROM: "test@test.com" TURNSTILE_SECRET_KEY: "1x0000000000000000000000000000000AA" TURNSTILE_SITE_KEY: "1x00000000000000000000AA" run: cargo test --test '*' - name: Security audit run: | cargo install cargo-audit --locked cargo audit # ============================================= # Frontend: Lint + Test # ============================================= frontend-check: name: Frontend Lint & Test runs-on: ubuntu-latest defaults: run: working-directory: frontend steps: - uses: actions/checkout@v4 - name: Setup Node.js uses: actions/setup-node@v4 with: node-version: 22 cache: npm cache-dependency-path: frontend/package-lock.json - name: Install dependencies run: npm ci - name: Run ESLint run: npm run lint - name: Run type check run: npx tsc --noEmit - name: Run unit & integration tests run: npm test -- --run - name: Build (verify production build works) run: npm run build - name: Security audit run: npm audit --audit-level=high # ============================================= # Docker: Build image # ============================================= docker-build: name: Build Docker Image runs-on: ubuntu-latest needs: [backend-check, frontend-check] steps: - uses: actions/checkout@v4 - name: Set up Docker Buildx uses: docker/setup-buildx-action@v3 - name: Build image (no push) uses: docker/build-push-action@v6 with: context: . 
          push: false
          tags: ai-weekly-synth:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  # =============================================
  # Docker: Push image on main
  # =============================================
  docker-push:
    name: Push Docker Image
    runs-on: ubuntu-latest
    needs: [docker-build]
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'

    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: |
            ghcr.io/${{ github.repository }}:latest
            ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
```

### 7.2 Caching Strategy

| What | Cache key | Tool |
|---|---|---|
| Cargo registry + git index | `cargo-{os}-{hash(Cargo.lock)}` | `actions/cache` |
| Cargo target (compiled deps) | Included in the cargo cache above | `actions/cache` |
| npm download cache (`~/.npm`, not `node_modules`) | Keyed on the hash of `package-lock.json` (managed by the action) | `actions/setup-node` built-in cache |
| Docker layers | GitHub Actions cache backend (`type=gha`) | `docker/build-push-action` |

**cargo-chef in CI**: The Dockerfile already uses cargo-chef, so Docker layer caching handles Rust dependency caching for the image build step. The separate `backend-check` job caches cargo artifacts via `actions/cache` for the lint+test steps.
### 7.3 Optional: E2E Tests in CI

```yaml
  # Add this job to the workflow when E2E tests are ready
  e2e:
    name: End-to-End Tests
    runs-on: ubuntu-latest
    needs: [docker-build]

    steps:
      - uses: actions/checkout@v4

      - name: Start services
        run: docker compose -f docker-compose.yml -f docker-compose.e2e.yml up -d --wait

      - name: Run Playwright tests
        run: docker compose -f docker-compose.yml -f docker-compose.e2e.yml run --rm playwright

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: test-results/

      - name: Stop services
        if: always()
        run: docker compose -f docker-compose.yml -f docker-compose.e2e.yml down -v
```

---

## 8. Monitoring & Observability

### 8.1 Structured Logging

Use the `tracing` crate with JSON output for machine-parseable logs.

```rust
// src/main.rs
use tracing_subscriber::{fmt, prelude::*, EnvFilter};

fn init_tracing() {
    tracing_subscriber::registry()
        .with(EnvFilter::from_default_env()) // Reads RUST_LOG
        .with(
            fmt::layer()
                .json() // JSON output (needs the `json` feature of tracing-subscriber)
                .with_target(true) // Include module path
                .with_thread_ids(false)
                .with_file(true)
                .with_line_number(true),
        )
        .init();
}
```

**Sensitive data filtering**: Never log API keys, session tokens, or email addresses. Wrap secrets in the `secrecy` crate's `SecretString`, whose `Debug` implementation prints a redacted placeholder instead of the value.
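To illustrate why a redacting wrapper protects against accidental logging, here is a standard-library-only sketch of the idea. `Redacted` and `ProviderCredentials` are hypothetical names used for illustration; the plan itself calls for the `secrecy` crate, whose types and exact `Debug` output differ from this stand-in.

```rust
use std::fmt;

/// Hypothetical stand-in for a secret wrapper: holds a value, never prints it.
pub struct Redacted(String);

impl Redacted {
    pub fn new(value: impl Into<String>) -> Self {
        Self(value.into())
    }

    /// Explicit accessor, so every read of the secret is visible in review.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Redacted {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The inner value is deliberately ignored.
        f.write_str("[REDACTED]")
    }
}

/// Deriving `Debug` on a containing struct is now safe: the derive
/// delegates to `Redacted`'s implementation for the `api_key` field.
#[derive(Debug)]
pub struct ProviderCredentials {
    pub provider: String,
    pub api_key: Redacted,
}
```

Because the redaction lives in the type, a stray `tracing::debug!(?creds)` cannot leak the key; reading it requires an explicit `expose()` call.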
```rust
// Example: structured log for a generation request
tracing::info!(
    user_id = %user.id,
    provider = %settings.ai_provider,
    model = %settings.ai_model,
    categories_count = settings.categories.len(),
    "Starting synthesis generation"
);
```

### 8.2 Health Check Endpoint

```rust
// src/handlers/health.rs
use axum::{extract::State, http::StatusCode, response::IntoResponse, Json};
use serde::Serialize;

#[derive(Serialize)]
pub struct HealthResponse {
    status: &'static str,
    version: &'static str,
    database: &'static str,
    uptime_seconds: u64,
}

pub async fn health_check(State(state): State<AppState>) -> impl IntoResponse {
    // A trivial query verifies that the pool can reach Postgres.
    let db_ok = sqlx::query("SELECT 1")
        .execute(&state.pool)
        .await
        .is_ok();

    let status = if db_ok { "healthy" } else { "degraded" };
    let db_status = if db_ok { "connected" } else { "disconnected" };
    let http_status = if db_ok { StatusCode::OK } else { StatusCode::SERVICE_UNAVAILABLE };

    (
        http_status,
        Json(HealthResponse {
            status,
            version: env!("CARGO_PKG_VERSION"),
            database: db_status,
            uptime_seconds: state.start_time.elapsed().as_secs(),
        }),
    )
}
```

### 8.3 Key Metrics to Monitor

| Metric | Source | What it tells you |
|---|---|---|
| HTTP request count & latency (by route, status) | Tower middleware or tracing spans | Overall API health, slow endpoints |
| Generation duration (P50, P95, P99) | Timed spans in synthesis service | LLM API performance |
| Generation success/failure rate | Logged events | LLM reliability |
| Active sessions count | Periodic DB query | User engagement |
| Rate limiter rejections | Rate limiter middleware logs | Abuse detection |
| Auth endpoint error rate | Handler logs | Brute-force attempts |
| Database query latency | sqlx instrumentation | DB performance |
| Scraper success/failure rate | Scraper service logs | Target site availability |
| Memory/CPU usage | Docker stats / cAdvisor | Resource planning |

### 8.4 Lightweight Monitoring Stack (Optional)

For a self-hosted deployment, a lightweight monitoring stack can be added as optional docker-compose services:

```yaml
# docker-compose.monitoring.yml (optional override)
services: # Metrics collection prometheus: image: prom/prometheus:latest container_name: ai-synth-prometheus volumes: - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml:ro - prometheus_data:/prometheus networks: - internal profiles: - monitoring # Dashboard grafana: image: grafana/grafana:latest container_name: ai-synth-grafana environment: - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD:-admin} volumes: - grafana_data:/var/lib/grafana ports: - "3001:3000" networks: - internal profiles: - monitoring # Log aggregation (lightweight alternative to ELK) loki: image: grafana/loki:latest container_name: ai-synth-loki volumes: - loki_data:/loki networks: - internal profiles: - monitoring volumes: prometheus_data: grafana_data: loki_data: ``` **Simpler alternative**: For most self-hosted deployments, structured JSON logs piped to `docker logs` and monitored with `docker stats` are sufficient. Add Prometheus/Grafana only if operational visibility becomes a need. **Exposing Prometheus metrics from the Rust app:** Use the `metrics` crate with `metrics-exporter-prometheus`: ```rust // In router setup use metrics_exporter_prometheus::PrometheusBuilder; let prometheus_handle = PrometheusBuilder::new() .install_recorder() .expect("failed to install Prometheus recorder"); // Add a /metrics endpoint (not exposed externally, only on internal network) Router::new() .route("/metrics", get(move || async move { prometheus_handle.render() })) ``` --- ## 9. Documentation ### 9.1 README.md Structure ``` # AI Weekly Synth AI-powered weekly news synthesis generator. Self-hosted, multi-provider (Gemini, OpenAI, Anthropic). ## Features - ... ## Quick Start (Docker) 1. Clone the repository 2. Copy .env.example to .env and configure 3. docker compose up -d 4. Create admin: docker compose run --rm app ./ai-weekly-synth create-admin admin@example.com 5. 
Open https://your-domain.com ## Documentation - [Deployment Guide](docs/deployment.md) - [Admin Guide](docs/admin-guide.md) - [Developer Guide](docs/developer-guide.md) - [Architecture Overview](docs/architecture.md) ## Tech Stack - Backend: Rust (Axum) + Postgres - Frontend: SolidJS + Tailwind CSS - Infrastructure: Docker, Caddy, Resend ## License [TBD] ``` ### 9.2 Deployment Guide Outline ``` # Deployment Guide ## Prerequisites - Linux server (2 GB RAM minimum, 10 GB disk) - Docker Engine 24+ and Docker Compose v2 - Domain name with DNS pointing to the server - Resend account (free tier: 3,000 emails/month) - Cloudflare Turnstile keys (free) ## Step-by-Step Deployment 1. Clone the repository 2. Configure environment variables (.env) 3. Configure domain (Caddyfile) 4. Start the stack: docker compose up -d 5. Run initial setup: - docker compose run --rm app ./ai-weekly-synth migrate - docker compose run --rm app ./ai-weekly-synth seed-providers - docker compose run --rm app ./ai-weekly-synth create-admin admin@example.com 6. Verify: open https://your-domain.com ## DNS Configuration - A record: your-domain.com -> server IP - Resend domain verification (SPF, DKIM) ## Updating 1. git pull 2. docker compose build 3. docker compose up -d (Migrations run automatically on startup) ## Troubleshooting - Check logs: docker compose logs app - Check health: curl https://your-domain.com/api/v1/health - Database: docker exec -it ai-synth-db psql -U ai_synth ``` ### 9.3 Admin Guide Outline ``` # Admin Guide ## First-Time Setup 1. Create your admin account (CLI command) 2. Log in via magic link 3. Navigate to Admin > Providers 4. Configure at least one LLM provider (enable/disable, set models) 5. 
Configure rate limits (optional, defaults are sensible)

## Managing Providers
- Enable/disable providers
- Update available model list
- Rate limit configuration per provider

## Managing Users
- View all registered users
- Promote/demote admin role
- Revoke user sessions (security response)

## Backups
- Automated daily backups (cron)
- Manual backup: ./backup.sh
- Restore procedure

## Monitoring
- Health endpoint: /api/v1/health
- Logs: docker compose logs -f app
- Database size: docker exec ai-synth-db psql -U ai_synth -c "SELECT pg_size_pretty(pg_database_size('ai_synth'))"

## Security
- Rotating the master encryption key
- Rotating the session secret
- Updating Caddy/Postgres/app images
```

### 9.4 Developer Guide Outline

```
# Developer Guide

## Prerequisites
- Rust 1.85+ (rustup)
- Node.js 22+ (nvm recommended)
- Docker (for Postgres and Mailpit)
- sqlx-cli: cargo install sqlx-cli --no-default-features --features postgres

## Local Development Setup
1. Start Postgres and Mailpit (Docker)
2. Copy .env.example to .env, set DATABASE_URL
3. Run migrations: sqlx migrate run
4. Start backend: cargo watch -x 'run -- serve'
5. Start frontend: cd frontend && npm install && npm run dev
6. Open http://localhost:3000

## Project Structure
(Directory tree with descriptions)

## Running Tests
- Backend unit: cargo test --lib
- Backend integration: cargo test --test '*'
- Frontend: cd frontend && npm test
- E2E: npx playwright test

## Database
- Creating a migration: sqlx migrate add description
- Running migrations: sqlx migrate run
- Preparing offline data: cargo sqlx prepare

## Code Conventions
- Rust: follow clippy warnings, use thiserror for errors
- Frontend: ESLint config, Tailwind class ordering
- Commits: conventional commits format

## Adding a New LLM Provider
1. Implement the LlmProvider trait in src/services/llm/
2. Register in the provider factory
3. Add provider to the migration seed
4. Update frontend provider selector
```

---

## 10. Security Hardening Checklist

### Docker

- [ ] Non-root user in Dockerfile (`USER appuser`)
- [ ] Read-only root filesystem (`read_only: true`) with tmpfs for `/tmp`
- [ ] No secrets baked into the image (all via env vars at runtime)
- [ ] `no-new-privileges:true` security option
- [ ] `cap_drop: ALL` (drop all Linux capabilities)
- [ ] `.dockerignore` excludes `.env`, `.git`, `target/`, sensitive files
- [ ] Base images pinned to specific versions (not just `:latest`)
- [ ] Multi-stage build (no compiler, source code, or build tools in runtime image)

### Postgres

- [ ] Strong random password (minimum 32 characters)
- [ ] Not exposed to host network in production (only on internal Docker network)
- [ ] No default `postgres` superuser password (use application-specific user)
- [ ] Health check configured
- [ ] Persistent volume for data
- [ ] Regular backups with tested restore procedure

### TLS / HTTPS

- [ ] HTTPS everywhere (Caddy with automatic Let's Encrypt)
- [ ] HTTP automatically redirected to HTTPS
- [ ] HSTS header with `max-age=31536000; includeSubDomains`
- [ ] TLS 1.2+ only (Caddy default)
- [ ] The Rust app only listens on internal network, not exposed directly

### Security Headers

- [ ] `Content-Security-Policy` (restrictive, script-src 'self')
- [ ] `X-Frame-Options: DENY`
- [ ] `X-Content-Type-Options: nosniff`
- [ ] `Referrer-Policy: strict-origin-when-cross-origin`
- [ ] `Permissions-Policy` disabling unused browser APIs
- [ ] `Strict-Transport-Security` (set by Caddy)
- [ ] Server header removed (Caddy's `-Server`)

### Authentication & Sessions

- [ ] Session cookies: `HttpOnly`, `Secure`, `SameSite=Lax`, `Path=/`
- [ ] Session IDs: 32 bytes cryptographic random, SHA-256 hashed in DB
- [ ] Magic link tokens: 32 bytes random, 15-minute expiry, single-use, hashed in DB
- [ ] Account enumeration prevention (identical responses for existing/non-existing emails)
- [ ] CSRF protection via `X-Requested-With` header requirement on mutating requests
- [ ] 30-day session expiration
- [ ] Logout invalidates server-side session

### Rate Limiting

- [ ] Auth endpoints: `POST /auth/register` 3/hour per IP, `POST /auth/login` 5/15min per IP, 3/hour per email
- [ ] Generation endpoint: 3/hour per user
- [ ] General API: 120 reads/min per user, 30 writes/min per user
- [ ] Global rate limit as DDoS protection

### Data Protection

- [ ] User LLM API keys encrypted at rest with AES-256-GCM
- [ ] Master encryption key in environment variable, never in code or logs
- [ ] API keys never sent to frontend (only key prefix for display)
- [ ] `secrecy` crate wrapping all sensitive values (prevents accidental logging)
- [ ] Parameterized SQL queries everywhere (sqlx compile-time checking)
- [ ] User data strictly isolated (`WHERE user_id = $1` on every query)

### SSRF Prevention (URL Scraping)

- [ ] Block private IP ranges (10.x, 172.16.x, 192.168.x, 127.x, 169.254.169.254)
- [ ] Only allow `http://` and `https://` schemes
- [ ] Connection timeout: 5s, response timeout: 15s, total: 30s
- [ ] Response size limit: 5 MB
- [ ] Redirect limit: 3 hops, each validated against IP blocklist

### Dependency Security

- [ ] `cargo audit` in CI (checks for known vulnerabilities in Rust dependencies)
- [ ] `npm audit` in CI (checks for known vulnerabilities in npm packages)
- [ ] Dependabot or Renovate configured for automated dependency updates
- [ ] No `unsafe` blocks unless absolutely necessary and audited

### Input Validation

- [ ] All request bodies validated with `serde` + `validator` crate
- [ ] URL inputs validated (scheme allowlist, length limit)
- [ ] User prompt inputs length-limited
- [ ] File upload size limits (CSV import)
- [ ] LLM-generated content treated as untrusted (no `innerHTML`)

### Operational

- [ ] `.env` file permissions `600` (owner read/write only)
- [ ] Secrets never passed as CLI arguments
- [ ] Secrets never logged
- [ ] Audit log for admin actions (append-only)
- [ ] Backup encryption for off-site storage
- [ ] Periodic backup restore testing

---

## Appendix: First-Time Deployment Checklist

This is the step-by-step sequence for deploying AI Weekly Synth for the first time.

```
1. Provision a Linux server (2+ GB RAM, Docker installed)
2. Point your domain DNS A record to the server IP
3. Clone the repository
4. Create .env from .env.example:
   - Generate MASTER_ENCRYPTION_KEY: openssl rand -hex 32
   - Generate SESSION_SECRET: openssl rand -hex 32
   - Generate POSTGRES_PASSWORD: openssl rand -hex 24
   - Set APP_URL to https://your-domain.com
   - Set RESEND_API_KEY from Resend dashboard
   - Set EMAIL_FROM (must match verified Resend domain)
   - Set TURNSTILE_SECRET_KEY and TURNSTILE_SITE_KEY from Cloudflare dashboard
5. Update Caddyfile: replace synth.example.com with your domain
6. chmod 600 .env
7. docker compose up -d
8. Wait for healthy status: docker compose ps
9. Create admin: docker compose run --rm app ./ai-weekly-synth create-admin your@email.com
10. Seed providers: docker compose run --rm app ./ai-weekly-synth seed-providers
11. Open https://your-domain.com, request a magic link, log in
12. Go to Admin > Providers, verify the provider catalog
13. Set up backup cron: crontab -e -> add the backup.sh entry
14. Verify backup: run backup.sh manually, check the file was created
```
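
The three random secrets in step 4 and the permission change in step 6 of the checklist above could be scripted roughly as follows. This is a minimal sketch, not part of the project: the function name `write_secrets` and the idea of writing the secrets to a separate fragment file are assumptions made for illustration. The variable names and `openssl rand` invocations are the ones given in the checklist.

```shell
#!/bin/sh
# Sketch only: write_secrets is a hypothetical helper, not an existing project script.
set -eu

# Write the three generated secrets from step 4 to the file given as $1.
write_secrets() {
  env_file="$1"

  # Refuse to overwrite, so a re-run cannot silently replace keys already in use.
  if [ -e "$env_file" ]; then
    echo "refusing to overwrite $env_file" >&2
    return 1
  fi

  # Generate each value first; stop if openssl is missing or fails.
  master_key="$(openssl rand -hex 32)" || return 1
  session_secret="$(openssl rand -hex 32)" || return 1
  pg_password="$(openssl rand -hex 24)" || return 1

  # The subshell keeps the tightened umask from leaking into the caller.
  # With umask 077 the file is created with mode 600 (step 6).
  (
    umask 077
    printf 'MASTER_ENCRYPTION_KEY=%s\nSESSION_SECRET=%s\nPOSTGRES_PASSWORD=%s\n' \
      "$master_key" "$session_secret" "$pg_password" > "$env_file"
  )
}
```

Calling `write_secrets ./secrets.env` would produce a fragment to merge into the `.env` copied from `.env.example`. The remaining values in step 4 (`APP_URL`, `RESEND_API_KEY`, `EMAIL_FROM`, and the Turnstile keys) come from external dashboards and still have to be filled in by hand.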