docs: add consolidated dev_guidelines.md, qa_guidelines.md, deployment.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
master
oabrivard 3 months ago
parent 116189d11b
commit d3a4d2c577

@ -0,0 +1,257 @@
# Deployment Guide
## Docker Deployment
AI Weekly Synth is designed for Docker-only deployment. The `docker-compose.yml` at the project root orchestrates the application and its PostgreSQL database.
### Quick Start
```bash
# 1. Clone the repository
git clone <repo-url>
cd ai_synth
# 2. Create and configure .env
cp .env.example .env
# Edit .env and fill in all values (see Environment Variables below)
# 3. Start the stack
docker compose up -d
# 4. Create the first admin user
docker exec ai-synth ./ai-synth-backend create-admin admin@example.com
```
The application will be available at `http://localhost:8080` (or the port configured in `PORT`).
### Docker Compose Services
The `docker-compose.yml` defines two services:
**app** (AI Weekly Synth backend + frontend):
- Multi-stage Docker image: Node.js builds the frontend, Rust builds the backend, then both are combined into a minimal Debian runtime
- Runs as a non-root user (`appuser`)
- Depends on `db` with a health check condition (waits for Postgres to be ready)
- Health check: `curl -f http://localhost:8080/api/v1/health` every 30 seconds
- Restart policy: `unless-stopped`
**db** (PostgreSQL 17 Alpine):
- Data persisted to a named Docker volume (`postgres_data`)
- Exposed on `127.0.0.1:5432` (localhost only, not accessible from external networks)
- Health check: `pg_isready` every 10 seconds
- Shared memory: 128 MB
- Restart policy: `unless-stopped`
### Dockerfile Details
The `backend/Dockerfile` uses a three-stage build:
1. **frontend-builder** (Node.js 22 Alpine): Runs `npm ci` and `npm run build` to produce the static frontend in `/app/dist/`
2. **builder** (Rust 1.88 Bookworm): Compiles the Rust backend in release mode with `SQLX_OFFLINE=true` (no live database needed during build)
3. **runtime** (Debian Bookworm Slim): Installs only `ca-certificates`, `libssl3`, and `curl`. Copies the binary, migrations, and frontend static files. Runs as non-root.
---
## Environment Variables
All environment variables are documented in `.env.example`. The `.env` file is loaded by Docker Compose.
### Required
| Variable | Description | Example |
|----------|-------------|---------|
| `DATABASE_URL` | PostgreSQL connection string. In docker-compose, the hostname is `db`. | `postgres://ai_synth:secret@db:5432/ai_synth` |
| `POSTGRES_PASSWORD` | Password for the PostgreSQL user. Used by both the `db` service and in `DATABASE_URL`. | `a-strong-random-password` |
| `MASTER_ENCRYPTION_KEY` | 256-bit key for AES-256-GCM encryption of user API keys at rest. Must be exactly 64 hex characters. Generate with `openssl rand -hex 32`. **Back this up securely -- losing it means all stored API keys become unreadable.** | `ab12cd34...` (64 hex chars) |
| `APP_URL` | Public URL where the app is accessible (no trailing slash). Used for magic link URLs, CORS origin, and cookie domain. | `https://synth.example.com` |
| `RESEND_API_KEY` | API key for Resend (email service). Required for magic link emails and synthesis email export. Sign up at https://resend.com. | `re_xxxxx` |
| `EMAIL_FROM` | Sender address for emails. Must be a verified domain in Resend. | `AI Weekly Synth <noreply@synth.example.com>` |
| `TURNSTILE_SECRET_KEY` | Server-side secret key for Cloudflare Turnstile captcha. Sign up at https://dash.cloudflare.com/turnstile. | `0x4AAAAAAA...` |
| `TURNSTILE_SITE_KEY` | Client-side site key for Cloudflare Turnstile. | `0x4BBBBBB...` |
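
The 64-hex-character rule for `MASTER_ENCRYPTION_KEY` can be checked before the stack is started. Below is a minimal, std-only sketch of such a check. The helper name `parse_master_key` is hypothetical and the code is not taken from the backend; it only illustrates what "exactly 64 hex characters" means in practice.

```rust
/// Decode a 64-character hex string into a 32-byte (256-bit) key.
/// Hypothetical helper for illustration; not the backend's own parser.
fn parse_master_key(hex: &str) -> Result<[u8; 32], String> {
    // Reject wrong lengths and any non-hex character (including signs).
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("MASTER_ENCRYPTION_KEY must be exactly 64 hex characters".to_string());
    }
    let mut key = [0u8; 32];
    for (i, byte) in key.iter_mut().enumerate() {
        // Slicing by byte offset is safe: the string is pure ASCII here.
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).map_err(|e| e.to_string())?;
    }
    Ok(key)
}
```

A value produced by `openssl rand -hex 32` passes this check; a truncated or hand-edited value does not.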
### Optional
| Variable | Description | Default |
|----------|-------------|---------|
| `PORT` | Port for the backend HTTP server (inside the container). The docker-compose maps this to the host. | `8080` |
| `RUST_LOG` | Logging level. Format: `level` or `level,crate=level`. | `info,ai_synth_backend=debug` |
| `STATIC_DIR` | Path to the built frontend files. In Docker, this is `./static` (set by docker-compose). For local dev, use `../frontend/dist`. | `./static` (Docker) |
| `SESSION_SECRET` | Secret for session cookie signing. At least 64 characters. If not set, a random value is generated at startup (sessions will not survive restarts). | Random |
---
## Database
### PostgreSQL
The application uses PostgreSQL 17. The `docker-compose.yml` runs it as the `db` service with a named volume for data persistence.
Key configuration:
- User: `ai_synth` (password set via `POSTGRES_PASSWORD`)
- Database: `ai_synth`
- Shared memory: 128 MB (for complex queries)
- Health check via `pg_isready`
### Automatic Migrations
Database migrations run automatically every time the application starts. The backend calls `sqlx::migrate!("./migrations")` in `main.rs` before starting the HTTP server. There are currently 30 migration files covering all schema changes from initial setup through themes, schedules, article history, and LLM call logging.
No manual migration step is needed. The application will not start serving requests until migrations complete successfully.
### Tables
The database contains the following tables:
| Table | Purpose |
|-------|---------|
| `users` | User accounts (email, display name, role) |
| `sessions` | Active sessions (hashed tokens, expiry) |
| `magic_link_tokens` | Passwordless login tokens |
| `user_settings` | Per-user configuration (provider, model, batch size, etc.) |
| `sources` | User-defined news sources (URLs, titles, themes) |
| `syntheses` | Generated synthesis results (sections as JSONB) |
| `admin_providers` | Admin-curated LLM providers and models |
| `admin_rate_limits` | Admin-configured rate limits per provider |
| `user_api_keys` | Encrypted LLM API keys |
| `audit_log` | Admin action audit trail |
| `article_history` | Previously seen article URLs for dedup |
| `llm_call_log` | LLM API call logs (prompts, responses, timing) |
| `themes` | User-defined synthesis themes (topic, categories, settings) |
| `theme_schedules` | Automated generation schedules per theme |
---
## Background Tasks
The application starts two background tasks automatically on startup. No external cron or scheduler is needed.
### Session Cleanup (hourly)
Every hour, a background task deletes expired sessions from the `sessions` table, which keeps the table from growing without bound. The task logs the number of deleted sessions.
### Scheduled Synthesis Generation (every 60 seconds)
Every 60 seconds, a background task checks for due theme schedules (matching the current day of the week and time in UTC). For each due schedule, it:
1. Runs the synthesis generation pipeline for the associated theme
2. Sends the result via email to the configured recipients (up to 3)
3. Marks the schedule as run (updates `last_run_at`) to prevent re-execution on the same day
This is a single-instance scheduler -- it does not use distributed locks. Do not run multiple instances of the application if scheduled generation is enabled (it would cause duplicate executions).
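
The due check described above can be sketched with plain integer arithmetic on Unix timestamps. This is an illustration under stated assumptions, not the backend's scheduler: the struct layout and the names `weekday` and `minute_of_day` are hypothetical (only `last_run_at` comes from this guide), and the sketch assumes a schedule becomes due once its scheduled minute has passed on the matching UTC weekday.

```rust
/// A weekly schedule entry. Field names are illustrative, except
/// `last_run_at`, which this guide mentions.
struct Schedule {
    weekday: u64,             // 0 = Sunday ... 6 = Saturday (UTC)
    minute_of_day: u64,       // e.g. 8 * 60 + 30 for 08:30 UTC
    last_run_at: Option<u64>, // Unix seconds of the previous run, if any
}

const SECS_PER_DAY: u64 = 86_400;

/// Returns true when the schedule should fire at `now` (Unix seconds, UTC).
fn is_due(s: &Schedule, now: u64) -> bool {
    let day = now / SECS_PER_DAY;
    // 1970-01-01 was a Thursday, which is index 4 when Sunday = 0.
    let weekday = (day + 4) % 7;
    let minute = (now % SECS_PER_DAY) / 60;
    // A run recorded earlier on the same UTC day blocks re-execution.
    let already_ran_today = s.last_run_at.map_or(false, |t| t / SECS_PER_DAY == day);
    weekday == s.weekday && minute >= s.minute_of_day && !already_ran_today
}
```

Because the "already ran today" guard lives in a single row update, two application instances polling at the same time could both see the schedule as due, which is why the single-instance restriction matters.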
---
## Monitoring
### Health Check
The `/api/v1/health` endpoint returns HTTP 200 when the application is running and can serve requests. It is used by:
- Docker's built-in health check (configured in both `docker-compose.yml` and `Dockerfile`)
- External monitoring tools
```bash
curl -f http://localhost:8080/api/v1/health
```
### Logs
The application uses structured logging via the `tracing` crate. Log level is controlled by the `RUST_LOG` environment variable.
Recommended production setting:
```
RUST_LOG=info,ai_synth_backend=debug
```
This provides:
- `info` level for all crates (HTTP requests, startup/shutdown, background tasks)
- `debug` level for the application code (detailed pipeline progress, LLM call timing)
Logs go to stdout, which Docker captures and makes available via `docker logs ai-synth`.
To view logs:
```bash
docker logs ai-synth # all logs
docker logs ai-synth --tail 100 # last 100 lines
docker logs ai-synth -f # follow live
```
---
## Backup
### Database
The PostgreSQL data volume (`postgres_data`) is the only stateful component. Back it up regularly:
```bash
# Dump the database
docker exec ai-synth-db pg_dump -U ai_synth ai_synth > backup_$(date +%Y%m%d).sql
# Restore from a dump
cat backup_20260327.sql | docker exec -i ai-synth-db psql -U ai_synth ai_synth
```
### No File Storage
The application does not store files on disk. All data (syntheses, settings, API keys, article history) lives in PostgreSQL. The frontend is served from static files baked into the Docker image.
### Encryption Key
The `MASTER_ENCRYPTION_KEY` is critical. If lost, all user API keys stored in the database become permanently unreadable. Store it securely (e.g., in a secrets manager) and include it in your disaster recovery plan.
---
## Updating
To update to a new version:
```bash
# 1. Pull the latest code
git pull
# 2. Rebuild the Docker image and restart
docker compose up -d --build
```
This will:
1. Rebuild the Docker image (frontend build + Rust compilation)
2. Restart the `app` container with the new image
3. Automatically run any new migrations on startup
4. Leave the `db` container untouched (data persists in the named volume)
The restart causes a brief downtime (typically 10-30 seconds for the health check to pass). For zero-downtime deployments, consider running behind a reverse proxy with health-check-based routing.
---
## Security Checklist
Before deploying to production, verify:
- [ ] **`MASTER_ENCRYPTION_KEY`** is set to a random 64-hex-character value (not the example value), generated with `openssl rand -hex 32`, stored securely, and backed up.
- [ ] **`POSTGRES_PASSWORD`** is set to a strong random password.
- [ ] **HTTPS** is configured. Set `APP_URL` to an `https://` URL. The application sets `Secure` on cookies when `APP_URL` starts with `https`. Use a reverse proxy (nginx, Caddy, Traefik) to terminate TLS.
- [ ] **Turnstile** keys are configured. Without them, the registration and login forms will not work (captcha is required).
- [ ] **Resend** API key is configured with a verified sending domain.
- [ ] **`SKIP_SSRF_CHECK`** is NOT set. This env var disables SSRF protection and should only be used in test environments.
- [ ] **Postgres** is not exposed to the internet. The docker-compose binds it to `127.0.0.1:5432` by default.
- [ ] **Docker socket** is not exposed. The app does not need Docker access.
- [ ] **Firewall** allows inbound traffic only on the app port (8080 or whichever port is mapped).
- [ ] **Reverse proxy** is configured to forward `X-Forwarded-For` and `X-Forwarded-Proto` headers if the app is behind a proxy.
### Security Features (Built-in)
The application includes the following security measures that require no additional configuration:
- **AES-256-GCM encryption** for user LLM API keys at rest (per-key random nonces)
- **SSRF prevention** in the web scraper (DNS resolution checks, private IP blocking, redirect validation)
- **CSRF protection** via `X-Requested-With` header on all mutating API endpoints
- **Session cookies**: `HttpOnly`, `SameSite=Lax`, `Secure` (when HTTPS)
- **Security headers**: CSP, X-Frame-Options (DENY), X-Content-Type-Options (nosniff), Referrer-Policy, HSTS (when HTTPS)
- **Anti-enumeration**: Same response for existent/non-existent emails in auth flows
- **Error sanitization**: Internal errors and API key patterns are stripped from client-facing error messages
- **Rate limiting**: Configurable per-provider rate limits for LLM API calls
- **Non-root container**: The Docker image runs as `appuser`
- **Graceful shutdown**: SIGTERM/Ctrl+C triggers clean shutdown with database pool closure
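
To make the session cookie rule concrete, here is a small std-only sketch of how the attributes listed above could be assembled. The cookie name `session` and the function name are assumptions made for illustration; the backend's actual cookie code may differ.

```rust
/// Build a `Set-Cookie` header value for a session token.
/// The cookie name `session` is an assumption, not taken from the backend.
fn session_cookie(token: &str, app_url: &str) -> String {
    let mut cookie = format!("session={token}; HttpOnly; SameSite=Lax");
    // `Secure` is only added when the public URL is served over HTTPS.
    if app_url.starts_with("https") {
        cookie.push_str("; Secure");
    }
    cookie
}
```

With `APP_URL=http://localhost:8080` the `Secure` attribute is absent, which is one reason the checklist asks for an `https://` URL in production.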

@ -0,0 +1,340 @@
# Development Guidelines
## Getting Started
### Prerequisites
- **Rust** (stable, 1.88+) with `cargo`
- **Node.js** (22+) with `npm`
- **PostgreSQL** (17+) -- or Docker to run it
- **Docker** and `docker compose` for containerized development
### Local Development Setup
1. **Start a Postgres instance.** The easiest way is via the test compose file:
```bash
cd e2e && docker compose -f docker-compose.test.yml up -d db
```
This starts Postgres on port 5433 with user `ai_synth_test` / password `testpassword`.
2. **Run the backend:**
```bash
cd backend
export DATABASE_URL=postgres://ai_synth_test:testpassword@127.0.0.1:5433/ai_synth_test
cargo run -- serve
```
Migrations run automatically on startup.
3. **Run the frontend (dev server with hot reload):**
```bash
cd frontend
npm install
npm run dev
```
The Vite dev server proxies `/api` requests to the backend on port 8080.
4. **Create an admin user:**
```bash
cd backend && cargo run -- create-admin admin@example.com
```
### Environment Variables
Copy `.env.example` to `.env` and fill in the values. The critical ones for local dev are `DATABASE_URL`, `MASTER_ENCRYPTION_KEY` (64 hex chars -- generate with `openssl rand -hex 32`), and `APP_URL`.
---
## Project Structure
```
ai_synth/
  backend/                Rust/Axum backend
    src/
      main.rs             Entry point, CLI (serve, create-admin)
      router.rs           All API routes + middleware
      app_state.rs        Shared application state (Arc-wrapped)
      errors.rs           Unified AppError enum
      config.rs           Environment config parsing
      handlers/           HTTP handlers (thin: validate, delegate, respond)
      services/           Business logic (auth, synthesis pipeline, LLM providers, scraper, etc.)
      db/                 Database queries (sqlx, parameterized)
      models/             Data types + validation
      middleware/         Auth session extraction, CSRF check
      util/               Token generation, hashing
    migrations/           SQL migrations (30 files, auto-run on startup)
    tests/                Integration tests (require Postgres)
  frontend/               SolidJS + Tailwind CSS v4
    src/
      App.tsx             Router, layouts, route guards
      pages/              Page-level components
      components/         Reusable components (Button, LoadingSpinner, settings/*)
      api/                API clients (one file per resource)
      contexts/           AuthContext (session-based)
      i18n/               French translations
      utils/              SSE client, date formatting, provider info
      types.ts            All TypeScript domain types
  e2e/                    Playwright E2E tests
    tests/                Test specs
    helpers/              Auth helpers, DB access
    seed.ts               Test data seeder
  scripts/                Test runner scripts
  docs/                   Architecture reports, plans, specs
```
### Layer Architecture (Backend)
```
handlers/ (HTTP layer) --> services/ (business logic) --> db/ (data access)
      |                              |
      models/ (shared types) <-------+
      |
      errors.rs
```
- **Handlers** are thin: validate input, call services/db, format responses.
- **Services** contain business logic. The `LlmProvider` trait and synthesis pipeline live here.
- **DB** modules contain pure SQL queries returning typed results. No business logic.
- **Models** define data types and validation. Shared across layers.
---
## Coding Standards
### Rust
#### Error Handling
All errors flow through the unified `AppError` enum (in `backend/src/errors.rs`):
```rust
// Display messages (e.g. `#[error("...")]` attributes) are not shown in this listing.
#[derive(Debug, thiserror::Error)]
pub enum AppError {
    NotFound(String),        // 404
    Unauthorized(String),    // 401
    Forbidden(String),       // 403
    BadRequest(String),      // 400
    Validation(String),      // 422
    Internal(anyhow::Error), // 500 -- details logged, not exposed to client
    RateLimited(String),     // 429
}
```
Key rules:
- **Never use `unwrap()` in production code.** Use `?`, `ok_or_else`, `map_err`, or `unwrap_or_default` with appropriate logging. `unwrap()` is only acceptable in `#[cfg(test)]` blocks and `LazyLock` static initializers.
- **`AppError::Internal` hides details** from the client. The full error is logged via `tracing::error!` but the response body only contains `"An internal error occurred"`.
- **`From<sqlx::Error>` and `From<anyhow::Error>`** conversions are implemented, so you can use `?` with both types.
- **Validation errors** should use `AppError::Validation(message)` (returns 422).
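
The rules above can be illustrated with a reduced, std-only stand-in for `AppError`. This sketch is not the project's `errors.rs`: it keeps three variants, replaces `anyhow::Error` with `String`, and uses the hypothetical function `stored_batch_size` to show `?` converting a lower-level error through `From`, while the client only ever sees the generic message for internal failures.

```rust
use std::num::ParseIntError;

/// Reduced stand-in for the project's `AppError` (three variants only).
#[derive(Debug)]
enum AppError {
    NotFound(String),
    Validation(String),
    Internal(String), // the real enum wraps `anyhow::Error`
}

impl AppError {
    /// HTTP status code for each variant.
    fn status(&self) -> u16 {
        match self {
            AppError::NotFound(_) => 404,
            AppError::Validation(_) => 422,
            AppError::Internal(_) => 500,
        }
    }

    /// Message that is safe to return to the client.
    fn client_message(&self) -> &str {
        match self {
            AppError::NotFound(m) | AppError::Validation(m) => m.as_str(),
            // Internal details stay in the logs, never in the response.
            AppError::Internal(_) => "An internal error occurred",
        }
    }
}

impl From<ParseIntError> for AppError {
    fn from(e: ParseIntError) -> Self {
        AppError::Internal(e.to_string())
    }
}

/// Hypothetical example: read a batch size from a stored string.
/// A malformed stored value is a server-side fault (500);
/// an out-of-range value is a validation failure (422).
fn stored_batch_size(raw: &str) -> Result<u32, AppError> {
    let n: u32 = raw.trim().parse()?; // `?` uses the `From` impl above
    if n == 0 {
        return Err(AppError::Validation("batch size must be positive".into()));
    }
    Ok(n)
}
```

No `unwrap()` is needed anywhere in the function body: every failure path becomes an `AppError` with a well-defined status.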
#### Arc Usage
`Arc` is used to share data across `tokio::spawn` boundaries. Common patterns:
- `Arc<dyn LlmProvider>` for the LLM provider (shared across classify tasks)
- `Arc<AtomicBool>` for cancellation flags
- `Arc<watch::Sender<ProgressEvent>>` for SSE progress channels
- `Arc<String>`, `Arc<Vec<String>>`, `Arc<Value>` for data shared with spawned tasks
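
The `Arc<AtomicBool>` cancellation pattern can be sketched without any async runtime. The example below uses `std::thread` instead of `tokio::spawn` so that it stays self-contained; the function `run_workers` is hypothetical, but the sharing mechanics (clone the `Arc`, move the clone into the task, poll the flag) are the same idea.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawn `workers` threads that each process `items` units of work
/// unless cancellation is requested. Returns the units actually processed.
fn run_workers(workers: usize, items: usize, cancel: Arc<AtomicBool>) -> usize {
    let done = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread owns its own handle to the shared flag and counter.
            let cancel = Arc::clone(&cancel);
            let done = Arc::clone(&done);
            thread::spawn(move || {
                for _ in 0..items {
                    if cancel.load(Ordering::Relaxed) {
                        return; // stop early once cancellation is requested
                    }
                    done.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    done.load(Ordering::Relaxed)
}
```

Cloning an `Arc` only bumps a reference count, so handing a clone to every spawned task is cheap compared with copying the underlying data.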
#### Auth Middleware Pattern
Authentication uses Axum extractors (in `backend/src/middleware/auth.rs`):
- **`AuthUser`**: Reads the session cookie, looks up the session in the DB, checks expiration, loads the user. Any handler that takes `AuthUser` as a parameter automatically rejects unauthenticated requests with 401.
- **`AdminUser(AuthUser)`**: Wraps `AuthUser` and additionally checks `UserRole::Admin`. Returns 403 if not admin.
To require authentication, simply add the extractor to your handler signature:
```rust
async fn my_handler(auth: AuthUser, State(state): State<AppState>) -> Result<..., AppError> { ... }
```
For admin-only endpoints:
```rust
async fn admin_handler(admin: AdminUser, State(state): State<AppState>) -> Result<..., AppError> { ... }
```
#### Other Rust Conventions
- All SQL queries use parameterized bindings (`$1`, `$2`) via sqlx. Never interpolate strings into SQL.
- Prefer `tracing::info!`, `tracing::warn!`, `tracing::error!` over `println!`.
- Code comments and log messages are in English. User-facing strings are in French (via the i18n system).
- Module-level `//!` doc comments on every file; function-level `///` doc comments on public items.
### Frontend (SolidJS)
#### Reactive Primitives
- Use `createSignal` for local component state.
- Use `createResource` for async data that should auto-refetch (preferred over `createEffect` + manual fetch).
- Use `createMemo` for derived/computed values.
- Use `createEffect` for side effects that need to react to signal changes.
- Always use `onCleanup` to clear timers, close connections, and cancel subscriptions.
#### Component Patterns
- Use the `Button` component (`components/ui/Button.tsx`) with `variant`/`loading`/`icon` props instead of raw `<button>` elements with inline Tailwind classes.
- Use `<Switch>/<Match>` for mutually exclusive conditional rendering instead of multiple adjacent `<Show>` blocks.
- Use `<For each={...}>` for list rendering.
- Use the `useToast` context for user feedback (success/error notifications).
#### i18n
All user-facing strings go through the translation system in `frontend/src/i18n/fr.ts`. Use the `t()` function:
```tsx
import { t } from '~/i18n/fr';
// ...
<p>{t('settings.saved')}</p>
```
Never hardcode French strings directly in JSX.
#### TypeScript
- `tsconfig.json` has `strict: true`. No escape hatches.
- Domain types live in `frontend/src/types.ts`. Import them from there.
- API clients use generics for type safety (`get<T>`, `post<T>`, etc.).
- Use the `isApiError` type guard from `types.ts` in catch blocks.
#### Import Conventions
All imports use the `~/` alias (configured in Vite). No relative path imports across directories.
---
## Common Patterns
### Adding a New Setting
Follow this sequence when adding a new user-configurable setting:
1. **Migration**: Create a new SQL migration in `backend/migrations/` that adds the column with a default value:
```sql
ALTER TABLE user_settings ADD COLUMN my_new_setting INTEGER NOT NULL DEFAULT 5;
```
Naming: `YYYYMMDD00000N_add_my_new_setting.sql`
2. **Model** (`backend/src/models/settings.rs`): Add the field to both `UserSettings` and `UpdateSettingsRequest`. Add validation in `UpdateSettingsRequest::validate()`.
3. **DB** (`backend/src/db/settings.rs`): Update the `get_or_create_default` and `update` queries to include the new column.
4. **Frontend types** (`frontend/src/types.ts`): Add the field to `UserSettings` and `UpdateSettingsPayload`. Also update `DEFAULT_SETTINGS` in `Settings.tsx`.
5. **i18n** (`frontend/src/i18n/fr.ts`): Add translation keys for the label, description, and any validation messages.
6. **Settings UI** (`frontend/src/pages/Settings.tsx`): Add the form control. Use the appropriate input type (number, checkbox, select, etc.).
7. **Important**: The `PUT /settings` endpoint requires the **complete** settings payload (not a partial update). The frontend must always send all fields. If you add a field, update the `DEFAULT_SETTINGS` object to include it with a sensible default.
### Adding a New API Endpoint
1. **Handler** (`backend/src/handlers/`): Create the handler function. Use `AuthUser` or `AdminUser` extractors as needed. Return `Result<impl IntoResponse, AppError>`.
2. **Router** (`backend/src/router.rs`): Register the route. Place it in the correct section (public, authenticated, admin). Watch for path parameter conflicts -- more specific routes must be registered before generic `{id}` routes.
3. **Integration tests** (`backend/tests/`): Write tests covering:
- Happy path (200/201/204)
- Auth required (401 without session)
- Validation errors (422 for bad input)
- Not found (404 for missing resources)
- Ownership isolation (user A cannot access user B's resources)
- Admin-only access (403 for non-admin if applicable)
4. **Frontend**: Add the API client function in the appropriate `frontend/src/api/` file. Add TypeScript types if needed.
### Adding a New LLM Provider
The `LlmProvider` trait (in `backend/src/services/llm/mod.rs`) defines the contract:
```rust
#[async_trait]
pub trait LlmProvider: Send + Sync {
    fn provider_id(&self) -> &str;
    async fn call_llm(&self, model: &str, system_prompt: &str, user_prompt: &str, response_schema: &Value) -> Result<Value, AppError>;
}
```
Steps:
1. **Implement the trait**: Create `backend/src/services/llm/my_provider.rs`. Implement `LlmProvider`. Use `map_provider_http_error()` from `llm/mod.rs` for HTTP status mapping.
2. **Register in the module**: Add `pub mod my_provider;` to `backend/src/services/llm/mod.rs`.
3. **Add to the factory** (`backend/src/services/llm/factory.rs`): Add a match arm in `create_provider()`:
```rust
"my_provider" => Ok(Arc::new(MyProvider::new(api_key, http_client))),
```
4. **Add factory tests**: Test that `create_provider("my_provider", ...)` returns the correct provider.
5. **Admin setup**: The admin must add the provider via the admin UI (`/admin/providers`) with its available models before users can select it.
No changes to the pipeline are needed -- it uses the `LlmProvider` trait polymorphically.
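
As a rough picture of steps 3 and 4, the sketch below shows a factory with one match arm per provider and the kind of assertions a factory test might make. It is deliberately simplified: the trait is a synchronous stand-in with only `provider_id`, the error type is `String` rather than `AppError`, and the `create_provider` signature here is an assumption, not the one in `factory.rs`.

```rust
use std::sync::Arc;

/// Simplified, synchronous stand-in for the `LlmProvider` trait.
trait LlmProvider: Send + Sync {
    fn provider_id(&self) -> &str;
}

struct MyProvider {
    #[allow(dead_code)] // the key is stored but not used in this sketch
    api_key: String,
}

impl LlmProvider for MyProvider {
    fn provider_id(&self) -> &str {
        "my_provider"
    }
}

/// Map a provider ID to an implementation; unknown IDs are an error.
fn create_provider(id: &str, api_key: &str) -> Result<Arc<dyn LlmProvider>, String> {
    match id {
        "my_provider" => Ok(Arc::new(MyProvider { api_key: api_key.to_string() })),
        other => Err(format!("unknown provider: {other}")),
    }
}
```

A factory test would then check that `create_provider("my_provider", ...)` yields a provider whose `provider_id()` is `"my_provider"`, and that an unknown ID returns an error instead of panicking.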
---
## Git Workflow
### Commit Messages
Follow the conventional commits format used in this project:
```
type: short description

Longer explanation if needed.
```
Types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore`.
Examples from the repo:
- `fix: rewrite pass schema uses actual scraped item counts, not max setting`
- `fix: filter empty scraped articles + restore URLs after rewrite + E2E assertions`
- `docs: add spec and plan for source priority pipeline redesign`
### Rules
- Never force push to `master`.
- Create feature branches for non-trivial changes.
- Keep commits focused -- one logical change per commit.
---
## Common Pitfalls
### Drop Deadlock in Tests
The `TestApp` struct in `backend/tests/common/mod.rs` uses a `Drop` implementation that spawns a background thread to clean up the test database. **Do not call `.join()` on this thread** -- it causes a deadlock because the spawned thread's `block_on` conflicts with the existing tokio runtime's connection pool.
The `Drop` implementation fires and forgets the cleanup thread intentionally. For explicit cleanup, call `app.cleanup().await` at the end of the test instead.
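
The fire-and-forget shape can be sketched with the standard library alone. The `Fixture` type below is a hypothetical stand-in for `TestApp`, and the channel exists only so the example can observe that cleanup ran; the point is that `drop` discards the `JoinHandle` and therefore never blocks.

```rust
use std::sync::mpsc::Sender;
use std::thread;

/// Stand-in for a test fixture whose cleanup runs on a background thread.
struct Fixture {
    done: Sender<&'static str>,
}

impl Drop for Fixture {
    fn drop(&mut self) {
        let done = self.done.clone();
        // Fire and forget: the JoinHandle is discarded, so `drop` returns
        // immediately instead of waiting for the cleanup to finish.
        thread::spawn(move || {
            // Real cleanup (dropping the test database) would go here.
            let _ = done.send("cleaned up");
        });
    }
}
```

One consequence of this design is that cleanup started from `drop` may not finish if the process exits first, which is another reason to prefer the explicit `cleanup().await` call when a test needs the database gone.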
### SSRF Bypass Environment Variable
The `SKIP_SSRF_CHECK=1` environment variable disables all SSRF protection in the scraper. It exists for integration tests (which use wiremock on localhost). **Never set this in production.** The `scripts/run-integration-tests.sh` script sets it automatically.
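
To show what the bypass switches off, here is a minimal sketch of a private-address check using `std::net`. It is not the scraper's implementation: the function names are hypothetical, the `skip_check` parameter merely mirrors the effect of `SKIP_SSRF_CHECK=1`, and the list of blocked ranges is illustrative rather than exhaustive.

```rust
use std::net::IpAddr;

/// Returns true for addresses a scraper should refuse to contact.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => {
            // fc00::/7 is the IPv6 unique-local range.
            let unique_local = (v6.segments()[0] & 0xfe00) == 0xfc00;
            v6.is_loopback() || v6.is_unspecified() || unique_local
        }
    }
}

/// Decide whether a resolved address may be fetched.
/// `skip_check` mirrors the test-only bypass switch.
fn may_fetch(ip: IpAddr, skip_check: bool) -> bool {
    skip_check || !is_blocked_ip(ip)
}
```

With the bypass enabled, `127.0.0.1` becomes fetchable, which is exactly what wiremock-based tests need and exactly what production must never allow.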
### Settings Payload Completeness
The `PUT /settings` endpoint requires the **complete** settings object, not a partial update. If you send a payload missing a field, the request will fail with a deserialization error. When writing integration tests, always include every field in the settings JSON. When adding a new setting field, update all existing test payloads.
### Pipeline Test Requirements
Pipeline integration tests require:
- A running Postgres instance (via `TEST_DATABASE_URL`)
- `SKIP_SSRF_CHECK=1` (to allow wiremock on localhost)
- Wiremock for mocking HTTP responses from source websites
- `MockLlmProvider` for deterministic LLM responses
The mock provider identifies call types by inspecting the system prompt content (e.g., looking for French keywords like "classer"). If you change prompt wording, the mock may need updating.
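
The fragility of keyword matching is easy to see in a small sketch. The enum and function below are hypothetical and much simpler than `MockLlmProvider`; the only detail taken from this guide is the keyword "classer". Case-insensitive matching is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum CallKind {
    Classify,
    Other,
}

/// Guess the call type from the system prompt by keyword.
fn call_kind(system_prompt: &str) -> CallKind {
    if system_prompt.to_lowercase().contains("classer") {
        CallKind::Classify
    } else {
        CallKind::Other
    }
}
```

Rewording a prompt from "classer" to a synonym such as "trier" would silently route the call to the fallback branch, so prompt edits and mock updates should land in the same commit.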
### Gemini API Key in URL
The Gemini provider places the API key in the URL query string (`?key=...`). The error handler avoids logging the full URL, but intermediary proxies or debug-level logging could expose it. Be aware of this when configuring logging levels.
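
One way to reduce the exposure is to redact the `key` parameter before a URL reaches any log line. The helper below is a hypothetical, std-only sketch; it is not part of the backend, it ignores URL fragments, and the URL used with it should be treated as a placeholder rather than the real Gemini endpoint.

```rust
/// Replace the value of the `key` query parameter with a placeholder.
fn redact_key(url: &str) -> String {
    // URLs without a query string are returned unchanged.
    let Some((base, query)) = url.split_once('?') else {
        return url.to_string();
    };
    let redacted: Vec<String> = query
        .split('&')
        .map(|pair| match pair.split_once('=') {
            Some(("key", _)) => "key=REDACTED".to_string(),
            _ => pair.to_string(),
        })
        .collect();
    format!("{base}?{}", redacted.join("&"))
}
```

This only protects the application's own logs; intermediary proxies still see the full URL.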

@ -0,0 +1,375 @@
# QA Guidelines
## Test Inventory
| Type | Count | Status | Location |
|------|-------|--------|----------|
| Backend unit tests | 358 | All passing | `backend/src/**/*.rs` (inline `#[cfg(test)]`) |
| Backend integration tests | 183 | All passing | `backend/tests/*.rs` |
| Frontend unit tests | 141 | 131 passing, 10 failing | `frontend/src/**/*.test.{ts,tsx}` |
| E2E tests (Playwright) | 7 | All passing | `e2e/tests/*.spec.ts` |
| **Total** | **689** | | |
### Backend Unit Test Breakdown
| Source file | Tests | Coverage area |
|---|---|---|
| `services/scraper.rs` | 74 | SSRF IP checks, soft-404, redirect, HTML parsing |
| `services/synthesis.rs` | 36 | Pipeline logic, schema building, category overflow |
| `services/llm/anthropic.rs` | 20 | Response parsing, error handling |
| `services/prompts.rs` | 18 | Prompt template generation |
| `services/csv.rs` | 18 | CSV parsing, serialization |
| `models/synthesis.rs` | 16 | Model validation, serialization |
| `services/rate_limiter.rs` | 15 | Token bucket, concurrency |
| `services/llm/openai.rs` | 13 | Response parsing, error handling |
| `models/source.rs` | 12 | URL / title validation |
| `models/settings.rs` | 12 | Settings validation, defaults |
| `services/export.rs` | 12 | Markdown / PDF rendering |
| `services/llm/gemini.rs` | 10 | Response parsing, error handling |
| `models/provider.rs` | 10 | Provider / model validation |
| `services/email.rs` | 9 | Email rendering, bypass mode |
| `services/encryption.rs` | 8 | AES-256-GCM encrypt/decrypt |
| `services/source_scraper.rs` | 8 | Link extraction, is_article filter |
| `services/llm/schema.rs` | 8 | JSON schema generation |
| `util/token.rs` | 8 | Token generation, hashing |
| `models/api_key.rs` | 8 | API key validation |
| `middleware/csrf.rs` | 7 | CSRF header check |
| `models/rate_limit.rs` | 6 | Rate limit model validation |
| `config.rs` | 6 | Config parsing |
| `middleware/auth.rs` | 5 | Session extraction |
| `services/llm/factory.rs` | 5 | Provider factory |
| `handlers/admin.rs` | 4 | Admin handler validation |
### Backend Integration Test Breakdown
| File | Tests | Coverage area |
|---|---|---|
| `api_sources_test.rs` | 36 | Sources CRUD, validation, CSV, bulk import, max limit |
| `api_admin_test.rs` | 30 | Provider CRUD, rate limits, user management, audit log |
| `api_keys_test.rs` | 18 | API key CRUD, encryption, ownership, test endpoint |
| `api_syntheses_test.rs` | 17 | Synthesis CRUD, pagination, ownership, generation trigger |
| `api_auth_test.rs` | 16 | Register, login, verify, logout, session |
| `api_export_test.rs` | 13 | Email send, Markdown export, PDF export |
| `api_themes_test.rs` | 10 | Theme CRUD, validation, ownership |
| `api_schedules_test.rs` | 9 | Schedule CRUD, validation, ownership |
| `api_settings_test.rs` | 7 | Settings CRUD, defaults, boundary values |
| `pipeline_test.rs` | 6 | Phase 1 extraction, Phase 2 search, overflow, diversity, dedup, preferred |
| `api_article_history_test.rs` | 4 | History list, clear, provenance |
| `api_csrf_test.rs` | 4 | CSRF header enforcement |
| `api_stop_generation_test.rs` | 4 | Stop job, ownership, 404 |
| `api_llm_logs_test.rs` | 3 | LLM logs auth, 404, happy path |
| `api_sources_preferred_test.rs` | 3 | Preferred sources set/clear/auth |
| `minimal_test.rs` | 2 | Infrastructure sanity |
| `api_health_test.rs` | 1 | Health check |
### E2E Test Breakdown
| File | Coverage area |
|---|---|
| `registration.spec.ts` | Full magic link registration flow |
| `settings.spec.ts` | Settings persistence across reloads |
| `settings-export.spec.ts` | Settings export/import roundtrip |
| `sources.spec.ts` | Source CRUD + preferred sources via API |
| `themes.spec.ts` | Theme CRUD + schedule CRUD via API |
| `admin-providers.spec.ts` | Admin provider management, settings dropdown |
| `generation-live.spec.ts` | Full pipeline with real OpenAI key (gated on `OPENAI_TEST_API_KEY`) |
---
## Running Tests
### Backend Unit Tests
No database required:
```bash
cd backend && cargo test --lib
```
### Backend Integration Tests
Requires a running Postgres instance. Use the helper script:
```bash
./scripts/run-integration-tests.sh # all tests
./scripts/run-integration-tests.sh --test pipeline_test # one test file
./scripts/run-integration-tests.sh --test api_admin_test config_providers # one test by name
./scripts/run-integration-tests.sh --lib # unit tests only
./scripts/run-integration-tests.sh --db-check # just check DB connectivity
```
The script automatically:
- Starts the test Postgres container on port 5433 (via `e2e/docker-compose.test.yml`)
- Sets `TEST_DATABASE_URL` and `SKIP_SSRF_CHECK=1`
- Runs `cargo test` with the specified arguments
Manual equivalent:
```bash
cd e2e && docker compose -f docker-compose.test.yml up -d db
cd ../backend
export TEST_DATABASE_URL=postgres://ai_synth_test:testpassword@127.0.0.1:5433/ai_synth_test
export SKIP_SSRF_CHECK=1
cargo test
```
### Frontend Unit Tests
```bash
cd frontend && npx vitest run
```
Type checking (no tests, just compiler verification):
```bash
cd frontend && npx tsc --noEmit
```
### E2E Tests (Playwright)
Use the helper script, which builds the Docker image, starts the full stack, seeds the database, and runs Playwright:
```bash
./scripts/run-e2e-tests.sh # all E2E tests
./scripts/run-e2e-tests.sh --headed # with browser visible
./scripts/run-e2e-tests.sh generation-live # specific test file
```
The script:
1. Builds the test Docker image (`docker compose -f docker-compose.test.yml build`)
2. Starts the full stack (app + Postgres)
3. Waits for the app health check to pass
4. Installs npm dependencies and Playwright browsers
5. Seeds the test database (`npx tsx seed.ts`)
6. Runs Playwright tests
7. Cleans up on exit (stops containers, removes volumes)
The `generation-live.spec.ts` test requires `OPENAI_TEST_API_KEY` to be set (in `e2e/.env.test` or environment). It exercises the real pipeline with an actual LLM API call.
---
## Test Infrastructure
### TestApp (Backend Integration Tests)
`backend/tests/common/mod.rs` provides the `TestApp` struct, which is the foundation for all integration tests.
**What it does:**
- Creates a unique temporary Postgres database per test (named `ai_synth_test_{uuid}`)
- Runs all migrations
- Builds the full Axum router with test configuration (bypassed Turnstile and Resend)
- Provides request helpers: `get`, `post`, `get_with_session`, `post_with_session`, `put_with_session`, `delete_with_session`, `raw_request_text`, `raw_request_bytes`
- Provides auth helpers: `create_test_user`, `create_authenticated_user`, `create_admin_user`, `register_user_via_api`, `create_magic_link_for_email`
- Provides `insert_test_synthesis` for creating test data without running the pipeline
- Handles cleanup via `Drop` (fire-and-forget) or explicit `cleanup().await`
**Request helpers** automatically:
- Set `Content-Type: application/json` for requests with a body
- Set `X-Requested-With: XMLHttpRequest` (CSRF header) for mutating methods (POST, PUT, DELETE, PATCH)
- Set the session cookie when `session_cookie` is provided
- Parse the response body as JSON (or return `{}` for empty bodies)
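
The CSRF rule that the request helpers satisfy can be sketched as a pure function. This is an illustration, not the code in `middleware/csrf.rs`: the function name is hypothetical, and the sketch assumes that only the presence of the `X-Requested-With` header is checked, not its value.

```rust
/// Returns true when a request passes the CSRF header rule:
/// mutating methods must carry an `X-Requested-With` header.
fn passes_csrf(method: &str, headers: &[(&str, &str)]) -> bool {
    let mutating = matches!(method, "POST" | "PUT" | "DELETE" | "PATCH");
    if !mutating {
        return true; // safe methods are not subject to the check
    }
    // Header names are case-insensitive.
    headers
        .iter()
        .any(|(name, _)| name.eq_ignore_ascii_case("x-requested-with"))
}
```

A test that builds a raw mutating request without the helpers needs to add this header itself, otherwise it exercises the CSRF rejection rather than the handler.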
**Usage pattern:**
```rust
#[tokio::test]
async fn my_test() {
    let app = TestApp::new().await;
    let (user_id, session) = app.create_authenticated_user("user@test.com").await;
    let (status, body) = app.get_with_session("/api/v1/settings", &session).await;
    assert_eq!(status, StatusCode::OK);
    // ...assertions...
    app.cleanup().await;
}
```
### Wiremock (Pipeline Tests)
Pipeline integration tests use `wiremock` to mock HTTP responses from source websites. The mock server runs on localhost, which is why `SKIP_SSRF_CHECK=1` is required (otherwise the SSRF protection would block requests to localhost).
### MockLlmProvider
`backend/src/services/llm/mock.rs` provides a deterministic mock LLM provider for pipeline tests:
- Identifies the call type by inspecting French keywords in the system prompt; for example, a prompt containing "classer" (French for "to classify") yields a classify/summarize response
- Returns search responses with configurable URLs via `with_search_urls()`
- Uses a configurable default category via `with_default_category()`
Usage:
```rust
let mock = MockLlmProvider::new()
    .with_default_category("IA")
    .with_search_urls(vec!["https://example.com/article".into()])
    .into_arc();
```
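To illustrate how keyword-based call detection can work, here is a minimal std-only sketch. Only the "classer" keyword comes from the description above; the `CallKind` enum, the `detect_call_kind` function, the case-insensitive match, and the fallback to a search call are assumptions, not the actual `MockLlmProvider` code.

```rust
/// Kinds of LLM call a mock might distinguish. Hypothetical type.
#[derive(Debug, PartialEq, Eq)]
enum CallKind {
    ClassifySummarize,
    Search,
}

/// Guesses the call kind from the system prompt.
/// "classer" is the documented keyword; treating every other prompt
/// as a search call is an assumption made for this sketch.
fn detect_call_kind(system_prompt: &str) -> CallKind {
    if system_prompt.to_lowercase().contains("classer") {
        CallKind::ClassifySummarize
    } else {
        CallKind::Search
    }
}
```

A consequence of this design is that rewording the production prompts can silently change which mock response is returned, so prompt edits are worth re-running against the pipeline tests.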
### E2E Seed Data (seed.ts)
`e2e/seed.ts` seeds the database with known test data. It is idempotent (inserts use `ON CONFLICT DO NOTHING`) and sets up:
- **Admin user**: `admin@test.local` with a known session token
- **Regular user**: `user@test.local` with a known session token
- **Gemini provider**: Enabled for the test environment
Session tokens are SHA-256 hashed before insertion (matching the backend's hashing strategy).
### E2E Auth Helpers (auth.ts)
`e2e/helpers/auth.ts` provides:
- **`loginAsAdmin(page)`**: Injects the admin session cookie.
- **`loginAsUser(page)`**: Injects the regular user session cookie.
- **`registerAndVerify(page, email)`**: Full registration flow: calls the API to register, inserts a magic link token directly in the DB, navigates to the verify URL.
- **`createDbClient()`**: Returns a `pg.Client` connected to the test database.
---
## Writing Integration Tests
### Patterns
1. **Each test gets its own `TestApp`** (and therefore its own database). Tests are fully isolated.
2. **Create users via helpers**, not via the registration API (unless testing registration):
```rust
let (user_id, session) = app.create_authenticated_user("user@test.com").await;
```
3. **Test all access control paths** for every endpoint:
- 401 without authentication
- 403 for admin-only endpoints with a regular user
- 404 for accessing another user's resources (ownership isolation)
4. **Settings payload must be complete.** The `PUT /api/v1/settings` endpoint requires every field, so a settings update sent from a test must include all of them:
```rust
let settings = serde_json::json!({
    "max_articles_per_source": 3,
    "max_links_per_source": 10,
    "use_brave_search": false,
    "article_history_days": 30,
    "batch_size": 5,
    "source_extraction_window": 5,
    "search_agent_behavior": "",
    "ai_provider": "gemini",
    "ai_model": "gemini-2.5-flash",
    "ai_model_websearch": "gemini-2.5-flash",
    "rate_limit_max_requests": null,
    "rate_limit_time_window_seconds": null
});
```
5. **Use `post_without_csrf` to test CSRF rejection**, that is, to confirm that a mutating request sent without the `X-Requested-With` header is refused.
6. **Use `raw_request_text` / `raw_request_bytes`** for non-JSON responses (CSV exports, PDF exports).
7. **Always call `app.cleanup().await`** at the end of the test for deterministic cleanup.
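The access-control paths from pattern 3 can be written down as a small decision table, which is handy as a checklist when enumerating test cases for a new endpoint. This is a hypothetical, std-only model: `Caller`, `Resource`, and `expected_status` do not exist in the codebase, and the order of the checks is an assumption.

```rust
/// Who is making the request. Hypothetical type for this sketch.
#[derive(Clone, Copy)]
enum Caller {
    Anonymous,
    User { id: u32, is_admin: bool },
}

/// The properties of an endpoint that matter for access control.
struct Resource {
    admin_only: bool,
    /// `Some(id)` when the resource belongs to a specific user.
    owner_id: Option<u32>,
}

/// Expected HTTP status for each access-control path listed in pattern 3.
fn expected_status(caller: Caller, resource: &Resource) -> u16 {
    match caller {
        // No session at all.
        Caller::Anonymous => 401,
        Caller::User { id, is_admin } => {
            if resource.admin_only && !is_admin {
                // Authenticated, but not an admin.
                403
            } else if resource.owner_id.is_some_and(|owner| owner != id) {
                // Someone else's resource is reported as missing.
                404
            } else {
                200
            }
        }
    }
}
```

Each branch corresponds to one test that an endpoint's integration suite should contain.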
### Pipeline Tests
Pipeline integration tests in `pipeline_test.rs` use wiremock + MockLlmProvider:
1. Set up wiremock to serve a mock source page with article links
2. Set up wiremock to serve mock article pages
3. Configure user settings and sources pointing to wiremock URLs
4. Run the pipeline with `MockLlmProvider` via the `provider_override` parameter
5. Assert the resulting synthesis contains the expected categories and articles
---
## Writing E2E Tests
### Playwright Configuration
- Tests run against the Docker-composed stack on `http://localhost:8080`
- Single worker to avoid parallel DB state mutations
- Timeout: 30 seconds per test, 2 retries
- Screenshots on failure, traces on first retry
- Chromium browser only
### Patterns
1. **Use `loginAsAdmin` / `loginAsUser`** from `e2e/helpers/auth.ts` for authentication:
```typescript
import { test } from '@playwright/test';
import { loginAsUser } from '../helpers/auth';

test('my test', async ({ page }) => {
  await loginAsUser(page);
  await page.goto('/', { waitUntil: 'domcontentloaded' });
  // ...
});
```
2. **Use `waitUntil: 'domcontentloaded'`** instead of the default `load` for `page.goto()`. This avoids waiting for external resources (Turnstile scripts, fonts) that may not load in the test environment.
3. **Prefer API-based setup over UI interactions** for test data. Use `page.evaluate()` to call the API directly:
```typescript
await page.evaluate(async () => {
  await fetch('/api/v1/sources', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'X-Requested-With': 'XMLHttpRequest' },
    body: JSON.stringify({ title: 'Test', url: 'https://example.com', theme_id: '...' }),
  });
});
```
4. **Use `createDbClient()`** from `e2e/helpers/auth.ts` when you need to verify database state directly.
5. **The `generation-live.spec.ts` test** is gated on `OPENAI_TEST_API_KEY`. It exercises the full pipeline including provenance and LLM log verification.
---
## Known Limitations
### Drop Deadlock in TestApp
The `Drop` implementation for `TestApp` spawns a background thread that drops the test database. **Do not call `.join()` on this thread.** Joining deadlocks because the spawned thread creates a new tokio runtime whose `block_on` conflicts with the existing runtime's connection pool. The thread runs independently and cleans up asynchronously, so cleanup may still be in progress when the test returns. For deterministic cleanup, use `app.cleanup().await`.
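The fire-and-forget shape can be sketched with the standard library alone. In this sketch `TempDb`, its fields, and the channel used to observe completion are all hypothetical; the real `TestApp` drops a Postgres database inside the spawned thread instead of sending a message.

```rust
use std::sync::mpsc::Sender;
use std::thread;

/// Hypothetical stand-in for a test fixture that owns a temporary database.
struct TempDb {
    name: String,
    /// Only here so the sketch can observe that cleanup ran.
    done_tx: Sender<String>,
}

impl Drop for TempDb {
    fn drop(&mut self) {
        let name = self.name.clone();
        let done_tx = self.done_tx.clone();
        // Fire-and-forget: the JoinHandle is discarded, never joined.
        thread::spawn(move || {
            // A real fixture would drop the database here.
            let _ = done_tx.send(name);
        });
    }
}
```

Because nothing waits for the thread, the process may exit before it finishes, which is why an explicit asynchronous cleanup method is the more predictable option.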
### SSRF Bypass for Integration Tests
`SKIP_SSRF_CHECK=1` is set during integration tests so that wiremock (running on localhost) is not blocked by the SSRF protection. The variable is read at runtime, not at compile time, so a release binary honors it as well. Make sure it is never set in production.
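A runtime bypass of this kind can be sketched as follows. This is an illustrative, std-only sketch under stated assumptions: the function names are hypothetical, and `is_internal` checks only a few address classes, far fewer than the real SSRF protection.

```rust
use std::env;
use std::net::IpAddr;

/// True when the bypass is enabled. Read on every call, i.e. at runtime.
fn ssrf_check_skipped() -> bool {
    env::var("SKIP_SSRF_CHECK").map(|v| v == "1").unwrap_or(false)
}

/// Simplified stand-in for the real check: flags loopback, private,
/// and link-local IPv4 addresses plus the IPv6 loopback address.
fn is_internal(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4.is_loopback() || v4.is_private() || v4.is_link_local(),
        IpAddr::V6(v6) => v6.is_loopback(),
    }
}

/// Decides whether an outbound request to `ip` should be blocked.
/// `skip` is passed in so the decision is easy to test.
fn should_block(ip: IpAddr, skip: bool) -> bool {
    !skip && is_internal(ip)
}
```

A caller would combine the two as `should_block(ip, ssrf_check_skipped())`. With the bypass on, a wiremock server on `127.0.0.1` is reachable; with it off, the same request is refused.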
### Flaky generation-live Test
The `generation-live.spec.ts` test depends on a real OpenAI API call. It may fail due to:
- API rate limits
- Slow responses exceeding the 30-second timeout
- Changes in model behavior affecting output format
It is configured with 2 retries to mitigate transient failures.
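The retry setting also bounds how long one flaky test can hold the single worker. A rough sketch of the arithmetic, assuming the timeout applies to each attempt separately and ignoring setup and teardown overhead (`worst_case_duration` is a hypothetical helper):

```rust
use std::time::Duration;

/// Upper bound on the time one test can spend across all attempts:
/// the first attempt plus every retry, each capped by the per-test timeout.
fn worst_case_duration(timeout: Duration, retries: u32) -> Duration {
    timeout * (retries + 1)
}
```

With the configured values, `worst_case_duration(Duration::from_secs(30), 2)` is 90 seconds for a test that times out on every attempt.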
### Frontend Failing Tests
As of the last audit, 10 of 141 frontend unit tests are failing. Investigate with `cd frontend && npx vitest run` before adding new frontend tests.

---
## Coverage Targets and Gaps
### Well-Covered Areas
- **SSRF protection**: 74 unit tests covering all private IP ranges, IPv4-mapped IPv6, redirect blocking
- **Sources CRUD**: 36 integration tests including CSV, bulk import, max limits
- **Admin module**: 30 integration tests with access control verification
- **Encryption**: Tests verify API keys are not stored in plaintext by querying the database directly
- **Pipeline**: Uses wiremock + MockLlmProvider for deterministic end-to-end pipeline testing
### Critical Gaps
| Gap | Priority | Description |
|-----|----------|-------------|
| Scheduled execution | Critical | `scheduler.rs` has zero tests. Autonomous process that generates syntheses and sends emails. |
| Brave Search pipeline | High | Only 1 unit test. The Brave Search code path in the pipeline is untested in integration. |
| Date filtering | High | No tests verify that `max_age_days` actually filters old articles. |
| Rate limiting integration | High | 15 unit tests but no integration test verifying rate limits are applied during pipeline runs. |
| SSE progress stream | High | No integration test for the SSE endpoint. Only tested in the gated E2E test. |
| Settings validation (negative) | Medium | No tests for rejection of out-of-range values (e.g., `max_articles_per_source: 0`). |
| Article history ownership | Medium | No test verifying User B cannot see User A's article history. |
| Frontend failing tests | Medium | 10 tests need investigation and fixing. |