ai_synth/docs/team-analysis/03-security-analysis.md

# Security Analysis: AI Weekly Synth Refactoring

**Role**: Security Specialist
**Date**: 2026-03-21
**Scope**: Full security audit of the current application and security architecture for the Rust/SolidJS refactoring

---

## Questions Requiring User Decision

Before implementation begins, the following security-sensitive questions need answers:

1. **Admin bootstrapping**: How will the first admin account be created? Options: (a) CLI command during deployment, (b) first-user-is-admin, (c) environment variable with seed admin email. Option (a) is recommended: (b) is dangerous in production because whoever registers first gains admin rights, and (c) exposes the admin's email address in deployment configuration.

2. **Multi-tenancy scope**: Will there ever be shared syntheses between users (e.g., team workspaces)? This fundamentally affects the authorization model. The current analysis assumes strict per-user isolation.

3. **Self-registration**: Should anyone be able to create an account, or should there be an admin-approval flow or invite-only mechanism? Open registration with captcha is assumed below.

4. **Email provider for magic links**: Will you self-host SMTP (e.g., via Postfix in the Docker stack) or use an external transactional email service (Resend, AWS SES, Mailgun)? This affects DNS configuration (SPF/DKIM/DMARC) and deliverability. External service is recommended.

5. **Master encryption key management**: For encrypting LLM API keys at rest, are you comfortable storing the master key in an environment variable, or do you want to integrate with a KMS (e.g., HashiCorp Vault, AWS KMS)? Environment variable is assumed below for single-VM simplicity.

6. **Rate limiting granularity**: Should LLM API rate limits be global (shared across all users) or per-user? Per-user is recommended, with a global ceiling.

---

## 1. Current Security Issues

### 1.1 CRITICAL: Gemini API Key Exposed in Frontend Bundle

**File**: `/Users/oabrivard/Projects/rust/ai_synth/vite.config.ts` (line 11)
```typescript
'process.env.GEMINI_API_KEY': JSON.stringify(env.GEMINI_API_KEY),
```

**File**: `/Users/oabrivard/Projects/rust/ai_synth/src/services/geminiService.ts` (line 4)
```typescript
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
```

The Gemini API key is injected at build time via Vite's `define` and embedded as a string literal in the client-side JavaScript bundle. Anyone loading the page can extract the key from browser DevTools (Sources tab or Network tab) and use it to make arbitrary Gemini API calls at the owner's expense.

**Impact**: Financial abuse (API billing), quota exhaustion, potential data exfiltration if the key grants access to other Google Cloud resources.

**Mitigation in refactoring**: All LLM API calls move to the Rust backend. API keys never leave the server process.

### 1.2 HIGH: Gmail OAuth Token Handling

**File**: `/Users/oabrivard/Projects/rust/ai_synth/src/firebase.ts` (lines 19-30)

```typescript
export const getGmailAccessToken = async (): Promise<string | null> => {
  const provider = new GoogleAuthProvider();
  provider.addScope('https://www.googleapis.com/auth/gmail.send');
  // ...
  return credential?.accessToken || null;
};
```

**File**: `/Users/oabrivard/Projects/rust/ai_synth/src/pages/SynthesisDetail.tsx` (lines 112-171)

Issues:
- The `gmail.send` OAuth scope grants the ability to send emails as the user. The access token is obtained client-side and used to call the Gmail API directly from the browser.
- Each email send triggers a full `signInWithPopup` flow, re-requesting the `gmail.send` scope. This disrupts the user, although it does keep each token short-lived. The token is still held in JavaScript memory, where an XSS payload could read it.
- The email recipient field (line 41) is hardcoded to a specific email: `olivier.abrivard@desjardins.com`. This is PII committed to the repository.

**Mitigation in refactoring**: Email sending should move to the backend. The backend sends emails using its own SMTP credentials, never exposing OAuth tokens to the client.

### 1.3 HIGH: Prompt Injection via User-Controlled Input

**File**: `/Users/oabrivard/Projects/rust/ai_synth/src/services/geminiService.ts` (lines 85-100)

User-controlled fields are interpolated directly into LLM prompts without sanitization:
- `settings.theme` (line 87): User-defined string injected into `"Tu es un expert en analyse de l'actualite sur le theme : "${settings.theme}""`
- `settings.searchAgentBehavior` (line 92): Free-text prompt injected verbatim -- this is literally a prompt injection vector by design
- `settings.categories` (line 83): Array of user strings injected as numbered list items
- `customSources[].title` and `customSources[].url` (line 62): Injected as source list

A malicious user could craft `theme`, `searchAgentBehavior`, or category names that override the system prompt behavior, potentially causing the LLM to:
- Ignore safety guidelines
- Generate harmful or misleading content
- Exfiltrate data via crafted URLs in grounding results

**Mitigation in refactoring**: While users inherently need to customize prompts, the backend should:
- Enforce maximum lengths for all user-provided prompt fragments
- Apply a sanitization layer that strips common injection patterns (e.g., "ignore previous instructions", "system:", role-switching patterns)
- Log and monitor unusual prompt patterns
- Use structured prompt templates where user input is clearly delimited as data, not instructions

### 1.4 HIGH: CORS Proxy Data Leakage

**File**: `/Users/oabrivard/Projects/rust/ai_synth/src/services/geminiService.ts` (lines 174-213)

Three third-party CORS proxies are used in cascade:
1. `api.allorigins.win`
2. `api.codetabs.com`
3. `corsproxy.io`

Issues:
- **Data exfiltration**: Every URL the user scrapes (their custom sources, AI-generated article URLs) is sent to these third-party services. They can log, modify, or block content.
- **Man-in-the-middle**: The proxied HTML content could be tampered with. If an attacker controls one of these services, they could inject malicious content into scraped pages.
- **Availability**: These are free, community-run services with no SLA. They can disappear at any time.
- **No timeout or size limits**: The `fetch` calls have no explicit timeout or response size limit, potentially causing the browser to hang or consume excessive memory.

**Mitigation in refactoring**: The Rust backend performs HTTP requests directly (no CORS restriction server-side). This eliminates the need for proxies entirely. Add SSRF protections (see Section 5.4).

### 1.5 MEDIUM: Firebase Config Committed to Repository

**File**: `/Users/oabrivard/Projects/rust/ai_synth/firebase-applet-config.json`

Firebase configuration (API key, project ID, app ID) is committed to the repository. While Firebase API keys are designed to be public (they are restricted by Firebase Security Rules and authorized domains), committing them creates a false sense of security and makes key rotation harder.

### 1.6 MEDIUM: Hardcoded PII in Source Code

**File**: `/Users/oabrivard/Projects/rust/ai_synth/src/pages/SynthesisDetail.tsx` (line 41)
```typescript
const [email, setEmail] = useState('olivier.abrivard@desjardins.com');
```

A personal corporate email address is hardcoded as the default email recipient. This should not be in source code.

### 1.7 MEDIUM: Client-Side Rate Limiter is Ineffective

**File**: `/Users/oabrivard/Projects/rust/ai_synth/src/services/geminiService.ts` (lines 6-33)

The `RateLimiter` class runs in browser memory. It does not protect against:
- Multiple browser tabs
- Multiple users (each has their own in-memory limiter)
- A malicious user who bypasses the frontend entirely and calls the Gemini API directly with the exposed key

**Mitigation in refactoring**: Server-side rate limiting with shared state (see Section 5.3).

### 1.8 LOW: No Content Security Policy

The application serves no CSP headers. Combined with the fact that user-generated content (article titles, summaries, URLs) is rendered in the DOM, this creates potential for XSS if React's default escaping is ever bypassed.

### 1.9 LOW: Error Messages Leak Internal Details

**File**: `/Users/oabrivard/Projects/rust/ai_synth/src/firebase.ts` (lines 68-89)

The `handleFirestoreError` function logs and throws detailed error objects containing `userId`, `email`, `emailVerified`, `tenantId`, and provider information. While this is useful for debugging, these details should not be exposed to the client in production.

---

## 2. Authentication and Session Security

### 2.1 Magic Link Implementation

#### Token Generation
- Use `rand::rngs::OsRng` (the operating system's CSPRNG, exposed by the `rand` crate) to generate tokens: 32 bytes of cryptographic randomness, encoded as unpadded URL-safe base64 (43 characters).
- Do NOT use UUIDs -- a v4 UUID carries only 122 random bits in a fixed, recognizable format, and UUID libraries do not all guarantee a cryptographically secure generator.
- Store a SHA-256 hash of the token in the database, never the token itself. This way, a database breach does not compromise pending magic links.

#### Token Lifecycle
```
User enters email -> Backend generates token -> Stores SHA-256(token) + email + expires_at + used=false
                  -> Sends email with link: https://app.example.com/auth/verify?token=<raw_token>
User clicks link  -> Backend computes SHA-256(submitted_token) -> Looks up in DB
                  -> Validates: not expired, not used, email matches
                  -> Marks token as used=true
                  -> Creates session (see 2.2)
```

#### Token Expiration
- Magic link tokens expire after **15 minutes** (not longer -- the user has their email open).
- Implement a cleanup job (background task or on-request pruning) to delete expired tokens.

#### Single-Use Enforcement
- The `used` boolean column prevents replay attacks.
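- Use a single conditional update rather than a read followed by a write: `UPDATE magic_tokens SET used = true WHERE token_hash = ? AND used = false AND expires_at > CURRENT_TIMESTAMP`. SQLite has no `NOW()` function, and the comparison assumes `expires_at` is stored in the same `YYYY-MM-DD HH:MM:SS` UTC format that `CURRENT_TIMESTAMP` produces. If `rows_affected == 0`, the token is invalid.
- A single statement is atomic, so two concurrent requests carrying the same token cannot both succeed.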
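
The intended semantics of the conditional update can be modeled in memory. The following is a minimal sketch, not the real persistence layer: `TokenStore` and `PendingToken` are hypothetical names, and timestamps are assumed to be Unix seconds.

```rust
use std::collections::HashMap;

/// One pending magic-link token, as a row of `magic_tokens` (hypothetical model).
struct PendingToken {
    expires_at: u64, // Unix seconds
    used: bool,
}

/// Hypothetical in-memory stand-in for the `magic_tokens` table,
/// keyed by the SHA-256 hash of the token.
struct TokenStore {
    by_hash: HashMap<String, PendingToken>,
}

impl TokenStore {
    /// Mirrors the conditional UPDATE: flips `used` and returns true only
    /// if the token exists, is unused, and has not expired.
    fn consume(&mut self, token_hash: &str, now: u64) -> bool {
        match self.by_hash.get_mut(token_hash) {
            Some(t) if !t.used && t.expires_at > now => {
                t.used = true;
                true
            }
            _ => false,
        }
    }
}
```

A second call with the same hash returns `false`, which corresponds to `rows_affected == 0` in the SQL version.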

#### Email Enumeration Prevention
- The `/auth/magic-link` endpoint MUST return the same response (HTTP 200, same message) regardless of whether the email exists in the database.
- Message: "If an account with this email exists, a login link has been sent."
- If the email is not registered, silently do nothing (no email sent, no error).
- Apply the same timing: if sending an email takes 200ms, add a random delay (100-300ms) when no email is sent, so timing attacks cannot distinguish the two cases.

### 2.2 Session Management

#### Session Cookie Attributes
```
Set-Cookie: session_id=<value>;
  HttpOnly;        # Prevents JavaScript access (XSS mitigation)
  Secure;          # Only sent over HTTPS
  SameSite=Lax;    # Prevents CSRF on cross-origin POST (allows top-level navigation)
  Path=/;          # Available to all paths
  Max-Age=604800;  # 7 days (server-side expiration is authoritative)
```

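Why `SameSite=Lax` and not `Strict`: with `Strict`, the browser withholds the session cookie on any navigation that starts from another site, including a click in a webmail client. A user following a link from an email would appear logged out, and the redirect that follows magic-link verification may not carry the freshly set cookie. Since magic links are the primary auth mechanism, `Lax` is necessary.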
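
As an illustration, the header value can be assembled in one place so that no attribute is forgotten. This is a sketch under assumptions: the function names are hypothetical, and `session_id` is assumed to be hex or base64url, so it needs no escaping.

```rust
/// Builds the `Set-Cookie` header value for a session.
/// `session_id` is assumed to be already URL-safe (hex or base64url).
fn session_cookie(session_id: &str, max_age_secs: u64) -> String {
    format!(
        "session_id={session_id}; HttpOnly; Secure; SameSite=Lax; Path=/; Max-Age={max_age_secs}"
    )
}

/// Builds the header value that clears the cookie on logout (`Max-Age=0`).
fn clear_session_cookie() -> String {
    session_cookie("", 0)
}
```

Calling `session_cookie(id, 604800)` yields the 7-day cookie described above; the server-side expiration remains authoritative.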

#### Session ID Generation
- 32 bytes from `OsRng`, hex-encoded (64 characters) or base64url-encoded (43 characters).
- Store SHA-256(session_id) in the database. The raw session_id is only in the cookie.
- Schema:
  ```sql
  CREATE TABLE sessions (
      id INTEGER PRIMARY KEY AUTOINCREMENT,
      session_hash TEXT NOT NULL UNIQUE,     -- SHA-256(session_id)
      user_id INTEGER NOT NULL REFERENCES users(id),
      created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
      expires_at TIMESTAMP NOT NULL,
      last_active_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
      ip_address TEXT,
      user_agent TEXT
  );
  -- No separate index on session_hash: the UNIQUE constraint already creates one.
  CREATE INDEX idx_sessions_user ON sessions(user_id);
  CREATE INDEX idx_sessions_expires ON sessions(expires_at);
  ```

#### Session Expiration and Rotation
- **Absolute expiration**: 7 days from creation. After this, the user must re-authenticate.
- **Idle timeout**: If `last_active_at` is more than 24 hours ago, invalidate the session.
- **Session rotation**: After successful authentication (magic link click), issue a new session ID and invalidate the old one. This prevents session fixation attacks.
- **Sliding window**: Update `last_active_at` on each request, but only write to DB at most once per 5 minutes to avoid excessive writes.
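
The expiration rules combine into two small checks. The sketch below assumes timestamps are Unix seconds and computes the absolute limit from `created_at`; the `Session` struct and its method names are illustrative, not part of the schema.

```rust
const ABSOLUTE_LIFETIME_SECS: u64 = 7 * 24 * 60 * 60; // 7 days
const IDLE_TIMEOUT_SECS: u64 = 24 * 60 * 60; // 24 hours
const TOUCH_INTERVAL_SECS: u64 = 5 * 60; // 5 minutes

/// Timestamps as read from the `sessions` row (Unix seconds assumed).
struct Session {
    created_at: u64,
    last_active_at: u64,
}

impl Session {
    /// A session is valid only while both the absolute and idle limits hold.
    fn is_valid(&self, now: u64) -> bool {
        let age = now.saturating_sub(self.created_at);
        let idle = now.saturating_sub(self.last_active_at);
        age < ABSOLUTE_LIFETIME_SECS && idle <= IDLE_TIMEOUT_SECS
    }

    /// Whether `last_active_at` should be written back to the database,
    /// limiting writes to one per five minutes.
    fn needs_touch(&self, now: u64) -> bool {
        now.saturating_sub(self.last_active_at) >= TOUCH_INTERVAL_SECS
    }
}
```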

#### Logout and Revocation
- On logout: DELETE the session row from the database and clear the cookie (set `Max-Age=0`).
- Provide "Log out all sessions" functionality: `DELETE FROM sessions WHERE user_id = ?`.
- Admin capability: revoke all sessions for a specific user (for account compromise response).

### 2.3 Captcha for Self-Hosted Deployment

**Recommended**: [mCaptcha](https://mcaptcha.org/) -- fully open-source, self-hostable, proof-of-work based (no third-party dependency). It can run as a sidecar container in the Docker stack.

**Alternative**: [hCaptcha](https://www.hcaptcha.com/) -- privacy-focused, free tier available, but requires an external service call.

**NOT recommended**: Google reCAPTCHA -- contradicts the "remove Google hosting dependencies" requirement.

Captcha should be applied to:
- Account registration (`POST /auth/register`)
- Magic link request (`POST /auth/magic-link`)
- NOT to every login -- rate limiting handles brute-force on session endpoints

#### Complementary Rate Limiting on Auth Endpoints
| Endpoint | Rate Limit | Window | Scope |
|---|---|---|---|
| `POST /auth/register` | 3 requests | 1 hour | Per IP |
| `POST /auth/magic-link` | 5 requests | 15 minutes | Per IP |
| `POST /auth/magic-link` | 3 requests | 1 hour | Per email |
| `POST /auth/verify` | 10 requests | 15 minutes | Per IP |
| `POST /auth/verify` (failed) | 5 failures | 15 minutes | Per IP, then block |

### 2.4 CSRF Protection Strategy

Since the frontend (SolidJS SPA) and backend (Rust API) may be on different origins during development, a robust CSRF strategy is needed.

**Recommended approach: Synchronizer Token pattern with SameSite**

1. `SameSite=Lax` on the session cookie provides baseline CSRF protection: browsers do not attach the cookie to cross-site POST/PUT/DELETE requests.
2. For defense-in-depth, implement the Synchronizer Token pattern:
   - On session creation, generate a CSRF token (32 random bytes, hex-encoded).
   - Store it in the session (server-side).
   - Send it to the frontend via a dedicated endpoint (`GET /auth/csrf-token`) or as a response header.
   - The SolidJS app includes it as an `X-CSRF-Token` header on every state-changing request.
   - The backend middleware validates `X-CSRF-Token` header matches the session's CSRF token for all POST/PUT/DELETE requests.

3. Additionally, validate the `Origin` header on state-changing requests. Reject requests where `Origin` does not match the configured `APP_URL`.
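
The two server-side checks might look like the sketch below. The function names are hypothetical, and rejecting a request with a missing `Origin` header is an assumption made here, not something stated above. The hand-written comparison only illustrates the idea; a production build would more likely rely on a vetted constant-time comparison such as the `subtle` crate.

```rust
/// Compares two byte strings without returning early on the first
/// mismatch, so timing does not reveal how many bytes matched.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Returns true if a state-changing request passes both checks:
/// the `Origin` header equals `app_url`, and the `X-CSRF-Token`
/// header equals the token stored in the session.
/// Assumption: a missing header of either kind is a failure.
fn csrf_ok(
    origin: Option<&str>,
    header_token: Option<&str>,
    session_token: &str,
    app_url: &str,
) -> bool {
    let origin_ok = origin == Some(app_url);
    let token_ok = header_token
        .map(|t| constant_time_eq(t.as_bytes(), session_token.as_bytes()))
        .unwrap_or(false);
    origin_ok && token_ok
}
```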

### 2.5 Account Enumeration Protection

Beyond magic link (covered in 2.1):
- **Registration**: If an email is already registered, do NOT return "Email already exists." Instead, send an email to the existing address saying "Someone tried to register with your email. If this was you, use the login link instead." Return the same success message to the client.
- **Login (magic link request)**: Same as 2.1 -- identical response regardless of email existence.
- **Error messages**: Never distinguish between "user not found" and "wrong password" (not applicable here since there are no passwords, but important if password auth is ever added).

---

## 3. API Key Storage Security

### 3.1 Encryption at Rest

LLM API keys (Gemini, OpenAI, Anthropic) stored in the database must be encrypted. These are high-value secrets -- a database leak would expose them.

#### Encryption Scheme
- **Algorithm**: AES-256-GCM (authenticated encryption -- provides both confidentiality and integrity).
- **Implementation**: Use the `aes-gcm` crate in Rust.
- **Per-key nonce**: Generate a unique 96-bit (12-byte) nonce for each encryption operation using `OsRng`. Store the nonce alongside the ciphertext.
- **Schema**:
  ```sql
  CREATE TABLE llm_api_keys (
      id INTEGER PRIMARY KEY AUTOINCREMENT,
      provider TEXT NOT NULL,              -- 'google', 'openai', 'anthropic'
      label TEXT NOT NULL,                 -- Human-readable label
      encrypted_key BLOB NOT NULL,         -- AES-256-GCM ciphertext
      nonce BLOB NOT NULL,                 -- 12-byte GCM nonce
      key_prefix TEXT NOT NULL,            -- First 5 chars of the key (for UI display: "sk-pr...")
      created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
      updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
      created_by INTEGER NOT NULL REFERENCES users(id),
      is_active BOOLEAN NOT NULL DEFAULT true
  );
  ```

#### Key Derivation
- **Master key**: A 256-bit (32-byte) key derived from `MASTER_KEY_SECRET` using Argon2id (via the `argon2` crate).
- **Input**: `MASTER_KEY_SECRET` environment variable (a high-entropy string, minimum 32 characters).
- **Salt**: A fixed, application-specific salt stored in the config. It is not secret, but it must never change: a different salt yields a different key, and existing ciphertexts become undecryptable.
- **Simpler alternative**: If `MASTER_KEY_SECRET` is already a 64-character hex string (32 bytes), skip the KDF and use the decoded bytes directly. This is acceptable for a single-VM deployment where the env var is properly protected.
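
For the simpler alternative, the hex string still has to be decoded and checked strictly. A minimal sketch using only the standard library follows; `parse_master_key` is a hypothetical helper name.

```rust
/// Decodes a 64-character hex string into a 32-byte AES-256 key.
/// Returns `None` on wrong length or any non-hex character.
fn parse_master_key(hex: &str) -> Option<[u8; 32]> {
    let bytes = hex.as_bytes();
    if bytes.len() != 64 {
        return None;
    }
    let mut key = [0u8; 32];
    for (i, pair) in bytes.chunks(2).enumerate() {
        let hi = (pair[0] as char).to_digit(16)?;
        let lo = (pair[1] as char).to_digit(16)?;
        key[i] = (hi * 16 + lo) as u8;
    }
    Some(key)
}
```

At startup the value would typically come from `std::env::var("MASTER_KEY_SECRET")`, and the process should refuse to start if decoding fails rather than fall back to a default key.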

#### Master Key Storage
- Store `MASTER_KEY_SECRET` as an environment variable, injected via Docker Compose `env_file` or Docker secrets.
- The `.env` file containing it must have permissions `600` (owner read/write only).
- **NEVER** commit the master key to version control.
- **NEVER** log the master key or the decrypted API keys.
- For key rotation: implement a re-encryption command that reads all keys with the old master key, encrypts with the new one, and writes them back in a transaction.

### 3.2 Access Control

- **View API keys**: Admin-only. The UI displays only the `key_prefix` (e.g., "sk-pr...") and the `label`. The full key is NEVER sent to the frontend.
- **Create/Update API keys**: Admin-only. The key is sent from the admin UI to the backend via HTTPS, encrypted in transit. On the backend, it is immediately encrypted at rest before being stored.
- **Delete API keys**: Admin-only, with confirmation.
- **Use API keys**: The backend decrypts keys in memory only when making LLM API calls. The decrypted key is held in memory for the duration of the API call, then dropped when the `String` holding it goes out of scope. Dropping frees the memory but does not overwrite it, so wrap the key in a zeroizing type (see Section 5.1).
- **Test API keys**: Provide an admin endpoint that attempts a minimal API call (e.g., a simple completion with a tiny prompt) to validate the key works, without exposing the key itself.
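
The display value stored in `key_prefix` can be computed once, at creation time, before the key is encrypted. A small sketch, with a hypothetical helper name:

```rust
/// Returns the display form of an API key: its first `visible`
/// characters (fewer if the key is shorter) followed by "...".
fn key_prefix(key: &str, visible: usize) -> String {
    let prefix: String = key.chars().take(visible).collect();
    format!("{prefix}...")
}
```

With `visible = 5`, a key beginning `sk-proj-` is shown as `sk-pr...`, matching the UI example above.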

### 3.3 Audit Logging

```sql
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    user_id INTEGER NOT NULL REFERENCES users(id),
    action TEXT NOT NULL,        -- 'api_key.create', 'api_key.update', 'api_key.delete', 'api_key.view_list', 'api_key.test'
    target_type TEXT NOT NULL,   -- 'llm_api_key'
    target_id INTEGER,
    details TEXT,                -- JSON with non-sensitive context (provider, label, NOT the key)
    ip_address TEXT,
    user_agent TEXT
);
CREATE INDEX idx_audit_timestamp ON audit_log(timestamp);
CREATE INDEX idx_audit_user ON audit_log(user_id);
```

Log every access to the API keys admin module:
- Creating a key: log provider + label
- Updating a key: log which fields changed (but not the old or new key value)
- Deleting a key: log provider + label
- Listing keys: log that the admin viewed the key list
- Testing a key: log provider + result (success/failure)

Audit logs should be append-only. Even admins should not be able to delete audit entries.

---

## 4. Authorization and Data Isolation

### 4.1 Converting Firestore Rules to Rust Middleware

The current Firestore rules implement three core patterns that must be translated.

#### Current Rules (from `/Users/oabrivard/Projects/rust/ai_synth/firestore.rules`):

| Firestore Rule | Rust Equivalent |
|---|---|
| `isAuthenticated()` | Middleware: extract and validate session cookie, reject 401 if invalid |
| `isDocOwner()` / `isOwner(userId)` | Query filter: always include `WHERE user_id = ?` using the authenticated user's ID from the session |
| `uidUnchanged()` / `uidNotModified()` | Business logic: set `user_id` from the session on create; reject requests that attempt to change `user_id` on update |
| `isValidSynthesis()`, `isValidSettings()`, `isValidSource()` | Request body validation: `serde` deserialization into structs that derive `Validate` (via the `validator` crate), followed by a call to `.validate()` |
| Field size/type constraints | `validator` attributes: `#[validate(length(min = 1, max = 200))]`, `#[validate(range(min = 1, max = 365))]`, etc. |

#### Recommended Middleware Stack (using Axum)

```
Request
  -> CORS middleware (tower-http)
  -> Rate limiting middleware (tower::limit or custom)
  -> Session extraction middleware (reads cookie, validates session, injects AuthUser into request extensions)
  -> CSRF validation middleware (for POST/PUT/DELETE)
  -> Route handler
    -> Request body validation (serde + validator)
    -> Business logic (always scopes queries to authenticated user)
  -> Response
```

#### Authentication Extractor Pattern
Define an `AuthUser` extractor that:
1. Reads the `session_id` cookie.
2. Looks up `SHA-256(session_id)` in the `sessions` table.
3. Validates expiration.
4. Returns the user record or rejects with 401.

```rust
// Pseudo-code for the extractor
struct AuthUser {
    id: i64,
    email: String,
    is_admin: bool,
}
```

All route handlers that require authentication take `AuthUser` as a parameter. If the session is invalid, the extractor returns its rejection (401) and Axum never runs the handler.

### 4.2 Multi-Tenant Data Isolation

**Principle**: The user ID from the session is the ONLY source of truth for data ownership. Never trust a `user_id` from the request body or URL parameters for ownership decisions.

**Implementation**:
- Every data table (`syntheses`, `sources`, `settings`) has a `user_id` column with a foreign key to `users(id)`.
- Every SELECT query includes `WHERE user_id = $1` using the authenticated user's ID.
- Every INSERT sets `user_id` from the session, ignoring any `user_id` in the request body.
- Every UPDATE/DELETE query includes `WHERE id = $1 AND user_id = $2` -- if 0 rows affected, return 404 (not 403, to avoid revealing that the resource exists for another user).
- Create a database index on `(user_id, created_at)` for every table to ensure efficient queries.
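
The "404, not 403" rule can be illustrated with an in-memory model of an owner-scoped delete. This is a sketch only: the `HashMap` stands in for the `syntheses` table, and `Synthesis`, `ApiError`, and `delete_synthesis` are hypothetical names.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a row of the `syntheses` table.
struct Synthesis {
    user_id: i64,
}

#[derive(Debug, PartialEq)]
enum ApiError {
    NotFound, // maps to HTTP 404
}

/// Deletes a row only if it belongs to `auth_user_id` (taken from the
/// session). A row owned by someone else is indistinguishable from a
/// missing row, mirroring `WHERE id = ? AND user_id = ?`.
fn delete_synthesis(
    table: &mut HashMap<i64, Synthesis>,
    id: i64,
    auth_user_id: i64,
) -> Result<(), ApiError> {
    let owned = table
        .get(&id)
        .map_or(false, |row| row.user_id == auth_user_id);
    if owned {
        table.remove(&id);
        Ok(())
    } else {
        Err(ApiError::NotFound)
    }
}
```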

### 4.3 Admin Role

#### Definition
- A boolean `is_admin` column on the `users` table.
- The first admin is bootstrapped via a CLI command or migration (see Questions section).

#### Protection
- Admin-only endpoints use an `AdminUser` extractor that extends `AuthUser` with an additional `is_admin == true` check. Returns 403 if not admin.
- Admin endpoints:
  - `GET/POST/PUT/DELETE /admin/api-keys` -- LLM API key management
  - `GET/PUT /admin/rate-limits` -- Rate limiter configuration
  - `GET /admin/audit-log` -- View audit logs
  - `POST /admin/users/:id/revoke-sessions` -- Revoke all sessions for a user
- Admin actions are always logged to the audit log.
- Consider requiring re-authentication (e.g., a fresh magic link) for sensitive admin operations like API key changes.

### 4.4 Input Validation

#### Request Body Validation
Use `serde` for deserialization and the `validator` crate for constraint validation. Example:

```rust
#[derive(Deserialize, Validate)]
struct CreateSource {
    #[validate(length(min = 1, max = 200))]
    title: String,

    #[validate(url, length(max = 1000))]
    url: String,
}
```

Reject invalid requests with 400 and a generic error message (do not echo back the invalid input to prevent reflected XSS in error responses).

#### SQL Injection Prevention
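- **sqlx with parameterized queries**: the `query!` / `query_as!` macros check SQL against the database schema at build time, and sqlx sends bound values separately from the query text through prepared statements. Values passed as bind parameters therefore cannot alter the query structure.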
- NEVER use string formatting/interpolation to build SQL queries.
- For dynamic queries (e.g., sorting, filtering), use an allowlist of valid column names, not user input.
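
Column names cannot be passed as bind parameters, so the allowlist is the one place where text is spliced into a query. The sketch below shows the idea with illustrative column names and hypothetical function names; the only strings ever formatted in are compile-time constants chosen by the `match`, never the client's input.

```rust
/// Maps a client-supplied sort key to a column name from a fixed
/// allowlist. The keys and column names here are illustrative.
fn sort_column(requested: &str) -> Option<&'static str> {
    match requested {
        "created" => Some("created_at"),
        "title" => Some("title"),
        _ => None,
    }
}

/// Builds the list query, or returns `None` for an unknown sort key.
fn list_query(requested_sort: &str, descending: bool) -> Option<String> {
    let column = sort_column(requested_sort)?;
    let direction = if descending { "DESC" } else { "ASC" };
    // Only allowlisted identifiers are formatted in; values stay bound (`?`).
    Some(format!(
        "SELECT id, title FROM syntheses WHERE user_id = ? ORDER BY {column} {direction}"
    ))
}
```

An unknown key yields `None`, which the handler can turn into a 400 response.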

#### XSS Prevention
- The SolidJS frontend handles escaping by default (like React, it escapes strings rendered in JSX).
- NEVER assign user-generated content to the DOM `innerHTML` property or to SolidJS's `innerHTML` JSX prop.
- Article titles, summaries, and URLs from the LLM should be treated as untrusted user input -- the LLM could generate malicious content.
- URLs rendered as `<a href>` must be validated: only allow `http://` and `https://` schemes. Block `javascript:`, `data:`, `vbscript:` schemes.
- Set `Content-Type: application/json` on all API responses (never `text/html` for API endpoints).
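
The scheme rule is easiest to enforce as an allowlist. One option, sketched here under the assumption that the backend filters LLM-produced URLs before storing or returning them, is a small check like the following (`is_safe_href` is a hypothetical name):

```rust
/// Returns true only for absolute http(s) URLs. Anything else
/// (javascript:, data:, vbscript:, relative paths) is rejected.
fn is_safe_href(url: &str) -> bool {
    let lower = url.trim_start().to_ascii_lowercase();
    lower.starts_with("http://") || lower.starts_with("https://")
}
```

Because the check accepts known-good prefixes instead of rejecting known-bad ones, obfuscated variants of `javascript:` fail by default.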

---

## 5. Backend Security

### 5.1 Rust-Specific Security Considerations

Rust provides significant security advantages:
- **Memory safety**: No buffer overflows, use-after-free, or data races (without `unsafe`).
- **No null pointer dereferences**: `Option<T>` forces explicit handling.
- **Ownership model**: Secrets (API keys, session tokens) are dropped when they go out of scope, reducing the window of exposure.

Recommendations:
- **Minimize `unsafe` blocks**: Audit any `unsafe` code carefully. Prefer safe abstractions.
- **Dependency auditing**: Run `cargo audit` in CI to check for known vulnerabilities in dependencies.
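- **Use `secrecy` crate**: Wrap sensitive values (API keys, session tokens) in `Secret<String>` (`SecretString`) to prevent accidental logging: the wrapper redacts its `Debug` output and does not implement `Display`, so the value is only reachable through an explicit `expose_secret()` call.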
- **Zeroize secrets**: Use the `zeroize` crate to overwrite sensitive memory on drop (the `secrecy` crate integrates with this).

### 5.2 Recommended Crate Ecosystem

| Concern | Crate | Notes |
|---|---|---|
| Web framework | `axum` | Tower-based, async, well-maintained |
| Database | `sqlx` | Compile-time checked queries, async |
| Password/KDF | `argon2` | For master key derivation |
| Encryption | `aes-gcm` | AES-256-GCM authenticated encryption |
| Random | `rand` | `OsRng` for cryptographic randomness |
| Hashing | `sha2` | SHA-256 for token hashing |
| Secrets | `secrecy` + `zeroize` | Prevent accidental exposure |
| Validation | `validator` | Derive-based request validation |
| Rate limiting | `tower` + `governor` | Token-bucket rate limiting as middleware |
| CORS | `tower-http` | `CorsLayer` |
| HTTP client | `reqwest` | For LLM API calls and URL scraping |
| Serialization | `serde` + `serde_json` | Request/response serialization |
| Logging | `tracing` + `tracing-subscriber` | Structured logging (filter sensitive fields) |
| Email | `lettre` | SMTP client for magic links |

### 5.3 Rate Limiting Implementation

Use `tower::ServiceBuilder` with the `governor` crate for token-bucket rate limiting.

#### Rate Limiting Layers

1. **Global layer** (outermost): Protects the server from DDoS. Example: 1000 requests/minute total.
2. **Per-IP layer**: Prevents abuse from a single source. Example: 100 requests/minute per IP.
3. **Per-user layer** (after authentication): Prevents abuse by authenticated users. Example: 60 requests/minute per user.
4. **Per-endpoint layer**: Specific limits for expensive operations.

| Endpoint | Limit | Window | Note |
|---|---|---|---|
| `POST /api/syntheses/generate` | 3 | 1 hour per user | LLM calls are expensive |
| `POST /auth/*` | See Section 2.3 | | Auth endpoints |
| `GET /api/*` | 120 | 1 minute per user | General API |
| `POST/PUT/DELETE /api/*` | 30 | 1 minute per user | Write operations |

#### Admin-Configurable Rate Limits
Store rate limit configuration in the database:
```sql
CREATE TABLE rate_limit_config (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    endpoint_pattern TEXT NOT NULL UNIQUE,   -- e.g., 'generate', 'auth', 'api_read', 'api_write'
    max_requests INTEGER NOT NULL,
    window_seconds INTEGER NOT NULL,
    scope TEXT NOT NULL DEFAULT 'per_user',  -- 'global', 'per_ip', 'per_user'
    updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_by INTEGER REFERENCES users(id)
);
```

The middleware reads this config on startup and reloads periodically (e.g., every 60 seconds) or via an admin trigger.
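
To make the `max_requests` / `window_seconds` semantics concrete, here is a minimal fixed-window counter. It is deliberately simpler than the algorithm the `governor` crate implements and is meant only as an illustration; the type name and the string keys (`"user:1"`, an IP address, and so on) are assumptions.

```rust
use std::collections::HashMap;

/// Fixed-window counter keyed by an arbitrary string (IP, user ID, ...).
struct FixedWindowLimiter {
    max_requests: u32,
    window_secs: u64,
    // key -> (window start in Unix seconds, requests seen in that window)
    windows: HashMap<String, (u64, u32)>,
}

impl FixedWindowLimiter {
    fn new(max_requests: u32, window_secs: u64) -> Self {
        Self { max_requests, window_secs, windows: HashMap::new() }
    }

    /// Records a request at time `now` (Unix seconds) and returns
    /// whether it is allowed.
    fn check(&mut self, key: &str, now: u64) -> bool {
        let entry = self.windows.entry(key.to_string()).or_insert((now, 0));
        if now.saturating_sub(entry.0) >= self.window_secs {
            *entry = (now, 0); // window elapsed: start a new one
        }
        if entry.1 < self.max_requests {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}
```

`FixedWindowLimiter::new(3, 3600)` corresponds to the `generate` row: three requests per user per hour. A real deployment would also evict idle keys so the map does not grow without bound.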

### 5.4 SSRF Prevention for URL Scraping

When the backend scrapes URLs (to validate and extract content from news articles), it becomes a potential SSRF vector. A malicious user could add a source URL pointing to internal services.

#### Protections

1. **DNS resolution check**: Before connecting, resolve the hostname and reject if the IP is:
   - Private ranges: `10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`
   - Loopback: `127.0.0.0/8`, `::1`
   - Link-local: `169.254.0.0/16`, `fe80::/10`
   - Cloud metadata: `169.254.169.254` (AWS/GCP/Azure metadata endpoint)
   - Localhost variants: `0.0.0.0`, `[::0]`

2. **Protocol restriction**: Only allow `http://` and `https://` schemes. Block `file://`, `ftp://`, `gopher://`, `dict://`, etc.

3. **Timeouts**: Set aggressive timeouts on the `reqwest` client:
   - Connection timeout: 5 seconds
   - Response timeout: 15 seconds
   - Total request timeout: 30 seconds

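4. **Response size limit**: Maximum 5 MB response body. Read the body incrementally (for example with `reqwest`'s `.chunk()` or `bytes_stream()`) and abort once the limit is exceeded. `.bytes()` alone buffers the whole body, and the `Content-Length` header can be absent or wrong, so treat it only as an early-reject hint.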

5. **Redirect limit**: Maximum 3 redirects. Validate each redirect destination against the same IP blocklist.

6. **User-Agent**: Set a custom `User-Agent` header identifying the application (e.g., `AI-Weekly-Synth/1.0 (URL Validator)`). This is courteous and allows target sites to identify the bot.
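
The size limit in item 4 comes down to "read at most one byte past the cap". The sketch below shows that logic with the synchronous `std::io::Read` trait so that it stays self-contained; with `reqwest` the same check would run inside an async loop over body chunks. `read_limited` is a hypothetical helper name.

```rust
use std::io::Read;

/// Reads a body of at most `max_bytes`. Returns `Ok(None)` if the
/// source holds more than the limit, without buffering the excess.
fn read_limited<R: Read>(reader: R, max_bytes: u64) -> std::io::Result<Option<Vec<u8>>> {
    let mut buf = Vec::new();
    // Read at most one byte past the limit to detect oversize bodies.
    reader.take(max_bytes + 1).read_to_end(&mut buf)?;
    if buf.len() as u64 > max_bytes {
        Ok(None)
    } else {
        Ok(Some(buf))
    }
}
```

For the 5 MB cap, `max_bytes` would be `5 * 1024 * 1024`.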

#### Implementation Pattern (Rust pseudo-code)
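```rust
use url::Url;

#[derive(Debug)]
enum SsrfError {
    UnsafeScheme,
    DnsFailure,
    PrivateIp,
}

// `is_private_ip` is assumed to implement the blocklist from protection 1.
fn is_safe_url(url: &Url) -> Result<(), SsrfError> {
    // 1. Check scheme
    if !matches!(url.scheme(), "http" | "https") {
        return Err(SsrfError::UnsafeScheme);
    }
    // 2. Resolve DNS and check every resolved address. This call blocks;
    //    the url crate supplies the default port for http and https.
    let addrs = url.socket_addrs(|| None).map_err(|_| SsrfError::DnsFailure)?;
    for addr in &addrs {
        if is_private_ip(addr.ip()) {
            return Err(SsrfError::PrivateIp);
        }
    }
    // Connect to an address validated here: resolving the name again at
    // request time would reopen a DNS-rebinding window.
    Ok(())
}
```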
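
One possible shape for the `is_private_ip` helper used above, written against the standard library only. The IPv4 ranges follow the blocklist in protection 1; the handling of IPv4-mapped IPv6 addresses, the unique-local range `fc00::/7`, and the broadcast address are additions assumed here for completeness.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_private()            // 10/8, 172.16/12, 192.168/16
        || ip.is_loopback()    // 127/8
        || ip.is_link_local()  // 169.254/16, includes 169.254.169.254
        || ip.is_unspecified() // 0.0.0.0
        || ip.is_broadcast()   // 255.255.255.255 (assumed addition)
}

fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // IPv4-mapped addresses (::ffff:a.b.c.d) are checked as IPv4.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()                  // ::1
        || ip.is_unspecified()        // ::
        || (first & 0xffc0) == 0xfe80 // fe80::/10 link-local
        || (first & 0xfe00) == 0xfc00 // fc00::/7 unique local (assumed addition)
}

/// Returns true if the address must not be contacted by the scraper.
fn is_private_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}
```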

### 5.5 Content Security Policy

Set the following CSP headers on the HTML response that serves the SPA:

```
Content-Security-Policy:
  default-src 'none';
  script-src 'self';
  style-src 'self' 'unsafe-inline';     # Tailwind may need inline styles
  img-src 'self' data: https:;          # Allow images from HTTPS sources
  font-src 'self';
  connect-src 'self';                    # API calls only to same origin
  frame-src 'none';                      # No iframes
  base-uri 'self';
  form-action 'self';
  frame-ancestors 'none';               # Prevent clickjacking
```

Additional headers:
```
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Referrer-Policy: strict-origin-when-cross-origin
Permissions-Policy: camera=(), microphone=(), geolocation=()
Strict-Transport-Security: max-age=31536000; includeSubDomains    # Only if HTTPS
```

### 5.6 CORS Configuration

```rust
let cors = CorsLayer::new()
    .allow_origin(AllowOrigin::exact(app_url.parse().unwrap()))  // Only the frontend origin
    .allow_methods([Method::GET, Method::POST, Method::PUT, Method::DELETE])
    .allow_headers([
        header::CONTENT_TYPE,
        header::AUTHORIZATION,
        HeaderName::from_static("x-csrf-token"),
    ])
    .allow_credentials(true)    // Required for cookies
    .max_age(Duration::from_secs(3600));
```

Key points:
- **Never** use `AllowOrigin::any()` with `allow_credentials(true)` -- browsers reject this combination.
- The allowed origin must match exactly (including scheme and port).
- In development, allow `http://localhost:3000`; in production, only the deployment URL.
- Read `APP_URL` from environment to configure this dynamically.

---

## 6. Deployment Security

### 6.1 Docker Security

#### Dockerfile Best Practices
```dockerfile
# Multi-stage build
FROM rust:1.78-slim AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

FROM debian:bookworm-slim
# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser -d /app -s /sbin/nologin appuser
# Install only required runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY --from=builder /app/target/release/ai-weekly-synth .
COPY --from=builder /app/static ./static
# Own the data directory
RUN mkdir -p /app/data && chown -R appuser:appuser /app
USER appuser
EXPOSE 8080
CMD ["./ai-weekly-synth"]
```

Checklist:
- [x] Non-root user (`USER appuser`)
- [x] Minimal base image (`debian:bookworm-slim`, not `ubuntu` or full `debian`)
- [x] Multi-stage build (no compiler, source code, or build artifacts in final image)
- [x] No secrets in the image (API keys, master key are injected via env vars at runtime)
- [x] `.dockerignore` excludes `.env`, `.git`, `target/`, `node_modules/`
- [x] Pin base image versions for reproducibility
- [x] `ca-certificates` installed for HTTPS requests to LLM APIs

#### Docker Compose Security
```yaml
services:
  app:
    # ...
    env_file: .env                  # Contains MASTER_KEY_SECRET, DATABASE_URL, etc.
    read_only: true                 # Read-only root filesystem
    tmpfs:
      - /tmp                        # Writable temp directory
    volumes:
      - ./data:/app/data            # SQLite database (persistent)
    security_opt:
      - no-new-privileges:true      # Prevent privilege escalation
    cap_drop:
      - ALL                         # Drop all Linux capabilities
```

### 6.2 SQLite File Permissions

- The SQLite database file should be owned by `appuser` and have permissions `600` (owner read/write only).
- The directory containing the SQLite file should have permissions `700`.
- Enable WAL mode for concurrent reads: `PRAGMA journal_mode=WAL;`
- Set `PRAGMA foreign_keys = ON;` at connection startup.
- Consider `PRAGMA secure_delete = ON;` to overwrite deleted data (relevant for API keys).
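
The permission settings above could also be applied by the application itself at startup, so that a database file created with looser defaults is tightened. This is a Unix-only sketch; `harden_sqlite_permissions` is a hypothetical helper, and the path passed in is assumed to be the file that `DATABASE_URL` points at.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Restricts the database directory to 700 and the database file to 600.
/// Unix-only; fails if the file or its directory does not exist.
fn harden_sqlite_permissions(db_path: &Path) -> io::Result<()> {
    // Skip the directory step for bare relative names such as "ai_synth.db",
    // whose parent is the empty path.
    let parent = db_path.parent().filter(|p| !p.as_os_str().is_empty());
    if let Some(dir) = parent {
        fs::set_permissions(dir, fs::Permissions::from_mode(0o700))?;
    }
    fs::set_permissions(db_path, fs::Permissions::from_mode(0o600))?;
    Ok(())
}
```

Calling it with the database path (for example `/app/data/ai_synth.db`) before opening the connection pool keeps the check close to where the file is used.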

### 6.3 HTTPS/TLS Termination

**Do NOT terminate TLS in the Rust application**. Use a reverse proxy:

**Recommended: Caddy** (automatic HTTPS with Let's Encrypt, zero-config)

```
# Caddyfile
app.example.com {
    reverse_proxy app:8080
    encode gzip
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains"
        X-Content-Type-Options "nosniff"
        X-Frame-Options "DENY"
    }
}
```

Alternative: nginx with certbot for Let's Encrypt.

Add Caddy as a service in Docker Compose:
```yaml
services:
  caddy:
    image: caddy:2-alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - caddy_data:/data
      - caddy_config:/config
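    depends_on:
      - app

# Named volumes used by the caddy service must be declared at the top level
volumes:
  caddy_data:
  caddy_config: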
```

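The Rust app listens on `0.0.0.0:8080` inside its container. The `app` service publishes no `ports:` mapping, so the port is not exposed on the host and is reachable only by Caddy over the internal Docker network.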

### 6.4 Environment Variable Management

Required environment variables:
| Variable | Description | Example |
|---|---|---|
| `MASTER_KEY_SECRET` | 256-bit key for encrypting LLM API keys | 64-char hex string |
| `DATABASE_URL` | SQLite path | `sqlite:///app/data/ai_synth.db` |
| `APP_URL` | Public URL of the application | `https://app.example.com` |
| `SMTP_HOST` | SMTP server for magic link emails | `smtp.resend.com` |
| `SMTP_PORT` | SMTP port | `465` |
| `SMTP_USERNAME` | SMTP username | `resend` |
| `SMTP_PASSWORD` | SMTP password/API key | `re_...` |
| `SMTP_FROM` | Sender email address | `noreply@example.com` |
| `MCAPTCHA_URL` | mCaptcha service URL (if used) | `http://mcaptcha:7000` |
| `MCAPTCHA_SECRET` | mCaptcha secret for server-side token verification | `...` |
| `RUST_LOG` | Log level | `info,ai_weekly_synth=debug` |

Storage:
- `.env` file with permissions `600`, excluded from version control.
- For Docker: use `env_file` directive, or Docker secrets for Swarm deployments.
- NEVER pass secrets as command-line arguments (visible in `ps` output).
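- NEVER use `docker run -e SECRET=value`: the value ends up in shell history and in the process list while the command runs. Note that variables loaded via `env_file` are still visible in `docker inspect`, so access to the Docker socket must be restricted as well.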
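
Since `MASTER_KEY_SECRET` is documented above as a 64-character hex string, the application can validate it once at startup and refuse to run with a malformed key. The sketch below assumes that format; `parse_master_key` and `load_master_key` are hypothetical names, and the error messages deliberately never echo the key material.

```rust
/// Decodes a 64-character hex string into a 32-byte (256-bit) key.
fn parse_master_key(hex: &str) -> Result<[u8; 32], String> {
    let hex = hex.trim();
    if hex.len() != 64 {
        return Err(format!("expected 64 hex characters, got {}", hex.len()));
    }
    // Check every character up front: `from_str_radix` alone would accept
    // a leading '+' sign, which is not a hex digit.
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("key contains a non-hex character".to_string());
    }
    let mut key = [0u8; 32];
    for (i, byte) in key.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16)
            .map_err(|_| "key contains a non-hex character".to_string())?;
    }
    Ok(key)
}

/// Reads and validates `MASTER_KEY_SECRET` so a bad value fails fast.
fn load_master_key() -> Result<[u8; 32], String> {
    let raw = std::env::var("MASTER_KEY_SECRET")
        .map_err(|_| "MASTER_KEY_SECRET is not set".to_string())?;
    parse_master_key(&raw)
}
```

Failing at startup is preferable to discovering a truncated key the first time an API key has to be decrypted.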

### 6.5 Backup Strategy

#### SQLite Backup
- Use SQLite's `.backup` command or the `sqlite3_backup_*` API for consistent hot backups.
- **Do NOT** simply copy the SQLite file while the application is running -- this can result in a corrupted backup.
- Schedule backups via a cron job or a background task in the application:
  ```
  sqlite3 /app/data/ai_synth.db ".backup /app/data/backups/ai_synth_$(date +%Y%m%d_%H%M%S).db"
  ```
- Retain backups for 30 days, with daily rotation.
- Encrypt backups before storing them off-site (if applicable).
- Test restore procedure periodically.
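
The 30-day retention rule above can be sketched as a pure function over backup file names, assuming the `ai_synth_YYYYMMDD_HHMMSS.db` naming produced by the command shown. `backup_date` and `expired_backups` are hypothetical helpers; computing the cutoff date and deleting the files are left to the caller.

```rust
/// Extracts the `YYYYMMDD` stamp from names like `ai_synth_20260321_020000.db`.
fn backup_date(name: &str) -> Option<&str> {
    let stamp = name.strip_prefix("ai_synth_")?.strip_suffix(".db")?;
    let (date, time) = stamp.split_once('_')?;
    let all_digits = |s: &str| s.bytes().all(|b| b.is_ascii_digit());
    (date.len() == 8 && time.len() == 6 && all_digits(date) && all_digits(time))
        .then_some(date)
}

/// Returns the backups dated strictly before `cutoff` (also `YYYYMMDD`).
/// Names that do not match the expected pattern are never selected, so
/// unrelated files in the backup directory are left alone.
fn expired_backups<'a>(names: &[&'a str], cutoff: &str) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        // Zero-padded YYYYMMDD strings sort chronologically as plain text.
        .filter(|name| backup_date(name).is_some_and(|date| date < cutoff))
        .collect()
}
```

Because the comparison relies only on the timestamp embedded in the name, the result does not depend on file modification times, which can change when backups are copied off-site.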

#### Postgres Upgrade Path
When migrating to Postgres:
- Use `pg_dump` for logical backups.
- Consider point-in-time recovery with WAL archiving for production.
- Connection string should use SSL (`sslmode=require`).

---

## 7. Threat Model Summary

| # | Threat | Likelihood | Impact | Mitigation |
|---|---|---|---|---|
| T1 | **LLM API key theft from database** -- An attacker gains read access to the SQLite database (file access, SQL injection, backup leak) and extracts LLM API keys | Medium | High (financial abuse, quota exhaustion) | Encrypt API keys at rest with AES-256-GCM (Section 3.1). Use parameterized queries to prevent SQL injection. Restrict SQLite file permissions. Encrypt backups. |
| T2 | **Session hijacking** -- An attacker steals a session cookie via XSS, network sniffing, or physical access | Medium | High (full account takeover) | HttpOnly + Secure + SameSite cookies (Section 2.2). Enforce HTTPS via Caddy. Implement CSP to mitigate XSS (Section 5.5). Session expiration and idle timeout. |
| T3 | **Prompt injection via user settings** -- A user crafts malicious theme/category/behavior text to manipulate the LLM into producing harmful output or leaking system prompt details | High (easy to attempt) | Medium (misleading content, potential data leak via grounding) | Validate and sanitize user prompt inputs (Section 1.3). Enforce length limits. Log unusual patterns. Keep system instructions and user input structurally separated in the prompt. |
| T4 | **SSRF via custom source URLs** -- A user adds a source URL pointing to internal infrastructure (`http://169.254.169.254/`, `http://localhost:8080/admin/api-keys`) | Medium | High (internal network access, credential theft from cloud metadata) | IP blocklist, scheme restriction, DNS validation before connecting (Section 5.4). Timeouts and response size limits. |
| T5 | **Account takeover via magic link interception** -- An attacker intercepts a magic link email (compromised email account, network sniffing, email forwarding rules) | Low-Medium | High (full account takeover) | Short token expiration (15 min), single-use tokens, session rotation on login (Section 2.1). Users should secure their email accounts (out of scope but documentable). |
| T6 | **Brute-force on authentication endpoints** -- An attacker attempts to guess magic link tokens or flood registration/login endpoints | Medium | Medium (denial of service, account enumeration) | Rate limiting on auth endpoints (Section 2.3). Captcha on registration and magic link requests. Cryptographically random 256-bit tokens make guessing infeasible. |
| T7 | **Cross-Site Request Forgery (CSRF)** -- An attacker tricks an authenticated user into making unintended API calls (e.g., delete all syntheses, change settings) | Low (mitigated by SameSite) | Medium (data loss, settings manipulation) | SameSite=Lax cookies + CSRF token header + Origin validation (Section 2.4). |
| T8 | **Admin privilege escalation** -- A regular user finds a way to access admin endpoints (direct URL access, manipulated request) | Low | Critical (LLM API key exposure, rate limit removal, user session revocation) | Server-side admin check via `AdminUser` extractor (Section 4.3). No client-side-only admin checks. Audit logging of all admin actions. |
| T9 | **Denial of service via expensive LLM operations** -- A user triggers many concurrent synthesis generations, exhausting LLM API quota or server resources | Medium | Medium (service degradation for all users, financial impact) | Per-user rate limiting on generation endpoint (Section 5.3). Queue-based generation with concurrency limits. Admin-configurable rate limits. |
| T10 | **XSS via LLM-generated content** -- The LLM produces article titles or summaries containing HTML/JavaScript that gets rendered unsafely in the SolidJS frontend | Low (frameworks escape by default) | High (session theft, data exfiltration) | SolidJS default escaping. Never use `innerHTML` with LLM output. CSP headers as defense-in-depth. Validate URLs (scheme allowlist). Sanitize HTML if rich text is ever needed. |
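
For the scheme allowlist mentioned under T10, a backend-side check before storing LLM-produced links might look like the sketch below. `has_allowed_scheme` is a hypothetical helper; it is intentionally strict (no trimming, no relative URLs) and does not replace full URL parsing or the SSRF checks of Section 5.4.

```rust
/// Returns true only for absolute `http://` or `https://` URLs.
/// Anything else (`javascript:`, `data:`, leading whitespace, relative
/// paths) is rejected rather than repaired.
fn has_allowed_scheme(url: &str) -> bool {
    ["http://", "https://"].iter().any(|scheme| {
        // `get` returns None instead of panicking when the input is shorter
        // than the scheme or the cut falls inside a multi-byte character.
        url.get(..scheme.len())
            .is_some_and(|prefix| prefix.eq_ignore_ascii_case(scheme))
            && url.len() > scheme.len()
    })
}
```

An allowlist is preferred over blocking known-bad schemes because obfuscated variants of `javascript:` fail the prefix check by construction.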

---

## Summary of Key Architectural Decisions

1. **All LLM API calls on the backend**: Eliminates the most critical current vulnerability (exposed API key).
2. **Session-based auth with secure cookies**: Replaces Firebase Auth. Simpler, no third-party dependency, full control.
3. **AES-256-GCM encryption for API keys at rest**: Protects against database leaks.
4. **SSRF-safe URL scraping**: Backend replaces CORS proxies with direct HTTP requests + IP blocklist.
5. **Defense-in-depth**: Multiple layers (CSP, CSRF tokens, SameSite cookies, rate limiting, input validation) rather than relying on any single control.
6. **Audit logging**: All admin and security-relevant actions are logged for incident response.
7. **Self-hosted captcha (mCaptcha)**: Aligns with the "no external dependencies" philosophy.
8. **Caddy for TLS termination**: Automatic HTTPS, minimal configuration, runs in Docker.