ai_synth/docs/audit/rust-expert-report.md

# Rust Backend Audit Report

**Date:** 2026-03-26
**Scope:** `backend/src/` -- all services, handlers, middleware, models, DB layer, and integration tests.
**Rust edition:** 2021 | **Key crates:** axum 0.8, sqlx 0.8, tokio 1, reqwest 0.12, aes-gcm 0.10, dashmap 6

---

## Executive Summary

The codebase is remarkably well-structured for a learning project. It demonstrates strong idiomatic Rust across most dimensions: proper error propagation with `?`, thoughtful enum design (`AppError`), `Arc`-based shared state, and zero `unsafe` code. The architecture cleanly separates concerns (handlers / services / db / models), and the test suite covers all critical paths.

That said, the audit identifies **three categories of findings**:

1. **Code quality issues** (moderate priority): Significant code duplication in the synthesis pipeline, a potential panic in `sanitize_error_message`, and unnecessary allocations in hot paths.
2. **Design opportunities** (lower priority): Missing newtype patterns, opportunities to reduce `Clone` overhead via `Arc<str>`, and some functions with too many parameters that would benefit from builder/struct patterns.
3. **Concurrency subtlety** (low probability, worth noting): A TOCTOU race in `JobStore::create_job` between checking for existing jobs and inserting a new one.

No critical security issues were found. SQL injection is comprehensively prevented via sqlx parameterized queries. Secrets are never exposed in error responses. SSRF protection is thorough with a documented residual risk.

---

## Idiomatic Rust

### Ownership & Borrowing

**Strengths:**
- `MasterKey` intentionally omits `Clone` to prevent key material duplication -- an excellent security-conscious design (`encryption.rs:18`).
- Functions consistently accept `&str` / `&[u8]` over owned types where possible (e.g., `encrypt(master_key: &MasterKey, plaintext: &str)`).
- The `AuthUser` extractor cleanly borrows from request parts and clones only the necessary fields.

**Findings:**

1. **`AppConfig` holds secrets as `String` and derives `Clone`** (`config.rs:9-10`). Every `AppState` clone copies the `master_encryption_key` and `session_secret` in full. Wrapping `AppConfig` in `Arc<AppConfig>` would eliminate these copies and reduce the blast radius of the key material in memory.

2. **`extract_session_token` allocates a `format!` string on every cookie segment** (`middleware/auth.rs:40-41`). The two `format!("{}=", cookie_name)` calls create heap-allocated strings for every iteration. These could be computed once before the iterator chain:
   ```rust
   let prefix = format!("{}=", cookie_name);
   cookie_header.split(';').map(|s| s.trim())
       .find(|s| s.starts_with(&prefix))
       .and_then(|s| s.strip_prefix(&prefix))
       .map(|s| s.to_string())
   ```

3. **Excessive cloning in synthesis pipeline** (`synthesis.rs`, ~59 `.clone()` calls). Many of these are necessary for spawned tasks, but several could be avoided:
   - `model_research.clone()` is cloned into every spawned task. Wrapping it in `Arc<str>` would make clones zero-cost.
   - `classification_categories.clone()` clones a `Vec<String>` per batch. `Arc<Vec<String>>` would be preferable.
   - `classify_schema.clone()` clones a `serde_json::Value` per article. As an immutable value, this should be `Arc<Value>`.

### Error Handling

**Strengths:**
- `AppError` is a clean, well-designed enum with `thiserror` derivation and `IntoResponse` implementation.
- Internal errors are properly logged but never exposed to clients (`errors.rs:66-73`).
- The `From<sqlx::Error>` and `From<anyhow::Error>` conversions are correct and idiomatic.
- `.ok()` is used appropriately for fire-and-forget operations like logging (`synthesis.rs:953`).

**Findings:**

4. **`msg[..200]` in `sanitize_error_message` can panic on non-ASCII** (`synthesis.rs:1313`). If the error message contains multi-byte UTF-8 characters and the 200th byte falls in the middle of a character, this will panic at runtime. Fix: use `msg.chars().take(200).collect::<String>()` or `msg.char_indices()` to find a safe boundary.

5. **`Selector::parse("a[href]").unwrap()` in production code** (`source_scraper.rs:73,130`). While `Selector::parse` with a static, known-valid selector string will never fail in practice, it would be more idiomatic to use `lazy_static!` or `std::sync::LazyLock` to parse the selector once and reuse it. This also avoids repeated parsing overhead.

6. **`unwrap_or("Unknown error")` in error mappers** (`gemini.rs:156`, `openai.rs:163`, `anthropic.rs:199`). These are safe because the value is only used for logging, but using `unwrap_or_default()` would be slightly more consistent with the rest of the codebase.

### Iterator Patterns

**Strengths:**
- Excellent use of iterator chains throughout: `filter_map`, `find_map`, `chain`, `collect`, `take`, `any`, `position`.
- `extract_links_from_html` and `extract_links_as_pairs` are clean, functional-style implementations.
- `sanitize_json_null_bytes` is an elegant recursive function using `into_iter().map().collect()`.

**Findings:**

7. **`insert_urls` in `article_history.rs:78-87` executes individual INSERT queries in a loop.** For bulk inserts, this should use a single query with `UNNEST` or build a multi-row INSERT, which would be both faster and more idiomatic:
   ```rust
   // Instead of N individual queries:
   sqlx::query("INSERT INTO ... SELECT * FROM UNNEST($1::uuid[], $2::text[], $3::text[]) ...")
   ```

### Trait Usage

**Strengths:**
- `LlmProvider` trait (`llm/mod.rs:22-39`) is well-designed: `async_trait` is correctly applied, `Send + Sync` bounds enable use behind `Arc<dyn LlmProvider>`.
- `FromRequestParts` implementation for `AuthUser` and `AdminUser` is textbook Axum usage.
- `TryFrom<SettingsRow> for UserSettings` cleanly separates DB row types from domain types.

**Findings:**

8. **`async_trait` could be removed in favor of native async traits.** Since Rust 1.75+ (stable), async traits work natively for `dyn`-compatible traits. The `async_trait` crate adds hidden `Box::pin` allocations on every call. However, the current edition is 2021, and `async_trait` is still the pragmatic choice for trait objects in some contexts. Worth revisiting when upgrading to edition 2024.

---

## Async Patterns

### Concurrency

**Strengths:**
- `JoinSet` is used correctly for bounded concurrent scraping (`synthesis.rs:296-372`). The manual "seed initial + spawn on completion" pattern correctly limits concurrency to 5.
- `tokio::sync::watch` is an excellent choice for SSE progress reporting -- exactly the right channel type for "latest value" semantics.
- The `_rx` field in `JobEntry` (`synthesis.rs:71`) is a smart pattern: keeping a receiver alive prevents `send()` from returning an error.

**Findings:**

9. **TOCTOU race in `JobStore::create_job`** (`synthesis.rs:108-145`). The function iterates all entries to check for existing active jobs, then inserts a new entry. Between the check and the insert, another thread could create a job for the same user. While unlikely (generation is user-initiated and the check in `trigger_generate` provides a first-line defense), the `DashMap` entry API could enforce atomicity:
   ```rust
   // Atomic check-and-insert would prevent the race
   ```
   In practice this is low-risk because the handler also checks `has_active_job` before calling `create_job`, and the window is microseconds.

10. **Spawned tasks silently swallow panics** (`generation.rs:73-75`). The `tokio::spawn` for `run_generation` does not handle `JoinError::Panic`. If the generation pipeline panics (e.g., due to finding #4 above), the spawned task silently dies, and the SSE stream never sends a completion/error event. The client would hang indefinitely. Consider:
    ```rust
    tokio::spawn(async move {
        let result = tokio::spawn(synthesis::run_generation(...)).await;
        if let Err(join_error) = result {
            tx.send(ProgressEvent::Error { message: "Internal error".into() }).ok();
        }
    });
    ```

### Error Propagation in Spawned Tasks

**Strengths:**
- `run_generation` wraps `run_generation_inner` cleanly, converting `Result` into progress events (`synthesis.rs:213-230`).
- Error sanitization prevents API key leakage in SSE error events (`synthesis.rs:1291-1317`).

**Findings:**

11. **`check_rate_limit` uses a busy-wait retry loop with `tokio::time::sleep`** (`synthesis.rs:1029-1055`). While the sleep prevents CPU spinning, the `check()` call *consumes* a rate limit slot even when the request is denied. On each loop iteration, `check()` is called, which records a timestamp if allowed but also modifies state. This means that repeated denied checks do not consume slots (only successful ones do), so this is actually correct -- but the code comment could be clearer about why `check()` is safe to call in a loop.

### Channel Usage

**Strengths:**
- `watch::channel` is the ideal choice for progress events: multiple subscribers, always the latest value, no backpressure needed.
- `WatchStream` adapter in the SSE handler (`generation.rs:113`) correctly bridges the watch channel to an SSE stream.
- Keepalive pings every 15s prevent proxy timeouts (`generation.rs:126`).

---

## Type System

### Missing Newtype Opportunities

12. **User IDs, Job IDs, and Synthesis IDs are all `Uuid`** throughout the codebase. A newtype pattern would prevent accidentally passing a `user_id` where a `job_id` is expected:
    ```rust
    pub struct UserId(pub Uuid);
    pub struct JobId(pub Uuid);
    pub struct SynthesisId(pub Uuid);
    ```
    The `trace_article` function takes 3 `Uuid` parameters (`synthesis.rs:927-930`), making it easy to swap `user_id` and `job_id` silently.

13. **Provider names are bare `String`/`&str`** everywhere. A `ProviderName` enum (`Gemini | OpenAI | Anthropic`) would eliminate the catch-all `_ =>` branches in match statements and make invalid provider names a compile-time error rather than a runtime error.

### Type Safety Gaps

14. **`RateLimitConfigRow` casts `i32` to `u32`/`u64` without validation** (`rate_limiter.rs:188-189`):
    ```rust
    max_requests: row.max_requests as u32,
    time_window: Duration::from_secs(row.time_window_seconds as u64),
    ```
    Negative values from the database would silently wrap to large positive values. A `TryFrom` or explicit validation would be safer.

15. **`UserRateLimitEntry::new` casts `i32` to `usize`/`u64`** (`app_state.rs:46-48`) with the same wrapping risk for negative values.

---

## Memory & Performance

### Unnecessary Allocations

16. **`format!("category_{}", i)` is computed repeatedly** in inner loops (`synthesis.rs:514-518, 720-725`). These category keys are deterministic and could be pre-computed once into a `Vec<String>` before the main loop.

17. **`body_text.chars().take(500).collect::<String>()`** (`synthesis.rs:467,676`). This allocates a new string for each article. Using `&body_text[..find_char_boundary(body_text, 500)]` would avoid the allocation entirely (with a helper to find a safe UTF-8 boundary).

18. **`hash_article_url` recomputes the hash multiple times for the same URL** during history filtering (`synthesis.rs:377-386`). The hash is computed once in line 377 for the batch, then again in line 381 for each filtered URL. This is redundant -- the batch result could use a `HashMap<String, String>` (hash -> url) for O(1) lookups.

19. **`Selector::parse` is called per-invocation** in `extract_page_title`, `detect_soft_404`, `detect_canonical_error`, etc. (`scraper.rs:325-410`). These CSS selectors are constants that should be parsed once and cached. Using `std::sync::LazyLock` (stable in Rust 1.80) or `once_cell::sync::Lazy`:
    ```rust
    static TITLE_SEL: LazyLock<Selector> = LazyLock::new(|| Selector::parse("title").unwrap());
    ```

### Clone Efficiency

20. **`AppConfig` is cloned when creating `AppState`** (`main.rs:59`), copying all `String` fields including the `master_encryption_key`. Since `AppConfig` is effectively immutable after startup, wrapping it in `Arc` would be more appropriate.

21. **`settings.categories.clone()` at `synthesis.rs:261`** clones a `Vec<String>` that is only used to build `classification_categories`. This could use a reference or be constructed more efficiently:
    ```rust
    let classification_categories: Vec<&str> = settings.categories.iter()
        .map(|s| s.as_str())
        .chain(std::iter::once("Autre"))
        .collect();
    ```
    (Though this would require adjusting downstream code that expects owned strings.)

---

## Safety Assessment

### Unwrap Usage

**No `unwrap()` calls exist in production code paths** (non-test, non-`cfg(test)` code), with two exceptions:

22. **`Selector::parse("a[href]").unwrap()`** in `source_scraper.rs:73,130` -- These are static selectors that will never fail. Acceptable but could use `LazyLock` for clarity.

23. **`unwrap_or("")` and `unwrap_or("Unknown error")`** in LLM error mappers -- These are safe fallback patterns.

**`expect()` usage** is limited to:
- `main.rs:122,128` -- signal handler installation (correct; failure here is unrecoverable)
- `auth.rs:47,104` -- `TimeDelta::try_minutes/try_days` with constant values (correct; these are compile-time-knowable values)

### Input Validation

**Strengths:**
- Email validation via the `email_address` crate.
- Turnstile captcha verification on auth endpoints.
- CSRF protection via `X-Requested-With` header check.
- URL scheme validation and SSRF prevention with comprehensive IPv4/IPv6 coverage.
- API key prefix extraction prevents full key exposure.
- Body size limits on scraper responses (`MAX_BODY_SIZE = 5MB`).

**Findings:**

24. **No input length validation on LLM prompts.** The `build_search_prompt` and `build_article_classify_prompt` functions accept arbitrary-length category names and theme strings from user settings. An extremely long theme string could lead to excessive LLM token usage. Consider capping `settings.theme` length at the handler/validation layer.

25. **`source_url` is not validated against SSRF before fetching in `source_scraper.rs:41-48`.** The `extract_article_links` function uses the shared `http_client` (which is the scraper client with SSRF-safe redirect handling), but the initial DNS resolution check (`check_ssrf`) is not performed before the GET request. The redirect policy catches private IPs on redirects, but the initial request to a private IP would succeed. The `extract_article_links_with_llm` function has the same issue.

### SQL Safety

**All SQL queries use parameterized bindings via sqlx** (`$1`, `$2`, etc.). No string interpolation is used in any SQL query. The `rate_limiter.rs:173-178` direct query is also properly parameterized.

**sqlx compile-time query checking** is not used (`query_as::<_, Type>(...)` with runtime string queries rather than `sqlx::query!` macro). This is a deliberate choice documented in CLAUDE.md ("runtime-checked queries"), but means SQL errors are caught at runtime rather than compile time.

---

## Testing Assessment

**Strengths:**
- **Unit tests** are comprehensive across all modules: `errors.rs` (7 tests), `encryption.rs` (8 tests), `rate_limiter.rs` (13 tests), `scraper.rs` (20+ tests), `source_scraper.rs` (12 tests), `synthesis.rs` (10+ tests), all LLM providers (8-10 tests each), `schema.rs` (10 tests), `config.rs` (7 tests).
- **Integration tests** cover the full HTTP stack: auth flow, CSRF protection, settings, sources, admin, syntheses, API keys, export.
- **Security-specific tests**: SSRF IP checks (14 tests), soft-404 detection (6+ tests), error message sanitization, internal error hiding.
- **Edge cases** are well-covered: empty inputs, wrong keys, corrupted data, missing fields.

**Findings:**

26. **No tests for the synthesis generation pipeline itself.** The `run_generation_inner` function -- the most complex code in the project (~700 lines) -- has no unit or integration tests. Its helper functions are tested (`parse_llm_output`, `normalize_article_url`, `rotate_sources`, `sanitize_json_null_bytes`, `sanitize_error_message`), but the pipeline orchestration is untested. This is understandable given the complexity of mocking LLM providers and scraper results, but introducing a trait-based architecture for the scraper and DB layer would enable integration testing.

27. **Integration tests skip silently when `TEST_DATABASE_URL` is not set.** Each test starts with `if !require_test_db() { return; }`. This means CI must explicitly set the env var, and a misconfigured CI pipeline would report all tests as passing. Consider using `#[cfg(feature = "integration")]` or a custom test harness to make the requirement explicit.

28. **No concurrent/stress tests for rate limiters.** The `RateLimiter` and `ProviderRateLimiter` are tested for correctness but not for thread safety under contention. Given they use `DashMap` for lock-free access, a multi-threaded stress test would increase confidence.

29. **Missing negative tests for `create_provider` with empty API key.** The factory tests check unknown providers but not empty or whitespace-only API keys.

---

## Prioritized Recommendations

### High Priority (correctness/safety)

| # | Finding | File | Fix |
|---|---------|------|-----|
| 4 | `msg[..200]` panics on multi-byte UTF-8 | `synthesis.rs:1313` | Use `msg.chars().take(200)` or `char_indices()` |
| 10 | Spawned generation task swallows panics | `generation.rs:73-75` | Wrap in double-spawn or catch `JoinError` |
| 25 | Missing SSRF check on initial source page fetch | `source_scraper.rs:41` | Call `check_ssrf` before `http_client.get()` |

### Medium Priority (performance/maintainability)

| # | Finding | File | Fix |
|---|---------|------|-----|
| 1 | `AppConfig` cloned with secrets | `config.rs`, `main.rs:59` | Wrap in `Arc<AppConfig>` |
| 3 | Excessive cloning in synthesis pipeline | `synthesis.rs` (59 clones) | Use `Arc<str>`, `Arc<Value>`, `Arc<Vec<String>>` for shared immutables |
| 7 | N+1 individual INSERTs for article history | `article_history.rs:78-87` | Use batch INSERT or `UNNEST` |
| 19 | `Selector::parse` called per-invocation | `scraper.rs` | Use `LazyLock` for static selectors |
| 26 | No tests for generation pipeline | `synthesis.rs` | Introduce trait-based mocks for scraper/LLM/DB |

### Low Priority (code quality/future-proofing)

| # | Finding | File | Fix |
|---|---------|------|-----|
| 2 | `format!` allocation on every cookie segment | `middleware/auth.rs:40-41` | Pre-compute prefix string |
| 9 | TOCTOU race in `JobStore::create_job` | `synthesis.rs:108-145` | Use DashMap entry API for atomic check-and-insert |
| 12 | UUID parameters lack type distinction | Throughout | Introduce newtype wrappers |
| 13 | Provider names are bare strings | Throughout | Use `ProviderName` enum |
| 14 | `i32` to `u32` cast without validation | `rate_limiter.rs:188` | Use `TryFrom` or add DB constraint |
| 16 | Repeated `format!("category_{}", i)` | `synthesis.rs` | Pre-compute category keys |
| 24 | No length validation on LLM prompt inputs | Settings/synthesis | Cap `theme` and category name lengths |

### Code Duplication Worth Addressing

The **classify/summarize batch processing** logic appears twice in `synthesis.rs`: once for Phase 1 (personalized sources, lines ~458-541) and once for Phase 2 Brave search path (lines ~667-749). These are nearly identical blocks of ~80 lines each. Extracting a shared function like `classify_batch(articles, provider, model, categories, ...)` would significantly reduce the maintenance burden and risk of the two paths diverging.

---

## Appendix: Dependency Assessment

| Crate | Version | Assessment |
|-------|---------|------------|
| `axum` | 0.8 | Current, well-maintained |
| `sqlx` | 0.8 | Current, runtime-checked queries are a deliberate choice |
| `tokio` | 1 | Standard async runtime, full features enabled |
| `reqwest` | 0.12 | Current |
| `aes-gcm` | 0.10 | Current, correct choice for authenticated encryption |
| `zeroize` | 1 | Correctly applied to `MasterKey` |
| `dashmap` | 6 | Good choice for concurrent maps, correctly used |
| `async-trait` | 0.1 | Could be replaced by native async traits (Rust 1.75+) |
| `anyhow` + `thiserror` | 1 + 2 | Standard error handling combo, correctly used |
| `scraper` | 0.22 | HTML parsing, adequate for the use case |
| `rand` | 0.8 | Note: 0.9 is available, but 0.8 is still widely used |

No unnecessary or abandoned dependencies were identified. The dependency set is lean and focused.