You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

20 KiB

Rust Backend Audit Report

Date: 2026-03-26 Scope: backend/src/ -- all services, handlers, middleware, models, DB layer, and integration tests. Rust edition: 2021 | Key crates: axum 0.8, sqlx 0.8, tokio 1, reqwest 0.12, aes-gcm 0.10, dashmap 6


Executive Summary

The codebase is remarkably well-structured for a learning project. It demonstrates strong idiomatic Rust across most dimensions: proper error propagation with ?, thoughtful enum design (AppError), Arc-based shared state, and zero unsafe code. The architecture cleanly separates concerns (handlers / services / db / models), and the test suite covers all critical paths.

That said, the audit identifies three categories of findings:

  1. Code quality issues (moderate priority): Significant code duplication in the synthesis pipeline, a potential panic in sanitize_error_message, and unnecessary allocations in hot paths.
  2. Design opportunities (lower priority): Missing newtype patterns, opportunities to reduce Clone overhead via Arc<str>, and some functions with too many parameters that would benefit from builder/struct patterns.
  3. Concurrency subtlety (low probability, worth noting): A TOCTOU race in JobStore::create_job between checking for existing jobs and inserting a new one.

No critical security issues were found. SQL injection is comprehensively prevented via sqlx parameterized queries. Secrets are never exposed in error responses. SSRF protection is thorough with a documented residual risk.


Idiomatic Rust

Ownership & Borrowing

Strengths:

  • MasterKey intentionally omits Clone to prevent key material duplication -- an excellent security-conscious design (encryption.rs:18).
  • Functions consistently accept &str / &[u8] over owned types where possible (e.g., encrypt(master_key: &MasterKey, plaintext: &str)).
  • The AuthUser extractor cleanly borrows from request parts and clones only the necessary fields.

Findings:

  1. AppConfig holds secrets as String and derives Clone (config.rs:9-10). Every AppState clone copies the master_encryption_key and session_secret in full. Wrapping AppConfig in Arc<AppConfig> would eliminate these copies and reduce the blast radius of the key material in memory.

  2. extract_session_token allocates a format! string on every cookie segment (middleware/auth.rs:40-41). The two format!("{}=", cookie_name) calls create heap-allocated strings for every iteration. These could be computed once before the iterator chain:

    let prefix = format!("{}=", cookie_name);
    cookie_header.split(';').map(|s| s.trim())
        .find(|s| s.starts_with(&prefix))
        .and_then(|s| s.strip_prefix(&prefix))
        .map(|s| s.to_string())
    
  3. Excessive cloning in synthesis pipeline (synthesis.rs, ~59 .clone() calls). Many of these are necessary for spawned tasks, but several could be avoided:

    • model_research.clone() is cloned into every spawned task. Wrapping it in Arc<str> would make clones zero-cost.
    • classification_categories.clone() clones a Vec<String> per batch. Arc<Vec<String>> would be preferable.
    • classify_schema.clone() clones a serde_json::Value per article. As an immutable value, this should be Arc<Value>.

Error Handling

Strengths:

  • AppError is a clean, well-designed enum with thiserror derivation and IntoResponse implementation.
  • Internal errors are properly logged but never exposed to clients (errors.rs:66-73).
  • The From<sqlx::Error> and From<anyhow::Error> conversions are correct and idiomatic.
  • .ok() is used appropriately for fire-and-forget operations like logging (synthesis.rs:953).

Findings:

  1. msg[..200] in sanitize_error_message can panic on non-ASCII (synthesis.rs:1313). If the error message contains multi-byte UTF-8 characters and the 200th byte falls in the middle of a character, this will panic at runtime. Fix: use msg.chars().take(200).collect::<String>() or msg.char_indices() to find a safe boundary.

  2. Selector::parse("a[href]").unwrap() in production code (source_scraper.rs:73,130). While Selector::parse with a static, known-valid selector string will never fail in practice, it would be more idiomatic to use lazy_static! or std::sync::LazyLock to parse the selector once and reuse it. This also avoids repeated parsing overhead.

  3. unwrap_or("Unknown error") in error mappers (gemini.rs:156, openai.rs:163, anthropic.rs:199). These are safe because the value is only used for logging, but using unwrap_or_default() would be slightly more consistent with the rest of the codebase.

Iterator Patterns

Strengths:

  • Excellent use of iterator chains throughout: filter_map, find_map, chain, collect, take, any, position.
  • extract_links_from_html and extract_links_as_pairs are clean, functional-style implementations.
  • sanitize_json_null_bytes is an elegant recursive function using into_iter().map().collect().

Findings:

  1. insert_urls in article_history.rs:78-87 executes individual INSERT queries in a loop. For bulk inserts, this should use a single query with UNNEST or build a multi-row INSERT, which would be both faster and more idiomatic:
    // Instead of N individual queries:
    sqlx::query("INSERT INTO ... SELECT * FROM UNNEST($1::uuid[], $2::text[], $3::text[]) ...")
    

Trait Usage

Strengths:

  • LlmProvider trait (llm/mod.rs:22-39) is well-designed: async_trait is correctly applied, Send + Sync bounds enable use behind Arc<dyn LlmProvider>.
  • FromRequestParts implementation for AuthUser and AdminUser is textbook Axum usage.
  • TryFrom<SettingsRow> for UserSettings cleanly separates DB row types from domain types.

Findings:

  1. async_trait could be removed in favor of native async traits. Since Rust 1.75+ (stable), async traits work natively for dyn-compatible traits. The async_trait crate adds hidden Box::pin allocations on every call. However, the current edition is 2021, and async_trait is still the pragmatic choice for trait objects in some contexts. Worth revisiting when upgrading to edition 2024.

Async Patterns

Concurrency

Strengths:

  • JoinSet is used correctly for bounded concurrent scraping (synthesis.rs:296-372). The manual "seed initial + spawn on completion" pattern correctly limits concurrency to 5.
  • tokio::sync::watch is an excellent choice for SSE progress reporting -- exactly the right channel type for "latest value" semantics.
  • The _rx field in JobEntry (synthesis.rs:71) is a smart pattern: keeping a receiver alive prevents send() from returning an error.

Findings:

  1. TOCTOU race in JobStore::create_job (synthesis.rs:108-145). The function iterates all entries to check for existing active jobs, then inserts a new entry. Between the check and the insert, another thread could create a job for the same user. While unlikely (generation is user-initiated and the check in trigger_generate provides a first-line defense), the DashMap entry API could enforce atomicity:

    // Atomic check-and-insert would prevent the race
    

    In practice this is low-risk because the handler also checks has_active_job before calling create_job, and the window is microseconds.

  2. Spawned tasks silently swallow panics (generation.rs:73-75). The tokio::spawn for run_generation does not handle JoinError::Panic. If the generation pipeline panics (e.g., due to finding #4 above), the spawned task silently dies, and the SSE stream never sends a completion/error event. The client would hang indefinitely. Consider:

    tokio::spawn(async move {
        let result = tokio::spawn(synthesis::run_generation(...)).await;
        if let Err(join_error) = result {
            tx.send(ProgressEvent::Error { message: "Internal error".into() }).ok();
        }
    });
    

Error Propagation in Spawned Tasks

Strengths:

  • run_generation wraps run_generation_inner cleanly, converting Result into progress events (synthesis.rs:213-230).
  • Error sanitization prevents API key leakage in SSE error events (synthesis.rs:1291-1317).

Findings:

  1. check_rate_limit uses a busy-wait retry loop with tokio::time::sleep (synthesis.rs:1029-1055). While the sleep prevents CPU spinning, the check() call consumes a rate limit slot even when the request is denied. On each loop iteration, check() is called, which records a timestamp if allowed but also modifies state. This means that repeated denied checks do not consume slots (only successful ones do), so this is actually correct -- but the code comment could be clearer about why check() is safe to call in a loop.

Channel Usage

Strengths:

  • watch::channel is the ideal choice for progress events: multiple subscribers, always the latest value, no backpressure needed.
  • WatchStream adapter in the SSE handler (generation.rs:113) correctly bridges the watch channel to an SSE stream.
  • Keepalive pings every 15s prevent proxy timeouts (generation.rs:126).

Type System

Missing Newtype Opportunities

  1. User IDs, Job IDs, and Synthesis IDs are all Uuid throughout the codebase. A newtype pattern would prevent accidentally passing a user_id where a job_id is expected:

    pub struct UserId(pub Uuid);
    pub struct JobId(pub Uuid);
    pub struct SynthesisId(pub Uuid);
    

    The trace_article function takes 3 Uuid parameters (synthesis.rs:927-930), making it easy to swap user_id and job_id silently.

  2. Provider names are bare String/&str everywhere. A ProviderName enum (Gemini | OpenAI | Anthropic) would eliminate the catch-all _ => branches in match statements and make invalid provider names a compile-time error rather than a runtime error.

Type Safety Gaps

  1. RateLimitConfigRow casts i32 to u32/u64 without validation (rate_limiter.rs:188-189):

    max_requests: row.max_requests as u32,
    time_window: Duration::from_secs(row.time_window_seconds as u64),
    

    Negative values from the database would silently wrap to large positive values. A TryFrom or explicit validation would be safer.

  2. UserRateLimitEntry::new casts i32 to usize/u64 (app_state.rs:46-48) with the same wrapping risk for negative values.


Memory & Performance

Unnecessary Allocations

  1. format!("category_{}", i) is computed repeatedly in inner loops (synthesis.rs:514-518, 720-725). These category keys are deterministic and could be pre-computed once into a Vec<String> before the main loop.

  2. body_text.chars().take(500).collect::<String>() (synthesis.rs:467,676). This allocates a new string for each article. Using &body_text[..find_char_boundary(body_text, 500)] would avoid the allocation entirely (with a helper to find a safe UTF-8 boundary).

  3. hash_article_url recomputes the hash multiple times for the same URL during history filtering (synthesis.rs:377-386). The hash is computed once in line 377 for the batch, then again in line 381 for each filtered URL. This is redundant -- the batch result could use a HashMap<String, String> (hash -> url) for O(1) lookups.

  4. Selector::parse is called per-invocation in extract_page_title, detect_soft_404, detect_canonical_error, etc. (scraper.rs:325-410). These CSS selectors are constants that should be parsed once and cached. Using std::sync::LazyLock (stable in Rust 1.80) or once_cell::sync::Lazy:

    static TITLE_SEL: LazyLock<Selector> = LazyLock::new(|| Selector::parse("title").unwrap());
    

Clone Efficiency

  1. AppConfig is cloned when creating AppState (main.rs:59), copying all String fields including the master_encryption_key. Since AppConfig is effectively immutable after startup, wrapping it in Arc would be more appropriate.

  2. settings.categories.clone() at synthesis.rs:261 clones a Vec<String> that is only used to build classification_categories. This could use a reference or be constructed more efficiently:

    let classification_categories: Vec<&str> = settings.categories.iter()
        .map(|s| s.as_str())
        .chain(std::iter::once("Autre"))
        .collect();
    

    (Though this would require adjusting downstream code that expects owned strings.)


Safety Assessment

Unwrap Usage

No unwrap() calls exist in production code paths (non-test, non-cfg(test) code), with two exceptions:

  1. Selector::parse("a[href]").unwrap() in source_scraper.rs:73,130 -- These are static selectors that will never fail. Acceptable but could use LazyLock for clarity.

  2. unwrap_or("") and unwrap_or("Unknown error") in LLM error mappers -- These are safe fallback patterns.

expect() usage is limited to:

  • main.rs:122,128 -- signal handler installation (correct; failure here is unrecoverable)
  • auth.rs:47,104 -- TimeDelta::try_minutes/try_days with constant values (correct; these are compile-time-knowable values)

Input Validation

Strengths:

  • Email validation via the email_address crate.
  • Turnstile captcha verification on auth endpoints.
  • CSRF protection via X-Requested-With header check.
  • URL scheme validation and SSRF prevention with comprehensive IPv4/IPv6 coverage.
  • API key prefix extraction prevents full key exposure.
  • Body size limits on scraper responses (MAX_BODY_SIZE = 5MB).

Findings:

  1. No input length validation on LLM prompts. The build_search_prompt and build_article_classify_prompt functions accept arbitrary-length category names and theme strings from user settings. An extremely long theme string could lead to excessive LLM token usage. Consider capping settings.theme length at the handler/validation layer.

  2. source_url is not validated against SSRF before fetching in source_scraper.rs:41-48. The extract_article_links function uses the shared http_client (which is the scraper client with SSRF-safe redirect handling), but the initial DNS resolution check (check_ssrf) is not performed before the GET request. The redirect policy catches private IPs on redirects, but the initial request to a private IP would succeed. The extract_article_links_with_llm function has the same issue.

SQL Safety

All SQL queries use parameterized bindings via sqlx ($1, $2, etc.). No string interpolation is used in any SQL query. The rate_limiter.rs:173-178 direct query is also properly parameterized.

sqlx compile-time query checking is not used (query_as::<_, Type>(...) with runtime string queries rather than sqlx::query! macro). This is a deliberate choice documented in CLAUDE.md ("runtime-checked queries"), but means SQL errors are caught at runtime rather than compile time.


Testing Assessment

Strengths:

  • Unit tests are comprehensive across all modules: errors.rs (7 tests), encryption.rs (8 tests), rate_limiter.rs (13 tests), scraper.rs (20+ tests), source_scraper.rs (12 tests), synthesis.rs (10+ tests), all LLM providers (8-10 tests each), schema.rs (10 tests), config.rs (7 tests).
  • Integration tests cover the full HTTP stack: auth flow, CSRF protection, settings, sources, admin, syntheses, API keys, export.
  • Security-specific tests: SSRF IP checks (14 tests), soft-404 detection (6+ tests), error message sanitization, internal error hiding.
  • Edge cases are well-covered: empty inputs, wrong keys, corrupted data, missing fields.

Findings:

  1. No tests for the synthesis generation pipeline itself. The run_generation_inner function -- the most complex code in the project (~700 lines) -- has no unit or integration tests. Its helper functions are tested (parse_llm_output, normalize_article_url, rotate_sources, sanitize_json_null_bytes, sanitize_error_message), but the pipeline orchestration is untested. This is understandable given the complexity of mocking LLM providers and scraper results, but introducing a trait-based architecture for the scraper and DB layer would enable integration testing.

  2. Integration tests skip silently when TEST_DATABASE_URL is not set. Each test starts with if !require_test_db() { return; }. This means CI must explicitly set the env var, and a misconfigured CI pipeline would report all tests as passing. Consider using #[cfg(feature = "integration")] or a custom test harness to make the requirement explicit.

  3. No concurrent/stress tests for rate limiters. The RateLimiter and ProviderRateLimiter are tested for correctness but not for thread safety under contention. Given they use DashMap for lock-free access, a multi-threaded stress test would increase confidence.

  4. Missing negative tests for create_provider with empty API key. The factory tests check unknown providers but not empty or whitespace-only API keys.


Prioritized Recommendations

High Priority (correctness/safety)

# Finding File Fix
4 msg[..200] panics on multi-byte UTF-8 synthesis.rs:1313 Use msg.chars().take(200) or char_indices()
10 Spawned generation task swallows panics generation.rs:73-75 Wrap in double-spawn or catch JoinError
25 Missing SSRF check on initial source page fetch source_scraper.rs:41 Call check_ssrf before http_client.get()

Medium Priority (performance/maintainability)

# Finding File Fix
1 AppConfig cloned with secrets config.rs, main.rs:59 Wrap in Arc<AppConfig>
3 Excessive cloning in synthesis pipeline synthesis.rs (59 clones) Use Arc<str>, Arc<Value>, Arc<Vec<String>> for shared immutables
7 N+1 individual INSERTs for article history article_history.rs:78-87 Use batch INSERT or UNNEST
19 Selector::parse called per-invocation scraper.rs Use LazyLock for static selectors
26 No tests for generation pipeline synthesis.rs Introduce trait-based mocks for scraper/LLM/DB

Low Priority (code quality/future-proofing)

# Finding File Fix
2 format! allocation on every cookie segment middleware/auth.rs:40-41 Pre-compute prefix string
9 TOCTOU race in JobStore::create_job synthesis.rs:108-145 Use DashMap entry API for atomic check-and-insert
12 UUID parameters lack type distinction Throughout Introduce newtype wrappers
13 Provider names are bare strings Throughout Use ProviderName enum
14 i32 to u32 cast without validation rate_limiter.rs:188 Use TryFrom or add DB constraint
16 Repeated format!("category_{}", i) synthesis.rs Pre-compute category keys
24 No length validation on LLM prompt inputs Settings/synthesis Cap theme and category name lengths

Code Duplication Worth Addressing

The classify/summarize batch processing logic appears twice in synthesis.rs: once for Phase 1 (personalized sources, lines ~458-541) and once for Phase 2 Brave search path (lines ~667-749). These are nearly identical blocks of ~80 lines each. Extracting a shared function like classify_batch(articles, provider, model, categories, ...) would significantly reduce the maintenance burden and risk of the two paths diverging.


Appendix: Dependency Assessment

Crate Version Assessment
axum 0.8 Current, well-maintained
sqlx 0.8 Current, runtime-checked queries are a deliberate choice
tokio 1 Standard async runtime, full features enabled
reqwest 0.12 Current
aes-gcm 0.10 Current, correct choice for authenticated encryption
zeroize 1 Correctly applied to MasterKey
dashmap 6 Good choice for concurrent maps, correctly used
async-trait 0.1 Could be replaced by native async traits (Rust 1.75+)
anyhow + thiserror 1 + 2 Standard error handling combo, correctly used
scraper 0.22 HTML parsing, adequate for the use case
rand 0.8 Note: 0.9 is available, but 0.8 is still widely used

No unnecessary or abandoned dependencies were identified. The dependency set is lean and focused.