You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

20 KiB

Raw Blame History

Rust Backend Audit Report

Date: 2026-03-26 Scope: backend/src/ -- all services, handlers, middleware, models, DB layer, and integration tests. Rust edition: 2021 | Key crates: axum 0.8, sqlx 0.8, tokio 1, reqwest 0.12, aes-gcm 0.10, dashmap 6

Executive Summary

The codebase is remarkably well-structured for a learning project. It demonstrates strong idiomatic Rust across most dimensions: proper error propagation with ?, thoughtful enum design (AppError), Arc-based shared state, and zero unsafe code. The architecture cleanly separates concerns (handlers / services / db / models), and the test suite covers all critical paths.

That said, the audit identifies three categories of findings:

Code quality issues (moderate priority): Significant code duplication in the synthesis pipeline, a potential panic in sanitize_error_message, and unnecessary allocations in hot paths.
Design opportunities (lower priority): Missing newtype patterns, opportunities to reduce Clone overhead via Arc<str>, and some functions with too many parameters that would benefit from builder/struct patterns.
Concurrency subtlety (low probability, worth noting): A TOCTOU race in JobStore::create_job between checking for existing jobs and inserting a new one.

No critical security issues were found. SQL injection is comprehensively prevented via sqlx parameterized queries. Secrets are never exposed in error responses. SSRF protection is thorough with a documented residual risk.

Idiomatic Rust

Ownership & Borrowing

Strengths:

MasterKey intentionally omits Clone to prevent key material duplication -- an excellent security-conscious design (encryption.rs:18).
Functions consistently accept &str / &[u8] over owned types where possible (e.g., encrypt(master_key: &MasterKey, plaintext: &str)).
The AuthUser extractor cleanly borrows from request parts and clones only the necessary fields.

Findings:

AppConfig holds secrets as String and derives Clone (config.rs:9-10). Every AppState clone copies the master_encryption_key and session_secret in full. Wrapping AppConfig in Arc<AppConfig> would eliminate these copies and reduce the blast radius of the key material in memory.
extract_session_token allocates a format! string on every cookie segment (middleware/auth.rs:40-41). The two format!("{}=", cookie_name) calls create heap-allocated strings for every iteration. These could be computed once before the iterator chain:
```
let prefix = format!("{}=", cookie_name);
cookie_header.split(';').map(|s| s.trim())
    .find(|s| s.starts_with(&prefix))
    .and_then(|s| s.strip_prefix(&prefix))
    .map(|s| s.to_string())
```
Excessive cloning in synthesis pipeline (synthesis.rs, ~59 .clone() calls). Many of these are necessary for spawned tasks, but several could be avoided:
- model_research.clone() is cloned into every spawned task. Wrapping it in Arc<str> would make clones zero-cost.
- classification_categories.clone() clones a Vec<String> per batch. Arc<Vec<String>> would be preferable.
- classify_schema.clone() clones a serde_json::Value per article. As an immutable value, this should be Arc<Value>.

Error Handling

Strengths:

AppError is a clean, well-designed enum with thiserror derivation and IntoResponse implementation.
Internal errors are properly logged but never exposed to clients (errors.rs:66-73).
The From<sqlx::Error> and From<anyhow::Error> conversions are correct and idiomatic.
.ok() is used appropriately for fire-and-forget operations like logging (synthesis.rs:953).

Findings:

msg[..200] in sanitize_error_message can panic on non-ASCII (synthesis.rs:1313). If the error message contains multi-byte UTF-8 characters and the 200th byte falls in the middle of a character, this will panic at runtime. Fix: use msg.chars().take(200).collect::<String>() or msg.char_indices() to find a safe boundary.
Selector::parse("a[href]").unwrap() in production code (source_scraper.rs:73,130). While Selector::parse with a static, known-valid selector string will never fail in practice, it would be more idiomatic to use lazy_static! or std::sync::LazyLock to parse the selector once and reuse it. This also avoids repeated parsing overhead.
unwrap_or("Unknown error") in error mappers (gemini.rs:156, openai.rs:163, anthropic.rs:199). These are safe because the value is only used for logging, but using unwrap_or_default() would be slightly more consistent with the rest of the codebase.

Iterator Patterns

Strengths:

Excellent use of iterator chains throughout: filter_map, find_map, chain, collect, take, any, position.
extract_links_from_html and extract_links_as_pairs are clean, functional-style implementations.
sanitize_json_null_bytes is an elegant recursive function using into_iter().map().collect().

Findings:

insert_urls in article_history.rs:78-87 executes individual INSERT queries in a loop. For bulk inserts, this should use a single query with UNNEST or build a multi-row INSERT, which would be both faster and more idiomatic:
```
// Instead of N individual queries:
sqlx::query("INSERT INTO ... SELECT * FROM UNNEST($1::uuid[], $2::text[], $3::text[]) ...")
```

Trait Usage

Strengths:

LlmProvider trait (llm/mod.rs:22-39) is well-designed: async_trait is correctly applied, Send + Sync bounds enable use behind Arc<dyn LlmProvider>.
FromRequestParts implementation for AuthUser and AdminUser is textbook Axum usage.
TryFrom<SettingsRow> for UserSettings cleanly separates DB row types from domain types.

Findings:

async_trait could be removed in favor of native async traits. Since Rust 1.75+ (stable), async traits work natively for dyn-compatible traits. The async_trait crate adds hidden Box::pin allocations on every call. However, the current edition is 2021, and async_trait is still the pragmatic choice for trait objects in some contexts. Worth revisiting when upgrading to edition 2024.

Async Patterns

Concurrency

Strengths:

JoinSet is used correctly for bounded concurrent scraping (synthesis.rs:296-372). The manual "seed initial + spawn on completion" pattern correctly limits concurrency to 5.
tokio::sync::watch is an excellent choice for SSE progress reporting -- exactly the right channel type for "latest value" semantics.
The _rx field in JobEntry (synthesis.rs:71) is a smart pattern: keeping a receiver alive prevents send() from returning an error.

Findings:

TOCTOU race in JobStore::create_job (synthesis.rs:108-145). The function iterates all entries to check for existing active jobs, then inserts a new entry. Between the check and the insert, another thread could create a job for the same user. While unlikely (generation is user-initiated and the check in trigger_generate provides a first-line defense), the DashMap entry API could enforce atomicity:
```
// Atomic check-and-insert would prevent the race
```
In practice this is low-risk because the handler also checks has_active_job before calling create_job, and the window is microseconds.
Spawned tasks silently swallow panics (generation.rs:73-75). The tokio::spawn for run_generation does not handle JoinError::Panic. If the generation pipeline panics (e.g., due to finding #4 above), the spawned task silently dies, and the SSE stream never sends a completion/error event. The client would hang indefinitely. Consider:
```
tokio::spawn(async move {
    let result = tokio::spawn(synthesis::run_generation(...)).await;
    if let Err(join_error) = result {
        tx.send(ProgressEvent::Error { message: "Internal error".into() }).ok();
    }
});
```

Error Propagation in Spawned Tasks

Strengths:

run_generation wraps run_generation_inner cleanly, converting Result into progress events (synthesis.rs:213-230).
Error sanitization prevents API key leakage in SSE error events (synthesis.rs:1291-1317).

Findings:

check_rate_limit uses a busy-wait retry loop with tokio::time::sleep (synthesis.rs:1029-1055). While the sleep prevents CPU spinning, the check() call consumes a rate limit slot even when the request is denied. On each loop iteration, check() is called, which records a timestamp if allowed but also modifies state. This means that repeated denied checks do not consume slots (only successful ones do), so this is actually correct -- but the code comment could be clearer about why check() is safe to call in a loop.

Channel Usage

Strengths:

watch::channel is the ideal choice for progress events: multiple subscribers, always the latest value, no backpressure needed.
WatchStream adapter in the SSE handler (generation.rs:113) correctly bridges the watch channel to an SSE stream.
Keepalive pings every 15s prevent proxy timeouts (generation.rs:126).

Type System

Missing Newtype Opportunities

User IDs, Job IDs, and Synthesis IDs are all Uuid throughout the codebase. A newtype pattern would prevent accidentally passing a user_id where a job_id is expected:
```
pub struct UserId(pub Uuid);
pub struct JobId(pub Uuid);
pub struct SynthesisId(pub Uuid);
```
The trace_article function takes 3 Uuid parameters (synthesis.rs:927-930), making it easy to swap user_id and job_id silently.
Provider names are bare String/&str everywhere. A ProviderName enum (Gemini | OpenAI | Anthropic) would eliminate the catch-all _ => branches in match statements and make invalid provider names a compile-time error rather than a runtime error.

Type Safety Gaps

RateLimitConfigRow casts i32 to u32/u64 without validation (rate_limiter.rs:188-189):
```
max_requests: row.max_requests as u32,
time_window: Duration::from_secs(row.time_window_seconds as u64),
```
Negative values from the database would silently wrap to large positive values. A TryFrom or explicit validation would be safer.
UserRateLimitEntry::new casts i32 to usize/u64 (app_state.rs:46-48) with the same wrapping risk for negative values.

Memory & Performance

Unnecessary Allocations

format!("category_{}", i) is computed repeatedly in inner loops (synthesis.rs:514-518, 720-725). These category keys are deterministic and could be pre-computed once into a Vec<String> before the main loop.
body_text.chars().take(500).collect::<String>() (synthesis.rs:467,676). This allocates a new string for each article. Using &body_text[..find_char_boundary(body_text, 500)] would avoid the allocation entirely (with a helper to find a safe UTF-8 boundary).
hash_article_url recomputes the hash multiple times for the same URL during history filtering (synthesis.rs:377-386). The hash is computed once in line 377 for the batch, then again in line 381 for each filtered URL. This is redundant -- the batch result could use a HashMap<String, String> (hash -> url) for O(1) lookups.
Selector::parse is called per-invocation in extract_page_title, detect_soft_404, detect_canonical_error, etc. (scraper.rs:325-410). These CSS selectors are constants that should be parsed once and cached. Using std::sync::LazyLock (stable in Rust 1.80) or once_cell::sync::Lazy:
```
static TITLE_SEL: LazyLock<Selector> = LazyLock::new(|| Selector::parse("title").unwrap());
```

Clone Efficiency

AppConfig is cloned when creating AppState (main.rs:59), copying all String fields including the master_encryption_key. Since AppConfig is effectively immutable after startup, wrapping it in Arc would be more appropriate.
settings.categories.clone() at synthesis.rs:261 clones a Vec<String> that is only used to build classification_categories. This could use a reference or be constructed more efficiently:
```
let classification_categories: Vec<&str> = settings.categories.iter()
    .map(|s| s.as_str())
    .chain(std::iter::once("Autre"))
    .collect();
```
(Though this would require adjusting downstream code that expects owned strings.)

Safety Assessment

Unwrap Usage

No unwrap() calls exist in production code paths (non-test, non-cfg(test) code), with two exceptions:

Selector::parse("a[href]").unwrap() in source_scraper.rs:73,130 -- These are static selectors that will never fail. Acceptable but could use LazyLock for clarity.
unwrap_or("") and unwrap_or("Unknown error") in LLM error mappers -- These are safe fallback patterns.

expect() usage is limited to:

main.rs:122,128 -- signal handler installation (correct; failure here is unrecoverable)
auth.rs:47,104 -- TimeDelta::try_minutes/try_days with constant values (correct; these are compile-time-knowable values)

Input Validation

Strengths:

Email validation via the email_address crate.
Turnstile captcha verification on auth endpoints.
CSRF protection via X-Requested-With header check.
URL scheme validation and SSRF prevention with comprehensive IPv4/IPv6 coverage.
API key prefix extraction prevents full key exposure.
Body size limits on scraper responses (MAX_BODY_SIZE = 5MB).

Findings:

No input length validation on LLM prompts. The build_search_prompt and build_article_classify_prompt functions accept arbitrary-length category names and theme strings from user settings. An extremely long theme string could lead to excessive LLM token usage. Consider capping settings.theme length at the handler/validation layer.
source_url is not validated against SSRF before fetching in source_scraper.rs:41-48. The extract_article_links function uses the shared http_client (which is the scraper client with SSRF-safe redirect handling), but the initial DNS resolution check (check_ssrf) is not performed before the GET request. The redirect policy catches private IPs on redirects, but the initial request to a private IP would succeed. The extract_article_links_with_llm function has the same issue.

SQL Safety

All SQL queries use parameterized bindings via sqlx ($1, $2, etc.). No string interpolation is used in any SQL query. The rate_limiter.rs:173-178 direct query is also properly parameterized.

sqlx compile-time query checking is not used (query_as::<_, Type>(...) with runtime string queries rather than sqlx::query! macro). This is a deliberate choice documented in CLAUDE.md ("runtime-checked queries"), but means SQL errors are caught at runtime rather than compile time.

Testing Assessment

Strengths:

Unit tests are comprehensive across all modules: errors.rs (7 tests), encryption.rs (8 tests), rate_limiter.rs (13 tests), scraper.rs (20+ tests), source_scraper.rs (12 tests), synthesis.rs (10+ tests), all LLM providers (8-10 tests each), schema.rs (10 tests), config.rs (7 tests).
Integration tests cover the full HTTP stack: auth flow, CSRF protection, settings, sources, admin, syntheses, API keys, export.
Security-specific tests: SSRF IP checks (14 tests), soft-404 detection (6+ tests), error message sanitization, internal error hiding.
Edge cases are well-covered: empty inputs, wrong keys, corrupted data, missing fields.

Findings:

No tests for the synthesis generation pipeline itself. The run_generation_inner function -- the most complex code in the project (~700 lines) -- has no unit or integration tests. Its helper functions are tested (parse_llm_output, normalize_article_url, rotate_sources, sanitize_json_null_bytes, sanitize_error_message), but the pipeline orchestration is untested. This is understandable given the complexity of mocking LLM providers and scraper results, but introducing a trait-based architecture for the scraper and DB layer would enable integration testing.
Integration tests skip silently when TEST_DATABASE_URL is not set. Each test starts with if !require_test_db() { return; }. This means CI must explicitly set the env var, and a misconfigured CI pipeline would report all tests as passing. Consider using #[cfg(feature = "integration")] or a custom test harness to make the requirement explicit.
No concurrent/stress tests for rate limiters. The RateLimiter and ProviderRateLimiter are tested for correctness but not for thread safety under contention. Given they use DashMap for lock-free access, a multi-threaded stress test would increase confidence.
Missing negative tests for create_provider with empty API key. The factory tests check unknown providers but not empty or whitespace-only API keys.

Prioritized Recommendations

High Priority (correctness/safety)

#	Finding	File	Fix
4	`msg[..200]` panics on multi-byte UTF-8	`synthesis.rs:1313`	Use `msg.chars().take(200)` or `char_indices()`
10	Spawned generation task swallows panics	`generation.rs:73-75`	Wrap in double-spawn or catch `JoinError`
25	Missing SSRF check on initial source page fetch	`source_scraper.rs:41`	Call `check_ssrf` before `http_client.get()`

Medium Priority (performance/maintainability)

#	Finding	File	Fix
1	`AppConfig` cloned with secrets	`config.rs`, `main.rs:59`	Wrap in `Arc<AppConfig>`
3	Excessive cloning in synthesis pipeline	`synthesis.rs` (59 clones)	Use `Arc<str>`, `Arc<Value>`, `Arc<Vec<String>>` for shared immutables
7	N+1 individual INSERTs for article history	`article_history.rs:78-87`	Use batch INSERT or `UNNEST`
19	`Selector::parse` called per-invocation	`scraper.rs`	Use `LazyLock` for static selectors
26	No tests for generation pipeline	`synthesis.rs`	Introduce trait-based mocks for scraper/LLM/DB

Low Priority (code quality/future-proofing)

#	Finding	File	Fix
2	`format!` allocation on every cookie segment	`middleware/auth.rs:40-41`	Pre-compute prefix string
9	TOCTOU race in `JobStore::create_job`	`synthesis.rs:108-145`	Use DashMap entry API for atomic check-and-insert
12	UUID parameters lack type distinction	Throughout	Introduce newtype wrappers
13	Provider names are bare strings	Throughout	Use `ProviderName` enum
14	`i32` to `u32` cast without validation	`rate_limiter.rs:188`	Use `TryFrom` or add DB constraint
16	Repeated `format!("category_{}", i)`	`synthesis.rs`	Pre-compute category keys
24	No length validation on LLM prompt inputs	Settings/synthesis	Cap `theme` and category name lengths

Code Duplication Worth Addressing

The classify/summarize batch processing logic appears twice in synthesis.rs: once for Phase 1 (personalized sources, lines ~458-541) and once for Phase 2 Brave search path (lines ~667-749). These are nearly identical blocks of ~80 lines each. Extracting a shared function like classify_batch(articles, provider, model, categories, ...) would significantly reduce the maintenance burden and risk of the two paths diverging.

Appendix: Dependency Assessment

Crate	Version	Assessment
`axum`	0.8	Current, well-maintained
`sqlx`	0.8	Current, runtime-checked queries are a deliberate choice
`tokio`	1	Standard async runtime, full features enabled
`reqwest`	0.12	Current
`aes-gcm`	0.10	Current, correct choice for authenticated encryption
`zeroize`	1	Correctly applied to `MasterKey`
`dashmap`	6	Good choice for concurrent maps, correctly used
`async-trait`	0.1	Could be replaced by native async traits (Rust 1.75+)
`anyhow` + `thiserror`	1 + 2	Standard error handling combo, correctly used
`scraper`	0.22	HTML parsing, adequate for the use case
`rand`	0.8	Note: 0.9 is available, but 0.8 is still widely used

No unnecessary or abandoned dependencies were identified. The dependency set is lean and focused.

20 KiB Raw Blame History

Rust Backend Audit Report

Executive Summary

Idiomatic Rust

Ownership & Borrowing

Error Handling

Iterator Patterns

Trait Usage

Async Patterns

Concurrency

Error Propagation in Spawned Tasks

Channel Usage

Type System

Missing Newtype Opportunities

Type Safety Gaps

Memory & Performance

Unnecessary Allocations

Clone Efficiency

Safety Assessment

Unwrap Usage

Input Validation

SQL Safety

Testing Assessment

Prioritized Recommendations

High Priority (correctness/safety)

Medium Priority (performance/maintainability)

Low Priority (code quality/future-proofing)

Code Duplication Worth Addressing

Appendix: Dependency Assessment

20 KiB

Raw Blame History