You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
ai_synth/docs/audit/00-consolidated-summary.md

6.0 KiB

Code Audit — Consolidated Summary

Date: 2026-03-26 Auditors: Software Architect, Frontend Expert, Rust Expert, Tech Lead, Devil's Advocate

Overall Assessment

The codebase is well-structured with strong fundamentals: clean layer separation, zero unsafe code, zero unwrap() in production paths, comprehensive security measures (AES-256-GCM encryption, SSRF prevention, anti-enumeration timing), and good test coverage (363+ unit tests, 9 integration test files). The code is a learning project that has evolved significantly through rapid iteration.

The main issues cluster around complexity in the synthesis pipeline and accumulated duplication from feature layering.


P0 — Fix Immediately

These are bugs that can crash or corrupt the application.

# Issue Source Files
1 UTF-8 panic: &msg[..200] panics on multi-byte chars Rust Expert, Devil's Advocate synthesis.rs:1313
2 Race condition in job creation: double-click creates duplicate jobs (DashMap check + insert not atomic) Devil's Advocate synthesis.rs (job management)
3 XSS via innerHTML: renders user-controlled theme unsafely Frontend Expert GenerateSynthesis.tsx

P1 — Fix Within 30 Days

# Issue Source Files
4 Expired sessions never cleaned up: sessions::delete_expired() exists but is never called Devil's Advocate startup / scheduled task
5 No pipeline timeout: hung LLM call blocks generation slot indefinitely Devil's Advocate synthesis.rs
6 SSRF bypass via IPv4-mapped IPv6: ::ffff:127.0.0.1 passes all current checks Rust Expert, Devil's Advocate scraper.rs
7 Missing SSRF check on initial source page fetch Rust Expert source_scraper.rs
8 Spawned generation tasks swallow panics: SSE clients hang forever Rust Expert generation handler
9 Missing onCleanup for setTimeout: timers fire after component unmount Frontend Expert Home.tsx, ArticleHistory.tsx, SynthesisDetail.tsx

Structural Refactoring — High Priority

These are the main maintainability issues, confirmed by 3+ auditors.

1. Decompose synthesis.rs (1,686 lines)

Confirmed by: Architect, Tech Lead, Rust Expert, Devil's Advocate

The run_generation_inner function is ~650 lines with the scrape+classify batch loop duplicated 3 times (Phase 1, Phase 2 Brave, Phase 2 LLM). Article filtering (homepage, dedup, history, diversity) is also duplicated across paths.

Refactoring plan:

  • Extract process_article_batch() — shared scrape+classify loop (~150 lines saved)
  • Extract filter_candidate_url() — shared filtering logic
  • Extract assign_category() — shared category assignment + overflow logic
  • Consider splitting into synthesis/pipeline.rs, synthesis/phases.rs, synthesis/helpers.rs

Effort: Large | Impact: Critical

2. Eliminate SettingsResponse struct

Confirmed by: Tech Lead

SettingsResponse is identical to UserSettings minus user_id and updated_at. Every new setting requires changes to 4 Rust structs + SQL queries + frontend types. Serialize UserSettings directly with #[serde(skip)] on internal fields.

Effort: Small | Impact: High (reduces per-setting change cost)

3. Decompose Settings.tsx (1,025 lines)

Confirmed by: Frontend Expert, Tech Lead

Split into sub-components: SettingsGeneral, SettingsProvider, SettingsAdvanced, SettingsBraveSearch, SettingsRateLimit.

Effort: Medium | Impact: High

4. Extract shared LLM error mapping

Confirmed by: Tech Lead

map_openai_error, map_gemini_error, map_anthropic_error are near-identical. Extract to a shared map_http_error function.

Effort: Small | Impact: Medium

5. Replace trace_article 11-parameter function with struct

Confirmed by: Tech Lead, Rust Expert

struct ArticleTrace { url, title, source_type, source_url, category, synthesis_id, status, scraped_ok }

Effort: Small | Impact: Medium


Medium Priority Improvements

# Issue Source
10 N+1 individual INSERTs for article history — batch with unnest() Rust Expert
11 59 .clone() calls in pipeline — use Arc<str> for shared immutables Rust Expert
12 CSS selectors re-parsed every call — use LazyLock Rust Expert
13 Inconsistent data fetching in frontend (createResource vs onMount + signals) Frontend Expert
14 Reusable Button component exists but unused — inline Tailwind duplicated Frontend Expert
15 Default settings diverge between frontend and backend Tech Lead
16 AppConfig carries secrets in cloneable String — use Arc-wrapped Rust Expert, Devil's Advocate
17 In-memory rate limits reset on restart Devil's Advocate
18 SESSION_SECRET env var validated but never used Devil's Advocate

Positive Highlights

All auditors noted these strengths:

  • Zero unsafe code, zero unwrap() in production
  • Strong security posture: AES-256-GCM with zeroized keys, SSRF prevention (14 tests), anti-enumeration timing, parameterized SQL
  • Clean error handling: AppError enum with proper HTTP mapping, internal errors never leaked
  • Well-designed LlmProvider trait: clean abstraction over 3 providers
  • Comprehensive test suite: good coverage on scraper, LLM providers, rate limiter, encryption
  • Excellent module documentation: every module has //! doc comments

Detailed Reports