AI Weekly Synth -- Architecture Audit Report (v2)
Date: 2026-03-27
Scope: Full backend codebase (Rust/Axum), key frontend architecture observations
Auditor: Software Architect (automated)
Executive Summary
AI Weekly Synth is a well-structured Rust/Axum application that has grown substantially from its initial design. The codebase demonstrates strong fundamentals: consistent error handling, good security practices, clean layer separation between handlers/services/db, and idiomatic use of Axum extractors. Unit test coverage is solid across models, services, and middleware.
However, the growth -- particularly the addition of multi-theme synthesis, scheduled generation, Brave Search, windowed source extraction, and article history tracing -- has introduced several architectural tensions. The synthesis pipeline (synthesis.rs) has become a 1500+ line monolith carrying at least six distinct responsibilities. The scheduler bypasses the job store abstraction. Several cross-cutting concerns (provider resolution, rate limiting, history tracking) are tightly coupled to concrete implementations, making the system harder to test, extend, and reason about.
This report organizes findings by SOLID principles, design patterns, architecture, and dependency management, then closes with prioritized recommendations.
1. SOLID Principles
1.1 Single Responsibility Principle (SRP)
Critical: synthesis.rs is a God Module
At 1500+ lines, services/synthesis.rs carries at least six distinct responsibilities:
| Responsibility | Lines (approx.) | Should be |
|---|---|---|
| Job store (in-memory concurrent map) | 1-193 | Own module services/job_store.rs |
| Progress event types + emission | 37-71, 1063-1071 | Own module or part of job store |
| Pipeline orchestration (phases 1 + 2 + save) | 200-1038 | services/pipeline.rs or services/generation/mod.rs |
| Article scraping + classification logic | 471-616, 700-880 | services/article_processor.rs |
| URL filtering, normalization, hashing | 1255-1339 | services/url_utils.rs |
| Provider/model/key resolution | 1342-1446 | services/provider_resolver.rs |
The run_generation_inner function alone is ~840 lines. It manages five HashMap/HashSet tracking structures, two nested loop levels (waves and batches), two separate pipeline phases (personalized sources and web search fallback), and three code paths in Phase 2 (Brave Search, LLM search, skip). This makes the function extremely difficult to test in isolation, review for correctness, or extend with new pipeline stages.
Moderate: scheduler.rs duplicates pipeline invocation logic
The scheduler constructs its own watch::channel and AtomicBool, calls run_generation_inner directly, and handles email sending inline. It bypasses the JobStore entirely, which means:
- Scheduled jobs are invisible to the SSE progress API
- The one-job-per-user guard does not apply (it only checks job_store.has_active_job)
- Email sending logic (fetch synthesis, iterate recipients, call email::send_synthesis_email) is duplicated -- the handler version is in the syntheses.rs handler
Moderate: AppState accumulates responsibilities
AppState holds configuration, database pool, HTTP client, auth rate limiter, provider rate limiter, per-user rate limiters, and the job store. While Clone-cheap (all Arc-based), it acts as a service locator, making it unclear which components a given handler actually depends on. With 8 fields, this is approaching the point where injecting specific dependencies would improve clarity.
Minor: Handler-level response types
Some response types are defined in handlers (AdminUserResponse in handlers/admin.rs, GenerateResponse in handlers/generation.rs, ListResponse in handlers/syntheses.rs) while others are in models. This inconsistency is minor but creates ambiguity about where to look for types.
1.2 Open/Closed Principle (OCP)
Well-applied: LLM Provider abstraction
The LlmProvider trait + factory pattern is the cleanest abstraction in the codebase. Adding a new provider (e.g., Mistral) requires:
- A new module implementing LlmProvider
- A new match arm in factory.rs
- No changes to the pipeline
This is textbook OCP.
Violation: Pipeline Phase 2 branching
Phase 2 of the pipeline has a hard-coded if settings.use_brave_search { ... } else { ... } branch that selects between two entirely different code paths (Brave Search vs. LLM web search). Each path contains ~150 lines of nearly identical scrape+classify logic. Adding a third search strategy (e.g., Bing, Perplexity, SearXNG) would require another else if branch with the same duplicated scrape/classify logic.
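One way to remove the branch is a strategy trait with one implementation per search backend. The sketch below is a minimal, synchronous illustration (the real pipeline is async). `SearchHit`, `FixedResults`, `run_phase_two`, and the trait's method signatures are assumptions made for the example, not names taken from the codebase.

```rust
/// A search hit returned by any backend (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct SearchHit {
    pub url: String,
    pub title: String,
}

/// Hypothetical strategy trait: one implementation per search backend.
pub trait SearchStrategy {
    /// Label recorded as the article's source type.
    fn source_type(&self) -> &'static str;
    /// Return at most `max_results` hits for `query`.
    fn search(&self, query: &str, max_results: usize) -> Result<Vec<SearchHit>, String>;
}

/// Stand-in backend that returns canned hits; a real implementation
/// would call the Brave API or the LLM web search instead.
pub struct FixedResults {
    pub label: &'static str,
    pub hits: Vec<SearchHit>,
}

impl SearchStrategy for FixedResults {
    fn source_type(&self) -> &'static str {
        self.label
    }

    fn search(&self, _query: &str, max_results: usize) -> Result<Vec<SearchHit>, String> {
        Ok(self.hits.iter().take(max_results).cloned().collect())
    }
}

/// Phase 2 written once against the trait, with no branch on the backend.
/// Returns (url, source_type) pairs ready for the shared scrape/classify step.
pub fn run_phase_two(
    strategy: &dyn SearchStrategy,
    query: &str,
    max_results: usize,
) -> Result<Vec<(String, &'static str)>, String> {
    let hits = strategy.search(query, max_results)?;
    Ok(hits
        .into_iter()
        .map(|hit| (hit.url, strategy.source_type()))
        .collect())
}
```

With this shape, a third backend becomes a new `impl SearchStrategy` and the Phase 2 body stays untouched.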
Violation: Provider resolution fallback defaults
resolve_model contains hard-coded fallback model names ("gemini-2.5-pro", "gpt-4o", "claude-sonnet-4-20250514"). These will silently become stale as providers release new models. The fallback chain should be configurable or fail loudly.
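A "fail loudly" variant could look like the sketch below: the user's explicit choice wins, then a default loaded from configuration, and otherwise an error rather than a baked-in model name. The signature, the `ModelResolutionError` type, and the idea of a provider-to-model map are assumptions for illustration; the actual `resolve_model` signature is not shown in this report.

```rust
use std::collections::HashMap;

/// Error returned when no model can be determined (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ModelResolutionError {
    NoModelConfigured { provider: String },
}

/// Resolve the model name for a provider.
/// Order: explicit user choice, then configured default, then an error.
/// No model names are hard-coded here; defaults come from configuration.
pub fn resolve_model(
    provider: &str,
    user_choice: Option<&str>,
    configured_defaults: &HashMap<String, String>,
) -> Result<String, ModelResolutionError> {
    // A blank or whitespace-only choice is treated as "not set".
    if let Some(model) = user_choice.map(str::trim).filter(|m| !m.is_empty()) {
        return Ok(model.to_string());
    }
    configured_defaults
        .get(provider)
        .cloned()
        .ok_or_else(|| ModelResolutionError::NoModelConfigured {
            provider: provider.to_string(),
        })
}
```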
1.3 Liskov Substitution Principle (LSP)
Generally well-respected. The LlmProvider trait implementations are fully substitutable. The MockLlmProvider correctly implements the same interface and is used in tests via the provider_override parameter.
Minor concern: The mock provider identifies call types by inspecting the system prompt content (sys_lower.contains("classer"), sys_lower.contains("precis")). This couples the mock to the French-language prompt wording, making it fragile if prompts change.
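One possible way to decouple the mock from prompt wording is to tag each request with an explicit call kind. The sketch below assumes a hypothetical `CallKind` enum and `LlmRequest` struct. The real `call_llm` signature is not shown in this report, and the JSON payloads are placeholders.

```rust
/// Explicit call type carried with each request (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallKind {
    Classify,
    Search,
}

/// Hypothetical request shape; the prompts are still passed through,
/// but the mock no longer needs to inspect them.
pub struct LlmRequest<'a> {
    pub kind: CallKind,
    pub system_prompt: &'a str,
    pub user_prompt: &'a str,
}

pub struct MockProvider;

impl MockProvider {
    /// Chooses the canned response from `kind` only, so rewording or
    /// translating the prompts cannot change which response is returned.
    pub fn call_llm(&self, req: &LlmRequest<'_>) -> String {
        match req.kind {
            CallKind::Classify => r#"{"is_article":true}"#.to_string(),
            CallKind::Search => r#"{"results":[]}"#.to_string(),
        }
    }
}
```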
1.4 Interface Segregation Principle (ISP)
Well-applied: Axum extractors
AuthUser and AdminUser are clean, focused extractors. Handlers declare exactly the auth level they need. The AdminUser wrapper pattern (newtype over AuthUser) is idiomatic and minimal.
Opportunity: LlmProvider trait could be narrower
The current trait has two methods: provider_id() and call_llm(). If the codebase later needs streaming, embedding, or tool-calling capabilities, the trait should be split rather than extended, per ISP.
1.5 Dependency Inversion Principle (DIP)
Critical: Pipeline depends on concrete implementations
run_generation_inner directly calls:
- db::settings::get_or_create_default (concrete DB queries)
- db::themes::get_by_id (concrete DB queries)
- db::sources::list_for_user (concrete DB queries)
- db::article_history::check_urls_exist (concrete DB queries)
- db::article_history::batch_insert_entries (concrete DB queries)
- crate::services::scraper::scrape_url (concrete HTTP scraping)
- source_scraper::extract_article_links (concrete link extraction)
- crate::services::brave_search::search (concrete Brave API)
- encryption::decrypt / encryption::MasterKey::from_hex (concrete crypto)
None of these are injected as traits. The provider_override parameter for LlmProvider is the only dependency that can be swapped -- and it was added specifically for testing. This makes the pipeline untestable without a live Postgres database and network access.
Moderate: ProviderRateLimiter embeds its own SQL
The ProviderRateLimiter::reload_from_db method contains raw sqlx::query_as calls rather than going through the db::rate_limits module. The comment says "to avoid circular dependency," but this violates the layer boundary and duplicates the DB schema knowledge.
2. Design Patterns
2.1 Well-Applied Patterns
| Pattern | Where | Assessment |
|---|---|---|
| Factory Method | llm/factory.rs | Clean, tested, extensible |
| Strategy | LlmProvider trait | Proper polymorphism via Arc<dyn LlmProvider> |
| Observer | watch::channel for SSE | Elegant use of tokio primitives; late subscribers get latest state |
| Repository | db/ modules | Clean separation of SQL from business logic |
| Extractor | AuthUser, AdminUser | Idiomatic Axum; composable auth |
| Builder | AppState::new, build_scraper_client | Consistent construction patterns |
| Newtype | AdminUser(AuthUser), MasterKey | Type safety for authorization and crypto |
2.2 Missing or Needed Patterns
Pipeline / Chain of Responsibility
The synthesis generation is conceptually a pipeline with discrete stages:
- Load settings + theme + sources
- Phase 1: Source extraction (windowed)
- Phase 1: Scrape + classify (batched)
- Phase 2: Web search fallback (Brave or LLM)
- Phase 2: Scrape + classify fallback results
- Assemble sections + save
Each stage could be a separate struct implementing a PipelineStage trait, with shared context passed through. This would make the pipeline testable per-stage, enable adding/removing stages declaratively, and reduce run_generation_inner from 840 lines to ~50.
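The staged design can be sketched as follows. This is a synchronous, dependency-free illustration only. `PipelineContext`, its fields, and the two toy stages (`LoadCandidates`, `KeepHttps`) are invented for the example, and the real stages would be async and carry the pipeline's actual state.

```rust
/// Shared state handed from stage to stage (fields are illustrative).
#[derive(Debug, Default)]
pub struct PipelineContext {
    pub candidate_urls: Vec<String>,
    pub accepted_urls: Vec<String>,
    pub completed_stages: Vec<String>,
}

/// One discrete step of the generation pipeline.
pub trait PipelineStage {
    fn name(&self) -> &'static str;
    fn run(&self, ctx: &mut PipelineContext) -> Result<(), String>;
}

/// Toy stage: seeds the context with candidate URLs.
pub struct LoadCandidates(pub Vec<String>);

impl PipelineStage for LoadCandidates {
    fn name(&self) -> &'static str {
        "load_candidates"
    }
    fn run(&self, ctx: &mut PipelineContext) -> Result<(), String> {
        ctx.candidate_urls.extend(self.0.iter().cloned());
        Ok(())
    }
}

/// Toy stage: moves https candidates to the accepted list, drops the rest.
pub struct KeepHttps;

impl PipelineStage for KeepHttps {
    fn name(&self) -> &'static str {
        "keep_https"
    }
    fn run(&self, ctx: &mut PipelineContext) -> Result<(), String> {
        let candidates = std::mem::take(&mut ctx.candidate_urls);
        ctx.accepted_urls
            .extend(candidates.into_iter().filter(|u| u.starts_with("https://")));
        Ok(())
    }
}

/// The orchestrator shrinks to a loop; the first failing stage aborts the
/// run and its name is attached to the error.
pub fn run_pipeline(
    stages: &[Box<dyn PipelineStage>],
    ctx: &mut PipelineContext,
) -> Result<(), String> {
    for stage in stages {
        stage
            .run(ctx)
            .map_err(|e| format!("{}: {}", stage.name(), e))?;
        ctx.completed_stages.push(stage.name().to_string());
    }
    Ok(())
}
```

Each stage can then be unit-tested by building a `PipelineContext` by hand and calling `run` on that stage alone.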
Unit of Work / Transaction Manager
Article history tracing uses a manual pending_traces buffer that is flushed at "logical boundaries." This ad-hoc batching is scattered across 7 locations in the pipeline. A dedicated TraceBatcher struct could encapsulate the buffer, auto-flush thresholds, and error handling.
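A minimal sketch of such a batcher is shown below. It is generic over the entry type and takes the write operation as a closure, so the example stays synchronous and free of database dependencies. In the real pipeline the sink would presumably wrap the async `batch_insert_entries` call. The method names and the `eprintln!` stand-in for structured logging are assumptions.

```rust
/// Buffers trace entries and flushes them in batches (hypothetical design).
pub struct TraceBatcher<T, F>
where
    F: FnMut(Vec<T>) -> Result<(), String>,
{
    pending: Vec<T>,
    threshold: usize,
    sink: F,
    failed_flushes: usize,
}

impl<T, F> TraceBatcher<T, F>
where
    F: FnMut(Vec<T>) -> Result<(), String>,
{
    /// `threshold` is the buffer size that triggers an automatic flush
    /// (clamped to at least 1).
    pub fn new(threshold: usize, sink: F) -> Self {
        Self {
            pending: Vec::new(),
            threshold: threshold.max(1),
            sink,
            failed_flushes: 0,
        }
    }

    /// Queue one entry; flushes automatically once the threshold is reached.
    pub fn push(&mut self, entry: T) {
        self.pending.push(entry);
        if self.pending.len() >= self.threshold {
            self.flush();
        }
    }

    /// Write out everything buffered so far. Failures are counted and
    /// reported instead of being silently dropped; the failed batch is
    /// not retried in this sketch.
    pub fn flush(&mut self) {
        if self.pending.is_empty() {
            return;
        }
        let batch = std::mem::take(&mut self.pending);
        if let Err(err) = (self.sink)(batch) {
            self.failed_flushes += 1;
            // Stand-in for a structured warning log.
            eprintln!("trace flush failed: {}", err);
        }
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    pub fn failed_flushes(&self) -> usize {
        self.failed_flushes
    }
}
```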
Event Sourcing (lightweight)
The ProgressEvent enum is close to an event-sourced model but is currently fire-and-forget via watch::channel (which only retains the latest value). If the system needs progress history for debugging or UI replay, the events should be collected in a log alongside the watch channel.
2.3 Anti-Patterns
Copy-Paste Programming (Critical)
The scrape+classify logic appears nearly identically in three places:
- Phase 1 source processing (lines ~470-616)
- Phase 2 Brave Search processing (lines ~700-880)
- Phase 2 LLM search validation (lines ~936-960, simpler variant)
Each instance: spawns a JoinSet for scraping, collects results, checks rate limits, spawns a JoinSet for classification, parses LLM responses, checks is_article, extracts dates, assigns categories, updates tracking maps. The only differences are: which source_type string is recorded and whether source_url is Some.
A single scrape_and_classify_batch function parameterized by source type would eliminate ~300 lines of duplication.
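The shape of such a function might resemble the sketch below. It runs sequentially and takes the scrape and classify steps as closures so that it compiles without the project's types. The real version would use JoinSet, rate limiting, and the tracking maps. `ClassifiedArticle` and both closure signatures are assumptions.

```rust
/// An article that survived scraping and classification (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct ClassifiedArticle {
    pub url: String,
    pub category: String,
    pub source_type: &'static str,
    pub source_url: Option<String>,
}

/// Shared scrape + classify step, parameterized by the two values that
/// differ between the three call sites: `source_type` and `source_url`.
///
/// `scrape` returns the page body, or None if nothing usable was fetched.
/// `classify` returns a category key, or None if the page is not an article.
pub fn scrape_and_classify_batch<S, C>(
    urls: &[String],
    source_type: &'static str,
    source_url: Option<&str>,
    scrape: S,
    classify: C,
) -> Vec<ClassifiedArticle>
where
    S: Fn(&str) -> Option<String>,
    C: Fn(&str) -> Option<String>,
{
    urls.iter()
        .filter_map(|url| {
            let body = scrape(url)?;
            let category = classify(&body)?;
            Some(ClassifiedArticle {
                url: url.clone(),
                category,
                source_type,
                source_url: source_url.map(str::to_string),
            })
        })
        .collect()
}
```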
Primitive Obsession
The pipeline keeps its tracking state in six loose variables: five HashMap/HashSet collections (article_scraped, source_counts, url_source, filled_counts, seen_urls) and the pending_traces Vec. Together these represent a coherent concept ("pipeline context" or "generation state") and should be bundled into a struct:
struct GenerationContext {
    articles_by_category: HashMap<String, Vec<NewsItem>>,
    source_domain_counts: HashMap<String, usize>,
    url_to_source: HashMap<String, String>,
    category_fill_counts: HashMap<String, usize>,
    seen_urls: HashSet<String>,
    pending_traces: Vec<ArticleHistoryEntry>,
}
Magic Strings
Category keys like "category_0", "category_autre", "category_no_date", and status strings like "filtered_history", "filtered_diversity", "filtered_not_article", "filtered_too_old", "filtered_empty", "filtered_homepage", "filtered_cross_phase_dedup", "used" are scattered as string literals. These should be constants or an enum.
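For the status strings, an enum with a single string mapping is one option. The sketch below uses exactly the status values listed above. The enum name, variant names, and helper methods are illustrative.

```rust
/// Article history status, replacing scattered string literals.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArticleStatus {
    Used,
    FilteredHistory,
    FilteredDiversity,
    FilteredNotArticle,
    FilteredTooOld,
    FilteredEmpty,
    FilteredHomepage,
    FilteredCrossPhaseDedup,
}

impl ArticleStatus {
    /// Every variant, used for parsing and exhaustive tests.
    pub const ALL: [ArticleStatus; 8] = [
        ArticleStatus::Used,
        ArticleStatus::FilteredHistory,
        ArticleStatus::FilteredDiversity,
        ArticleStatus::FilteredNotArticle,
        ArticleStatus::FilteredTooOld,
        ArticleStatus::FilteredEmpty,
        ArticleStatus::FilteredHomepage,
        ArticleStatus::FilteredCrossPhaseDedup,
    ];

    /// The string stored in the database; the only place the literals live.
    pub fn as_str(self) -> &'static str {
        match self {
            ArticleStatus::Used => "used",
            ArticleStatus::FilteredHistory => "filtered_history",
            ArticleStatus::FilteredDiversity => "filtered_diversity",
            ArticleStatus::FilteredNotArticle => "filtered_not_article",
            ArticleStatus::FilteredTooOld => "filtered_too_old",
            ArticleStatus::FilteredEmpty => "filtered_empty",
            ArticleStatus::FilteredHomepage => "filtered_homepage",
            ArticleStatus::FilteredCrossPhaseDedup => "filtered_cross_phase_dedup",
        }
    }

    /// Parse a stored string back into a status; None for unknown values.
    pub fn parse(s: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|status| status.as_str() == s)
    }
}
```

Because `as_str` is an exhaustive match, adding a variant without a string is a compile error, which a literal typo never is.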
Stringly-Typed Configuration
Several settings use strings where enums would be safer:
- settings.ai_provider (could be enum Provider { Gemini, OpenAi, Anthropic })
- settings.search_agent_behavior (free-form, but could at least validate non-HTML)
- synthesis.status (always "completed" in the codebase, but stored as String)
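As an illustration of the first item, the provider setting could be parsed once at the boundary into an enum. The stored string values ("gemini", "openai", "anthropic") and the `UnknownProvider` error type are assumptions, since the report does not show what is persisted.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Gemini,
    OpenAi,
    Anthropic,
}

/// Returned when the stored value names no known provider.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UnknownProvider(pub String);

impl fmt::Display for UnknownProvider {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown AI provider: {}", self.0)
    }
}

impl FromStr for Provider {
    type Err = UnknownProvider;

    /// Case-insensitive parse; the accepted spellings are assumed values.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "gemini" => Ok(Provider::Gemini),
            "openai" => Ok(Provider::OpenAi),
            "anthropic" => Ok(Provider::Anthropic),
            other => Err(UnknownProvider(other.to_string())),
        }
    }
}
```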
3. Architecture
3.1 Layer Separation
The codebase follows a three-layer architecture:
handlers/ (HTTP layer) --> services/ (business logic) --> db/ (data access)
|
models/ (shared types)
Assessment: Good but with leaks.
- Handlers are thin and focused. They validate input, call services/db, and format responses. This is excellent.
- The db/ layer is clean -- pure SQL queries returning typed results. No business logic leaks into SQL.
- The services/ layer is where responsibilities blur. synthesis.rs calls db:: modules directly (bypassing any service abstraction), constructs its own SQL in resolve_model, and embeds scraping/classification logic that could be separate services.
Concern: The scheduler sits at the service layer but orchestrates at the handler level
The scheduler calls synthesis::run_generation_inner (a service) but also does email sending (another service), DB fetching (data layer), and schedule marking (data layer) all inline. It should either be a handler (if it needs to compose services) or delegate to a higher-level "generation + notification" service.
3.2 Error Handling
Strengths:
- Unified AppError enum with IntoResponse -- all errors produce consistent JSON
- Internal errors log full details but return generic messages to clients (security-conscious)
- From<sqlx::Error> and From<anyhow::Error> conversions are clean
- Error sanitization in sanitize_error_message prevents API key leakage
Weaknesses:
- Errors are silently swallowed in multiple places via .ok():
  - db::article_history::batch_insert_entries(...).await.ok() (7 occurrences) -- if tracing fails, there is no indication
  - db::llm_call_log::truncate_old(...).await.ok() -- cleanup failure is invisible
  - db::schedules::mark_run(...).await.ok() -- if this fails, the schedule may fire again next minute
- unwrap_or_default() on serde_json::from_value calls silently drops malformed data (e.g., theme.categories deserialization). A warning log would be more appropriate.
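A small helper can keep the call sites as short as `.ok()` while still surfacing the failure. In the sketch below, `warn_on_err` is a hypothetical name and `eprintln!` stands in for a `tracing::warn!` call so the example has no dependencies.

```rust
use std::fmt::Display;

/// Like `Result::ok`, but reports the error before discarding it.
/// `context` names the operation so the log line is actionable.
pub fn warn_on_err<T, E: Display>(context: &str, result: Result<T, E>) -> Option<T> {
    match result {
        Ok(value) => Some(value),
        Err(err) => {
            // Stand-in for structured logging.
            eprintln!("WARN {} failed: {}", context, err);
            None
        }
    }
}
```

A call site would then read `warn_on_err("article history insert", insert_result)` in place of `insert_result.ok()`.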
3.3 State Management
In-memory state:
- JobStore (DashMap-based) -- well-designed with TTL cleanup, user locking, and cancellation support
- RateLimiter / ProviderRateLimiter -- properly Arc-wrapped for Clone-cheap sharing
- user_rate_limiters: DashMap<Uuid, UserRateLimitEntry> -- handles setting changes atomically
Concern: No persistence for job state
If the server restarts during a generation, the in-memory job is lost with no way to recover. For a self-hosted single-instance app this may be acceptable, but if resilience is a goal, the job state should be backed by the database.
Concern: Scheduled job email state is fire-and-forget
The scheduler sends emails to up to 3 recipients per schedule. If one email fails, the others still send, but there is no retry or notification mechanism. mark_run is called unconditionally after a successful generation, even if all emails failed.
3.4 Concurrency Model
Strengths:
- Proper use of tokio::task::JoinSet for parallel scraping and classification
- DashMap + DashSet for low-contention concurrent access to shared state (sharded locking rather than a single global lock)
- AtomicBool for cooperative cancellation (avoids mutex overhead)
- watch::channel for fan-out progress notifications
Weakness: Global rate limiter shared across scheduled + manual jobs
The ProviderRateLimiter is global. A scheduled job and a manual job for different users hitting the same provider share the same bucket. Under load, scheduled jobs could starve manual users (or vice versa). The architecture should consider per-user-or-per-job rate tracking for fairness.
3.5 Security Architecture
Strengths:
- AES-256-GCM encryption for API keys at rest with per-key nonces
- SSRF prevention in both scraper.rs and source_scraper.rs (IP allowlist checking, redirect validation)
- CSRF protection via X-Requested-With header check on all mutating API endpoints
- Session cookies are HttpOnly, SameSite=Lax, optionally Secure
- Anti-enumeration in auth (same response for existent/non-existent emails, timing attack mitigation)
- Error sanitization prevents API key leakage in SSE error events
- CSP, X-Frame-Options, HSTS, Referrer-Policy headers
Concern: Gemini API key in URL
The GeminiProvider constructs the API URL as ...?key={api_key}. While the error handler carefully avoids logging the full URL, the key is still in the URL query string. This means:
- It may appear in HTTP access logs on intermediary proxies
- reqwest may include it in error messages despite the kind-only logging
- If tracing is set to DEBUG level, the URL may be logged by tower-http's TraceLayer
This is a known Gemini API design limitation, but the risk should be documented.
4. Dependency Management and Testability
4.1 Test Architecture
Strengths:
- Unit tests for all model validation logic (settings, theme, schedule, source, synthesis)
- Unit tests for error handling, rate limiting, URL normalization, link extraction
- Mock LLM provider enables end-to-end pipeline testing without real API calls
- Factory tests verify correct provider instantiation
Weaknesses:
- The core pipeline (run_generation_inner) cannot be unit-tested. It requires:
  - A live PgPool (for all db:: calls)
  - A real AppState (for config, rate limiters, job store)
  - Network access (for scraping via http_client)
  - Only the LLM provider can be mocked (via provider_override)
- No integration tests for the scheduler
- No tests for the Brave Search integration path
- No tests for Phase 2 (web search fallback) at all
4.2 Dependency Injection
The codebase uses Axum's State(AppState) extractor as its sole DI mechanism. This works well for handlers but breaks down for services:
- Services receive &AppState directly, gaining access to everything
- There is no trait boundary between the pipeline and its dependencies (db, scraper, search)
- The provider_override: Option<Arc<dyn LlmProvider>> parameter proves the value of DI -- it is the only seam that enables testing
Recommendation: Introduce a PipelineDeps trait (or struct of trait objects) that the pipeline receives, encapsulating:
- Database access (settings, sources, themes, article history)
- Scraping (source page scraping, article scraping)
- Search (Brave, LLM web search)
- Rate limiting
- Key resolution
This would allow the entire pipeline to be tested with in-memory fakes.
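A first step in that direction, limited to the article history check, might look like the sketch below. The trait name, the synchronous signature, and the `filter_unseen` helper are assumptions. The real `check_urls_exist` is an async database query whose signature is not shown in this report.

```rust
use std::collections::HashSet;

/// Hypothetical trait boundary for the article history lookup.
pub trait ArticleHistory {
    /// Returns the subset of `urls` that has already been recorded.
    fn check_urls_exist(&self, urls: &[String]) -> HashSet<String>;
}

/// In-memory fake for tests; production code would implement the
/// same trait on top of the Postgres-backed db module.
#[derive(Default)]
pub struct InMemoryHistory {
    seen: HashSet<String>,
}

impl InMemoryHistory {
    pub fn with_urls(urls: &[&str]) -> Self {
        Self {
            seen: urls.iter().map(|u| u.to_string()).collect(),
        }
    }
}

impl ArticleHistory for InMemoryHistory {
    fn check_urls_exist(&self, urls: &[String]) -> HashSet<String> {
        urls.iter()
            .filter(|u| self.seen.contains(*u))
            .cloned()
            .collect()
    }
}

/// Pipeline-side logic depends only on the trait, so it can be
/// exercised without a database. Input order is preserved.
pub fn filter_unseen(history: &dyn ArticleHistory, candidates: Vec<String>) -> Vec<String> {
    let known = history.check_urls_exist(&candidates);
    candidates
        .into_iter()
        .filter(|u| !known.contains(u))
        .collect()
}
```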
4.3 Module Coupling
The dependency graph is mostly clean:
handlers --> services --> db
| |
models <-----+
|
errors
Exceptions:
- synthesis.rs calls db:: directly (bypasses service layer for settings, themes, sources, history, llm_call_log, syntheses)
- synthesis.rs calls crate::services::prompts, crate::services::llm::schema, crate::services::scraper, crate::services::source_scraper, crate::services::brave_search, crate::services::encryption -- essentially importing the entire services layer
- rate_limiter.rs contains its own SQL queries
- handlers/syntheses.rs::list constructs a Synthesis model manually from the SynthesisWithThemeName join row, duplicating field mapping
5. Specific Code-Level Findings
5.1 The #[allow(clippy::too_many_arguments)] Smell
Two functions suppress this lint:
- build_search_prompt (9 parameters)
- log_llm_call (10 parameters)
Both are symptoms of missing context objects. build_search_prompt should take a SearchPromptConfig struct. log_llm_call should take a LlmCallRecord struct.
5.2 run_generation_inner Parameter List
The function takes 7 parameters: job_id, state, user_id, theme_id, tx, provider_override, cancelled. The first four are "what to generate," the last three are "infrastructure." A GenerationJob struct and a PipelineInfra struct would make intent clearer.
5.3 Inconsistent serde_json::Value vs. Typed Models
theme.categories is stored as serde_json::Value and deserialized inline with serde_json::from_value(...).unwrap_or_default(). This pattern appears at least 5 times across the codebase (themes, schedules, syntheses). Consider using sqlx::types::Json<Vec<String>> for typed extraction at the query level.
5.4 French/English Mixing
User-facing messages are consistently in French (good for i18n consistency), but code-level strings mix languages:
- Status strings: "filtered_history" (English), "category_autre" (French), "Articles sans date" (French)
- Log messages: English
- Error messages to users: French
The recommendation is to keep all internal identifiers in English and use the i18n layer for user-facing strings.
6. Prioritized Recommendations
P0 -- High Impact, Do First
- Extract GenerationContext struct from the 6 tracking variables in run_generation_inner. This is a safe refactor that immediately improves readability and reduces parameter passing.
- Extract scrape_and_classify_batch function to eliminate the 300-line duplication between Phase 1, Brave Search, and LLM search paths. Parameterize by source_type: &str and source_url: Option<&str>.
- Move JobStore to its own module (services/job_store.rs). It is already self-contained with no dependencies on synthesis logic. This reduces synthesis.rs by ~190 lines.
P1 -- Important, Do Soon
- Introduce TraceBatcher struct to encapsulate the pending traces buffer, batch insert calls, and error handling. Replace the 7 manual flush sites with batcher.flush().
- Make the scheduler use JobStore for scheduled jobs. This provides visibility into scheduled generation progress and prevents race conditions between scheduled and manual jobs. Add email sending as a post-completion hook rather than inline in the scheduler.
- Replace magic strings with constants or enums for article statuses ("filtered_history", etc.), category keys ("category_0", "category_autre", "category_no_date"), and synthesis status.
- Add structured logging to .ok() calls. Replace batch_insert_entries(...).await.ok() with if let Err(e) = batch_insert_entries(...).await { tracing::warn!(...) }.
P2 -- Improve Quality
- Split Phase 2 search strategies into a SearchStrategy trait with BraveSearchStrategy and LlmSearchStrategy implementations. The pipeline would call strategy.search(query, max_results) without knowing which backend is used.
- Extract provider_resolver.rs for the provider/key/model resolution logic (~100 lines currently in synthesis.rs).
- Introduce PipelineDeps trait or struct to enable full pipeline testing without Postgres/network. Start with the article history check as the first dependency to extract, since it is the most frequently called.
- Remove inline SQL from rate_limiter.rs and synthesis.rs::resolve_model. Route all queries through db/ modules.
P3 -- Nice to Have
- Type the categories field as sqlx::types::Json<Vec<String>> instead of serde_json::Value to eliminate the inline serde_json::from_value deserialization at each call site.
- Consolidate response types -- either all in models/ or all in handlers/, with a consistent convention.
- Add a SearchPromptConfig struct to replace the 9-parameter build_search_prompt function.
- Document the key-exposure risk of the Gemini API key URL pattern (the key travels in the query string) and consider using the x-goog-api-key header instead (if supported by the Gemini API version in use).
7. What the Codebase Does Well
It is important to acknowledge the strengths that should be preserved during refactoring:
- Error handling discipline: The AppError enum is consistently used everywhere. No panics in production code. Internal details are never leaked.
- Security-first design: SSRF prevention, encrypted secrets, CSRF protection, anti-enumeration, session management -- all implemented correctly.
- Idiomatic Axum usage: Extractors, state management, middleware composition, SSE streaming -- all follow framework conventions.
- Test coverage on leaf components: Models, utils, and isolated services have thorough unit tests with boundary cases.
- Documentation: Module-level doc comments, function-level doc comments, and inline comments explaining non-obvious decisions (e.g., the TOCTOU note in scraper.rs).
- Operational features: Graceful shutdown, session cleanup, job TTL, rate limit hot-reload, cooperative cancellation -- these show production-mindedness.
Appendix: File Size Summary
| File | Lines | Assessment |
|---|---|---|
| services/synthesis.rs | ~1550 | Critical -- needs decomposition |
| services/scraper.rs | ~400 | Acceptable |
| services/rate_limiter.rs | ~470 | Acceptable (includes thorough tests) |
| services/prompts.rs | ~370 | Acceptable (includes thorough tests) |
| handlers/auth.rs | ~380 | Acceptable |
| handlers/sources.rs | ~280 | Good |
| handlers/admin.rs | ~440 | Acceptable |
| handlers/syntheses.rs | ~240 | Good |
| handlers/generation.rs | ~180 | Good |
| models/settings.rs | ~260 | Good (includes thorough tests) |
| models/synthesis.rs | ~415 | Acceptable (includes thorough tests) |
| errors.rs | ~173 | Good |
| app_state.rs | ~82 | Good |
| router.rs | ~178 | Good |
| scheduler.rs | ~93 | Good size, but needs architectural changes |