You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

22 KiB

Architecture & SOLID Audit Report

Executive Summary

AI Weekly Synth demonstrates strong foundational architecture with clean layer separation (handlers, services, DB), a well-designed trait abstraction for LLM providers, and robust error handling. However, the synthesis pipeline (services/synthesis.rs) has accumulated significant complexity that violates Single Responsibility and makes the code difficult to test, extend, and maintain. The codebase would benefit most from decomposing the pipeline into smaller, composable stages and introducing trait-based dependency injection for external services.

SOLID Analysis

Single Responsibility

Mostly followed in the outer layers; severely violated in the synthesis pipeline.

Good examples:

  • errors.rs has a single job: map application errors to HTTP responses. Clean, focused, well-tested.
  • middleware/auth.rs does exactly one thing: extract authenticated users from session cookies. The AdminUser wrapper cleanly composes on top.
  • services/encryption.rs is a focused module for AES-256-GCM operations with no extraneous concerns.
  • services/auth.rs cleanly coordinates token creation, email sending, and session management without overstepping.
  • Each file in db/ handles queries for exactly one table (users, sessions, syntheses, etc.).

Critical violation: services/synthesis.rs (~1250 lines)

This single file is responsible for:

  1. Job lifecycle management (create, subscribe, cleanup via JobStore)
  2. Progress event definition and emission
  3. Provider and API key resolution
  4. Rate limiting coordination
  5. Source rotation strategy
  6. Parallel scraping orchestration (Phase 1)
  7. LLM classification and summarization orchestration (Phase 1)
  8. Brave Search integration orchestration (Phase 2)
  9. LLM web search fallback orchestration (Phase 2)
  10. URL normalization, hashing, and deduplication
  11. Article history filtering
  12. Source diversity enforcement
  13. Section assembly and persistence
  14. Article provenance tracing
  15. Error sanitization

The run_generation_inner function alone spans ~640 lines. The nearly identical scrape-then-classify batch processing loop is duplicated between Phase 1 (lines 406-551) and the Brave Search path (lines 627-758), with only minor differences in tuple shapes. This duplication is a strong signal that a shared abstraction is missing.

Other SRP observations:

  • handlers/auth.rs mixes HTTP concern (cookie construction in build_session_cookie) with business logic (token verification in verify_token). The cookie construction could live in a utility or the auth service.
  • AppState (in app_state.rs) holds both infrastructure (pool, config) and domain state (job_store, rate_limiters). This is acceptable at current scale but should be watched as the application grows.

Open/Closed

Good for LLM providers; poor for the synthesis pipeline.

Good examples:

  • The LlmProvider trait plus factory::create_provider follow OCP well. Adding a new provider (e.g., Mistral) requires implementing the trait and adding one match arm in the factory -- no existing code needs modification.
  • AppError uses enum variants with thiserror derives. Adding a new error category is additive.
  • The FromRequestParts extractors (AuthUser, AdminUser) compose cleanly -- adding new authorization levels means adding new extractors.

Violations:

  • The synthesis pipeline has hardcoded knowledge of all search strategies (LLM search, Brave Search, personalized sources). Adding a new search backend (e.g., Serper, SearXNG) requires modifying the 640-line orchestration function. There is no SearchBackend trait or strategy pattern.
  • factory::create_provider uses a match statement that must be modified for each new provider. A registration-based factory (e.g., HashMap<&str, fn(String, Client) -> Arc<dyn LlmProvider>>) would eliminate this.
  • The ProviderRateLimiter embeds SQL queries directly (line 173 of rate_limiter.rs) rather than going through the db module, bypassing the architectural boundary. The comment says "to avoid circular dependency" -- this signals a structural issue.

Liskov Substitution

Correctly applied where traits are used.

  • All three LlmProvider implementations (GeminiProvider, OpenAiProvider, AnthropicProvider) correctly implement the trait contract. Callers use Arc<dyn LlmProvider> throughout and never downcast.
  • The AnthropicProvider implementation is noteworthy: it adapts to Anthropic's lack of native JSON schema enforcement by embedding the schema in the system prompt. This is a legitimate behavioral difference that is hidden behind the trait -- exactly what LSP intends. The contract is "return structured JSON matching the schema," not "use native schema enforcement."
  • FromRequestParts implementations for AuthUser and AdminUser correctly maintain the contract expectations.

No LSP violations detected. The trait abstractions are lean and appropriately designed.

Interface Segregation

Traits are lean and focused -- no violations.

  • LlmProvider has only two methods: provider_id() and call_llm(). This is appropriately minimal. If web search support needed to be optional (e.g., only some providers support grounded search), a separate WebSearchProvider trait would be warranted, but currently the pipeline handles this externally.
  • The codebase uses free functions in modules (e.g., db::users::find_by_email) rather than trait objects for the database layer. This is idiomatic Rust and avoids over-engineering trait hierarchies.

Observation: The absence of traits for services like scraping, email, and encryption is acceptable given the project's scale, but limits testability (see Dependency Inversion section).

Dependency Inversion

Partially applied -- the weakest SOLID principle in this codebase.

Good:

  • Handlers depend on AppState (an abstraction container), not directly on database implementations.
  • The synthesis pipeline depends on Arc<dyn LlmProvider>, not on concrete provider types.

Significant gaps:

  • No abstraction over the database layer. All services and the synthesis pipeline call db::* functions directly (e.g., db::settings::get_or_create_default, db::article_history::check_urls_exist). This means the synthesis pipeline cannot be unit-tested without a live Postgres instance.
  • No abstraction over the scraper. The synthesis pipeline calls scraper::scrape_url and source_scraper::extract_article_links as concrete functions. This prevents testing pipeline logic with deterministic, fake scraped content.
  • No abstraction over the email service. services/email.rs is called directly. Testing email-sending behavior requires either a live Resend API key or the hardcoded test bypass.
  • AppState is a concrete struct, not a trait. All handlers are bound to State<AppState>. While this is standard Axum practice, it means handler logic cannot be tested without constructing a full AppState with a real database pool.

The integration tests in tests/common/mod.rs confirm this: TestApp must create a real Postgres database for every test, run migrations, and tear down afterward. The test helper works well, but the inability to unit-test service logic in isolation is a direct consequence of missing abstractions.

Design Patterns

Well-Used Patterns

  1. Strategy Pattern (LLM Providers): The LlmProvider trait + factory::create_provider is a textbook Strategy implementation. Providers encapsulate provider-specific API differences (request format, response parsing, error mapping) behind a uniform interface. The factory cleanly maps configuration to concrete strategy.

  2. Observer Pattern (Progress Streaming): The watch channel pattern for SSE progress streaming is elegant. The JobStore manages pub/sub lifecycle, and the WatchStream adapter in the handler converts it to an SSE stream. This design correctly handles reconnection (subscribers immediately get the latest state).

  3. Extractor Pattern (Authentication): Axum's FromRequestParts is used idiomatically for AuthUser and AdminUser. This is effectively a Decorator pattern -- handlers declare their auth requirements via type parameters, and the framework handles the cross-cutting concern transparently.

  4. Builder Pattern (Configuration): AppConfig::from_env() + validate() is a clean two-phase construction pattern. Validation is separated from loading, allowing the application to fail fast with clear error messages.

  5. Value Object Pattern: MasterKey is a well-designed value object. It enforces invariants on construction (from_hex), prevents accidental cloning (no Clone derive), and zeroizes memory on drop. This is security-conscious Rust.

  6. Repository Pattern (DB layer): Each db::* module acts as a repository, encapsulating SQL queries behind typed function signatures. The separation between the UserRow (DB representation) and User (domain model) in db/users.rs is a good Data Mapper implementation.

Missing Patterns

  1. Pipeline/Chain of Responsibility: The synthesis pipeline is a sequential series of stages (load settings -> resolve provider -> scrape sources -> classify articles -> web search fallback -> assemble -> save). This is a textbook pipeline pattern that would benefit from explicit stage modeling:

    trait PipelineStage {
        async fn execute(&self, ctx: &mut PipelineContext) -> Result<(), AppError>;
    }
    

    Each stage would be independently testable and composable.

  2. Unit of Work / Transaction: The synthesis pipeline performs many database writes (article history inserts, LLM call logs, final synthesis save) without a unifying transaction. If the synthesis save fails after article history has been written, the data is inconsistent. A transactional boundary around the save phase would prevent orphaned records.

  3. Circuit Breaker: The rate limiter waits up to 60 seconds for rate limit windows to pass, but there is no circuit breaker for provider failures. If a provider returns 5xx errors repeatedly, the pipeline will keep retrying individual articles until the batch is exhausted. A circuit breaker that short-circuits after N consecutive failures would save time and API quota.

  4. Result Collector / Accumulator: The batch processing loops manually manage scraped_articles, filled_counts, source_counts, and seen_urls as mutable state threaded through the loop. An accumulator pattern would encapsulate this state and its invariants (e.g., "never exceed max_items_per_category") into a dedicated struct with methods like try_add_article().

Anti-Patterns

  1. God Function: run_generation_inner (~640 lines) is the most impactful anti-pattern. It manages too many concerns, has deeply nested control flow, and is untestable in isolation. The function's length makes it difficult to reason about edge cases and encourages bugs during modification.

  2. Duplicated Logic (DRY Violation): The scrape-then-classify batch loop appears in near-identical form in three places:

    • Phase 1 personalized sources (lines ~406-551)
    • Phase 2 Brave Search path (lines ~627-758)
    • The response parsing differs slightly, but the core pattern (spawn scrape tasks -> collect results -> spawn classify tasks -> collect and categorize) is the same.

    A shared process_article_batch() function accepting a list of (url, source_identifier) would eliminate ~150 lines of duplication.

  3. Primitive Obsession: Categories are tracked throughout the pipeline using raw strings and index-based keys ("category_0", "category_autre"). A CategoryIndex newtype or enum would prevent bugs like off-by-one errors and make the intent clearer.

  4. Implicit Protocol (Stringly Typed): The status field in article history uses raw strings ("filtered_history", "filtered_diversity", "used", "filtered_homepage", etc.). An enum would provide compile-time guarantees and prevent typos.

  5. Configuration Object Overload: UserSettings has 16 fields that are all passed around as a single struct. The synthesis pipeline destructures individual fields from it in many places (e.g., settings.max_items_per_category, settings.batch_size, settings.use_brave_search). This makes it hard to understand which settings each stage actually depends on.

Architecture Assessment

Layer Separation

Clean three-layer architecture with appropriate boundaries.

handlers/ (HTTP) -> services/ (Business Logic) -> db/ (Data Access)
  • Handlers correctly limit themselves to request parsing, validation, calling services/DB, and response formatting. No handler contains business logic.
  • Services implement the business rules. The auth service coordinates between DB, email, and token utilities. The encryption service is self-contained.
  • DB modules are pure data access with no business logic. SQL queries are parameterized and safe.

Minor boundary violations:

  • ProviderRateLimiter::reload_from_db in services/rate_limiter.rs contains raw SQL queries instead of calling through db::rate_limits. The comment acknowledges this is to "avoid circular dependency," suggesting the module boundaries need adjustment.
  • resolve_model in services/synthesis.rs also contains raw SQL (sqlx::query_scalar with inline SQL) instead of going through db::providers. This bypasses the DB layer abstraction.

Coupling

Low coupling between modules; high coupling within the synthesis pipeline.

  • Handler modules are independent of each other. Each handler file imports only from app_state, errors, middleware, relevant models, and relevant services/db modules.
  • The LLM provider implementations are fully decoupled -- they share no state and have no knowledge of each other.
  • The db modules have zero inter-module dependencies (no db::users calling db::sessions, etc.).

High coupling in the synthesis pipeline:

  • services/synthesis.rs imports from 9 different modules: app_state, db (5 submodules), errors, models (2 submodules), services (5 submodules). It is the most connected module in the codebase and acts as a "gravity well" that pulls in all dependencies.
  • The AppState struct couples all concerns: config, pool, HTTP client, auth rate limiter, provider rate limiter, user rate limiters, and job store. Any change to any of these affects the shared state type.

Error Handling

Excellent error handling strategy -- one of the strongest aspects of the codebase.

  • Single error type: AppError is the only error type in the application. All handlers return Result<impl IntoResponse, AppError>. This is clean and consistent.
  • Automatic conversions: From<sqlx::Error> and From<anyhow::Error> provide ergonomic ? operator usage while ensuring errors are properly categorized.
  • Security-conscious: Internal errors log the full details server-side but return only "An internal error occurred" to the client. The sanitize_error_message function in the synthesis pipeline ensures API keys are never exposed in SSE error events.
  • No unwrap() in production paths: Consistent with the project's stated policy. The few unwrap() calls are in test code or behind expect() with clear messages (e.g., "valid minutes value").
  • Error variants are semantically meaningful: NotFound, Unauthorized, Forbidden, BadRequest, Validation, Internal, RateLimited -- each maps to a specific HTTP status code with appropriate behavior.

Minor observation: The Validation variant accepts a String, but a structured error type (e.g., with field name + message) would be more useful for frontends that want to highlight specific form fields.

State Management

In-memory state is well-designed with appropriate concurrency primitives.

  • DashMap is used for lock-free concurrent access in RateLimiter, ProviderRateLimiter, JobStore, and user_rate_limiters. This avoids Mutex contention under load.
  • Arc<watch::Sender> for progress events is the right choice -- it provides "latest value" semantics so reconnecting SSE clients immediately get caught up.
  • The JobStore has TTL-based cleanup (1 hour) and post-completion cleanup (5 minutes), preventing memory leaks from abandoned jobs.

Potential concern: The RateLimiter in rate_limiter.rs stores timestamps in a Vec<Instant> (for the per-key limiter) rather than a VecDeque. Old timestamps are evicted via retain() which is O(n), whereas VecDeque::pop_front() is O(1). The ProviderRateLimiter correctly uses VecDeque. This inconsistency is minor but worth unifying.

State growth risk: The user_rate_limiters: DashMap<Uuid, UserRateLimitEntry> in AppState grows unboundedly as users trigger generation jobs. Entries are only removed when a user's settings change to have no rate limits. A periodic cleanup (e.g., remove entries for users who haven't generated in 24 hours) would prevent slow memory growth.

Prioritized Recommendations

  1. Decompose services/synthesis.rs into a multi-stage pipeline. Extract the monolithic run_generation_inner into discrete, testable stages: SettingsStage, SourceScrapingStage, ClassificationStage, WebSearchStage, AssemblyStage. Each stage should operate on a shared PipelineContext struct. This addresses the God Function anti-pattern, the DRY violation in batch processing, and enables unit testing of individual stages.

    • Files: backend/src/services/synthesis.rs (1250+ lines)
    • Impact: Testability, maintainability, extensibility
  2. Extract a shared process_article_batch() function as an immediate, lower-risk improvement. The near-identical scrape-then-classify loop in Phase 1 and the Brave Search path can be unified into a single function that accepts Vec<(String, Option<String>)> (URL + optional source URL) and returns classified NewsItems. This can be done without the full pipeline refactor.

    • Files: backend/src/services/synthesis.rs (lines ~406-551 and ~627-758)
    • Impact: ~150 lines of duplication removed, fewer bugs from divergent copies
  3. Introduce trait-based abstractions for external services to enable unit testing. Define traits for ArticleScraper, SearchService, and ArticleStore (history). The synthesis pipeline would depend on these traits, allowing tests to inject fakes. Start with the DB layer trait since it blocks the most test coverage.

    • Files: backend/src/services/synthesis.rs, backend/src/services/scraper.rs, backend/src/db/article_history.rs
    • Impact: Enables unit testing of synthesis pipeline without Postgres
  4. Replace stringly-typed status values with enums. The article history status field uses raw strings ("filtered_history", "filtered_diversity", "used", etc.). Define an ArticleStatus enum with Display/FromStr implementations. Similarly, replace the "category_0" / "category_autre" string keys with a CategoryKey newtype.

    • Files: backend/src/services/synthesis.rs, backend/src/db/article_history.rs
    • Impact: Type safety, compile-time correctness guarantees
  5. Move raw SQL out of service modules. ProviderRateLimiter::reload_from_db and resolve_model in synthesis.rs contain inline SQL queries that bypass the db/ layer. Move these queries to db/rate_limits.rs and db/providers.rs respectively. If circular dependencies are the issue, restructure the module hierarchy (e.g., make the rate limiter accept configuration data rather than a pool).

    • Files: backend/src/services/rate_limiter.rs (lines 171-199), backend/src/services/synthesis.rs (lines 1191-1220)
    • Impact: Architectural consistency, single source of truth for SQL
  6. Introduce an article accumulator struct to encapsulate the mutable tracking state in the synthesis pipeline. Currently, article_scraped, source_counts, url_source, filled_counts, and seen_urls are managed as independent HashMap/HashSet variables threaded through loops. A SynthesisAccumulator struct with methods like try_add_article(url, category, source) -> bool would encapsulate invariants (max per category, max per source, deduplication) and simplify the pipeline code.

    • Files: backend/src/services/synthesis.rs
    • Impact: Reduced cognitive complexity, fewer state-management bugs
  7. Add a circuit breaker for LLM provider failures. If a provider returns errors for N consecutive calls, stop sending requests and fail fast. The current behavior retries each article independently, wasting time and API quota when the provider is down.

    • Files: backend/src/services/synthesis.rs
    • Impact: Resilience, user experience during outages
  8. Unify Vec<Instant> to VecDeque<Instant> in RateLimiter. The per-key RateLimiter uses Vec::retain() (O(n)) while ProviderRateLimiter uses VecDeque::pop_front() (O(1)). Switch the per-key limiter to VecDeque for consistency and performance under high request volumes.

    • Files: backend/src/services/rate_limiter.rs (lines 37-77)
    • Impact: Minor performance improvement, code consistency
  9. Add periodic cleanup for user_rate_limiters in AppState. The DashMap<Uuid, UserRateLimitEntry> grows unboundedly. Add a background task (similar to JobStore TTL cleanup) that removes entries for users inactive for more than 24 hours.

    • Files: backend/src/app_state.rs, backend/src/main.rs
    • Impact: Prevents slow memory growth in long-running deployments
  10. Consider making UpdateSettingsRequest::validate() return Result<(), AppError> instead of Result<(), String>. Currently, handlers must map the string error: body.validate().map_err(AppError::Validation). Having validate return the correct error type directly would be more ergonomic and consistent with the rest of the codebase.

    • Files: backend/src/models/settings.rs
    • Impact: Minor ergonomic improvement, consistency