Design: Integration Tests with Mock LLM Provider

Date: 2026-03-26

Scope: Add integration tests for the generation pipeline, using a mock LLM provider and wiremock for HTTP mocking.


Context

The generation pipeline (run_generation_inner) has no integration tests. Existing tests only verify HTTP handler responses (202 Accepted) and model resolution, but never exercise the actual pipeline logic (scraping, filtering, classification, category assignment, saving).

The pipeline creates its LLM provider internally via create_provider(), making it impossible to inject a mock. This needs a small refactoring to enable dependency injection.


1. MockLlmProvider

New file: backend/src/services/llm/mock.rs

A MockLlmProvider implementing LlmProvider that returns canned JSON responses:

  • Classify calls (detected by system prompt containing "classer"): returns {title: "...", summary: "...", category: "<configured category>"} using the article title from the user prompt
  • Search calls (detected by system prompt containing "precis"): returns {category_0: [{title, url, summary}]} with configurable test URLs
  • Link extraction calls (detected by system prompt containing "liens"): returns {urls: [...]} with configurable URLs
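
The keyword dispatch described above can be sketched as follows. This is a simplified, synchronous stand-in rather than the real LlmProvider implementation, whose trait signature is not shown in this document. The names CallKind, detect_call and canned_response are hypothetical, the title is passed in directly instead of being parsed from the user prompt, and the placeholder title and summary strings are invented for illustration.

```rust
/// The call kinds the mock distinguishes (hypothetical enum).
#[derive(Debug, PartialEq)]
enum CallKind {
    Classify,
    Search,
    LinkExtraction,
    Unknown,
}

/// Detect the call kind from a keyword in the system prompt.
fn detect_call(system_prompt: &str) -> CallKind {
    if system_prompt.contains("classer") {
        CallKind::Classify
    } else if system_prompt.contains("precis") {
        CallKind::Search
    } else if system_prompt.contains("liens") {
        CallKind::LinkExtraction
    } else {
        CallKind::Unknown
    }
}

/// Build the canned JSON body for one call.
/// Assumption: inputs contain no characters that need JSON escaping.
fn canned_response(system_prompt: &str, title: &str, category: &str, urls: &[&str]) -> String {
    match detect_call(system_prompt) {
        CallKind::Classify => format!(
            r#"{{"title":"{title}","summary":"mock summary","category":"{category}"}}"#
        ),
        CallKind::Search => {
            // One structured article per configured test URL.
            let items: Vec<String> = urls
                .iter()
                .map(|u| {
                    format!(r#"{{"title":"Mock article","url":"{u}","summary":"mock summary"}}"#)
                })
                .collect();
            format!(r#"{{"category_0":[{}]}}"#, items.join(","))
        }
        CallKind::LinkExtraction => {
            let quoted: Vec<String> = urls.iter().map(|u| format!("\"{u}\"")).collect();
            format!(r#"{{"urls":[{}]}}"#, quoted.join(","))
        }
        // Unrecognised prompts get an empty object in this sketch.
        CallKind::Unknown => "{}".to_string(),
    }
}
```

Matching on substrings keeps the mock independent of exact prompt wording, but it also means a prompt change that drops one of the keywords would silently fall through to the Unknown branch.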

The mock is configured through builder methods chained on the constructor:

MockLlmProvider::new()
    .with_default_category("Test Category")
    .with_search_urls(vec!["http://mock-server/article-1", ...])
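
A minimal sketch of the builder behind that call chain, assuming consuming `self` methods. The field names and the "Autre" default category are assumptions made for illustration; the real struct would also implement LlmProvider, which is omitted here.

```rust
/// Simplified configuration holder for the mock (field names are assumed).
#[derive(Debug, Clone)]
pub struct MockLlmProvider {
    default_category: String,
    search_urls: Vec<String>,
}

impl MockLlmProvider {
    /// Start with no search URLs and an assumed default category.
    pub fn new() -> Self {
        Self {
            default_category: "Autre".to_string(),
            search_urls: Vec::new(),
        }
    }

    /// Category returned by every classify call.
    pub fn with_default_category(mut self, category: &str) -> Self {
        self.default_category = category.to_string();
        self
    }

    /// URLs returned by search calls.
    pub fn with_search_urls(mut self, urls: Vec<&str>) -> Self {
        self.search_urls = urls.into_iter().map(String::from).collect();
        self
    }
}
```

Each method takes and returns the value by move, so a test can build a fully configured mock in one expression and wrap it in an Arc afterwards.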

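Registered in llm/mod.rs as pub mod mock;. The module is always compiled rather than gated behind #[cfg(test)]: integration tests under tests/ are built as a separate crate that links the library without cfg(test), so a test-only module would not be visible to them.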


2. Dependency Injection Refactoring

Modify: backend/src/services/synthesis.rs

The actual signatures are:

  • pub async fn run_generation(job_id: Uuid, state: AppState, user_id: Uuid, tx: Arc<watch::Sender<ProgressEvent>>)
  • async fn run_generation_inner(job_id: Uuid, state: &AppState, user_id: Uuid, tx: &watch::Sender<ProgressEvent>) -> Result<Uuid, AppError>

Changes:

  • Add provider_override: Option<Arc<dyn LlmProvider>> as the last parameter to both functions
  • Make run_generation_inner public: pub async fn run_generation_inner(...) — needed so focused pipeline tests can call it directly from the integration test crate
  • Inside run_generation_inner, when provider_override is Some(provider):
    • Use the provider directly (skip resolve_provider_and_key + create_provider)
    • Use "mock" as provider_name (for rate limiter key)
    • For model resolution: use the settings values ai_model / ai_model_websearch directly if non-empty, otherwise fall back to a hardcoded default such as "mock-model". Skip resolve_model(), which queries admin_providers.
  • When None: current behavior unchanged (production path)
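
The branching on provider_override might look like the sketch below. It is written against a simplified synchronous LlmProvider trait and a reduced Settings struct, both of which are assumptions; StubProvider, Resolved, resolve and production_path are hypothetical names, and the production branch is only a placeholder for the existing resolve_provider_and_key / create_provider / resolve_model logic.

```rust
use std::sync::Arc;

/// Simplified stand-in for the real trait (assumed signature).
trait LlmProvider: Send + Sync {
    fn complete(&self, system_prompt: &str, user_prompt: &str) -> String;
}

/// Trivial provider used here for both the mock and the placeholder real path.
struct StubProvider {
    reply: &'static str,
}

impl LlmProvider for StubProvider {
    fn complete(&self, _system_prompt: &str, _user_prompt: &str) -> String {
        self.reply.to_string()
    }
}

/// Only the two settings fields this sketch needs.
struct Settings {
    ai_model: String,
    ai_model_websearch: String,
}

/// Everything the pipeline needs to know about the chosen provider.
struct Resolved {
    provider: Arc<dyn LlmProvider>,
    provider_name: String,
    model: String,
    model_websearch: String,
}

/// Use the configured model if non-empty, otherwise the fallback.
fn model_or_default(configured: &str, fallback: &str) -> String {
    if configured.is_empty() {
        fallback.to_string()
    } else {
        configured.to_string()
    }
}

/// Placeholder for the unchanged production path.
fn production_path(settings: &Settings) -> Resolved {
    Resolved {
        provider: Arc::new(StubProvider { reply: "from-production" }),
        provider_name: "production".to_string(),
        model: settings.ai_model.clone(),
        model_websearch: settings.ai_model_websearch.clone(),
    }
}

/// Some(provider): use it directly under the name "mock". None: production path.
fn resolve(provider_override: Option<Arc<dyn LlmProvider>>, settings: &Settings) -> Resolved {
    match provider_override {
        Some(provider) => Resolved {
            provider,
            provider_name: "mock".to_string(),
            model: model_or_default(&settings.ai_model, "mock-model"),
            model_websearch: model_or_default(&settings.ai_model_websearch, "mock-model"),
        },
        None => production_path(settings),
    }
}
```

Keeping the override as the last, optional parameter means the handler only has to pass None, and the production branch stays byte-for-byte what it was.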

Modify: backend/src/handlers/generation.rs

Pass None as provider_override in the tokio::spawn call. Production behavior unchanged.


3. Wiremock for HTTP Mocking

Modify: backend/Cargo.toml

Add wiremock as a dev-dependency:

[dev-dependencies]
wiremock = "0.6"

Integration tests use wiremock::MockServer to serve fake HTML pages:

  • A source page with <a href> links to article URLs on the same mock server
  • Article pages with <title>, <body> text content

Source URLs and article URLs point to http://127.0.0.1:{port}/....
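
The two kinds of fake page could be generated by small helpers like these and then handed to the mock server as response bodies. The helper names are hypothetical, and the wiremock wiring itself is left out so the sketch stays dependency-free; the base URL would be whatever address the mock server reports at runtime.

```rust
/// HTML for a fake source page linking to article paths on the same server.
fn source_page_html(base_url: &str, article_paths: &[&str]) -> String {
    let links: String = article_paths
        .iter()
        .map(|p| format!("<a href=\"{base_url}{p}\">{p}</a>\n"))
        .collect();
    format!("<html><body>\n{links}</body></html>")
}

/// HTML for a fake article page with a title and some body text.
fn article_page_html(title: &str, body_text: &str) -> String {
    format!("<html><head><title>{title}</title></head><body><p>{body_text}</p></body></html>")
}
```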

Note: The SSRF check will reject 127.0.0.1 because it is a loopback (non-public) address. Either the wiremock server has to listen on an address the check accepts, or the tests have to avoid the SSRF check. Options:

  • Rely on the scraper HTTP client. It has a redirect policy but does not itself check the IP of the initial request; source_scraper resolves the hostname and runs check_ssrf, which rejects 127.0.0.1. Source pages served by wiremock are therefore still blocked on this path.
  • Simplest approach: for focused pipeline tests, have the mock LLM provider return article URLs directly (via search results or link extraction), bypassing source page scraping entirely. For scraping tests, call scrape_single_article directly with the wiremock URL; this function performs no SSRF check, since those checks live in source_scraper.
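
To illustrate why a wiremock URL fails the check, here is a rough approximation of an SSRF address filter built only on the standard library. The actual rules inside check_ssrf are not shown in this document, so is_blocked_ip is a hypothetical function and the set of rejected ranges is an assumption.

```rust
use std::net::IpAddr;

/// Approximate SSRF rule: reject addresses that are not publicly routable.
/// Assumption: the real check covers at least loopback and private ranges.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}
```

Under this rule 127.0.0.1, where wiremock listens by default, is always rejected, which is what pushes the design toward bypassing source scraping in the pipeline tests.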

4. Test Scenarios

Focused pipeline tests (backend/tests/pipeline_test.rs)

These call run_generation_inner directly with a mock provider. They need a real Postgres DB (for settings, article_history, syntheses tables).

Test setup helper:

  • Create test DB (reuse existing TestApp infrastructure from tests/common/mod.rs)
  • Create user with settings (categories, max_items, etc.)
  • Create a watch::channel for progress events
  • Build a MockLlmProvider with desired configuration

phase1_classifies_scraped_articles:

  • Add sources pointing to wiremock article pages
  • Mock LLM classify returns articles in the configured category
  • Call run_generation_inner with mock provider
  • Verify synthesis saved with correct sections and articles

phase2_search_fills_gaps:

  • No sources configured → Phase 1 produces nothing
  • Mock LLM search returns structured articles
  • Verify synthesis saved with search results

all_articles_filtered_returns_error:

  • Pre-populate article_history with hashes of all candidate URLs
  • Trigger pipeline → everything filtered → verify error result
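
The behaviour this scenario exercises can be sketched in isolation as below. The hashing scheme used for article_history is not specified here, so the standard library's DefaultHasher stands in for it, and url_hash and filter_unseen are hypothetical names; the real pipeline returns an AppError rather than a String.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Stand-in hash for a URL (the real scheme is an assumption).
fn url_hash(url: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    url.hash(&mut hasher);
    hasher.finish()
}

/// Drop candidates whose hash is already in the history.
/// Returns an error when nothing is left, mirroring the expected test outcome.
fn filter_unseen<'a>(
    candidates: &[&'a str],
    history: &HashSet<u64>,
) -> Result<Vec<&'a str>, String> {
    let fresh: Vec<&'a str> = candidates
        .iter()
        .copied()
        .filter(|url| !history.contains(&url_hash(url)))
        .collect();
    if fresh.is_empty() {
        Err("all candidate articles were already seen".to_string())
    } else {
        Ok(fresh)
    }
}
```

In the test itself, pre-populating the history has to use the same hash function as the pipeline, otherwise nothing is filtered and the test passes or fails for the wrong reason.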

category_overflow_spills_to_autre:

  • Set max_items_per_category=1, provide multiple articles classified to same category
  • Verify overflow articles land in "Autre"
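
The expected overflow behaviour, reduced to its core, might look like this. How the real pipeline implements category assignment is not shown in this document, so assign_with_overflow is a hypothetical function, and the sketch assumes that "Autre" itself is not capped.

```rust
use std::collections::BTreeMap;

/// Catch-all category named in the test scenario.
const OVERFLOW_CATEGORY: &str = "Autre";

/// Assign (title, category) pairs to sections, spilling into the overflow
/// category once a section already holds max_items_per_category articles.
fn assign_with_overflow(
    articles: &[(&str, &str)],
    max_items_per_category: usize,
) -> BTreeMap<String, Vec<String>> {
    let mut sections: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (title, category) in articles {
        let count = sections.get(*category).map_or(0, Vec::len);
        let target = if count >= max_items_per_category {
            OVERFLOW_CATEGORY
        } else {
            *category
        };
        sections
            .entry(target.to_string())
            .or_default()
            .push(title.to_string());
    }
    sections
}
```

With max_items_per_category set to 1 and three articles classified into the same category, the first stays in that category and the other two end up under "Autre", which is the assertion the test would make against the saved synthesis.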

End-to-end test (backend/tests/api_generation_test.rs)

This is harder because the HTTP handler passes None for provider_override. Two approaches:

(a) Skip the true end-to-end test for now — the focused pipeline tests cover the critical logic. The existing trigger test (generate_returns_202_with_job_id) already verifies the HTTP handler wiring.

(b) Add a mock_provider field to AppState (Option<Arc<dyn LlmProvider>>) that run_generation_inner checks before creating a real provider. This is more invasive but enables full HTTP-path testing.

Recommendation: Start with (a) — focused pipeline tests are the highest value. The end-to-end HTTP test can be added later if needed.


5. Files Summary

  • Create: backend/src/services/llm/mock.rs — MockLlmProvider
  • Modify: backend/src/services/llm/mod.rs — register mock module
  • Modify: backend/src/services/synthesis.rs — add provider_override parameter, make run_generation_inner public
  • Modify: backend/src/handlers/generation.rs — pass None for override
  • Create: backend/tests/pipeline_test.rs — focused pipeline tests
  • Modify: backend/Cargo.toml — add wiremock dev-dependency