From f678275f5901cfad833685a70242079771014ce2 Mon Sep 17 00:00:00 2001
From: oabrivard
Date: Thu, 26 Mar 2026 09:58:29 +0100
Subject: [PATCH] docs: add spec for integration tests with mock LLM provider + update algorithm.md

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 .../2026-03-26-integration-tests-design.md    | 132 ++++++++++++++++++
 1 file changed, 132 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-03-26-integration-tests-design.md

diff --git a/docs/superpowers/specs/2026-03-26-integration-tests-design.md b/docs/superpowers/specs/2026-03-26-integration-tests-design.md
new file mode 100644
index 0000000..b3cf632
--- /dev/null
+++ b/docs/superpowers/specs/2026-03-26-integration-tests-design.md
@@ -0,0 +1,132 @@
+# Design: Integration Tests with Mock LLM Provider
+
+**Date**: 2026-03-26
+**Scope**: Add integration tests for the generation pipeline using a mock LLM provider and wiremock for HTTP mocking
+
+---
+
+## Context
+
+The generation pipeline (`run_generation_inner`) has no integration tests. Existing tests only verify HTTP handler responses (202 Accepted) and model resolution, but never exercise the actual pipeline logic (scraping, filtering, classification, category assignment, saving).
+
+The pipeline creates its LLM provider internally via `create_provider()`, making it impossible to inject a mock. This needs a small refactoring to enable dependency injection.
+
+---
+
+## 1. MockLlmProvider
+
+### New file: `backend/src/services/llm/mock.rs`
+
+A `MockLlmProvider` implementing `LlmProvider` that returns canned JSON responses:
+
+- **Classify calls** (detected by system prompt containing "classer"): returns `{title: "...", summary: "...", category: ""}` using the article title from the user prompt
+- **Search calls** (detected by system prompt containing "precis"): returns `{category_0: [{title, url, summary}]}` with configurable test URLs
+- **Link extraction calls** (detected by system prompt containing "liens"): returns `{urls: [...]}` with configurable URLs
+
+The mock is configurable via constructor:
+```rust
+MockLlmProvider::new()
+    .with_default_category("Test Category")
+    .with_search_urls(vec!["http://mock-server/article-1", ...])
+```
+
+Registered in `llm/mod.rs` as `pub mod mock;` (always available, not `#[cfg(test)]`, since integration tests are in a separate crate).
+
+---
+
+## 2. Dependency Injection Refactoring
+
+### Modify: `backend/src/services/synthesis.rs`
+
+The actual signatures are:
+- `pub async fn run_generation(job_id: Uuid, state: AppState, user_id: Uuid, tx: Arc>)`
+- `async fn run_generation_inner(job_id: Uuid, state: &AppState, user_id: Uuid, tx: &watch::Sender) -> Result`
+
+Changes:
+- Add `provider_override: Option>` as the last parameter to both functions
+- Make `run_generation_inner` public: `pub async fn run_generation_inner(...)` — needed so focused pipeline tests can call it directly from the integration test crate
+- Inside `run_generation_inner`, when `provider_override` is `Some(provider)`:
+  - Use the provider directly (skip `resolve_provider_and_key` + `create_provider`)
+  - Use `"mock"` as `provider_name` (for rate limiter key)
+  - For model resolution: use settings `ai_model` / `ai_model_websearch` directly if non-empty, otherwise fall back to hardcoded defaults like `"mock-model"`. Skip `resolve_model()` which queries `admin_providers`.
+- When `None`: current behavior unchanged (production path)
+
+### Modify: `backend/src/handlers/generation.rs`
+
+Pass `None` as `provider_override` in the `tokio::spawn` call. Production behavior unchanged.
+
+---
+
+## 3. Wiremock for HTTP Mocking
+
+### Modify: `backend/Cargo.toml`
+
+Add `wiremock` as dev-dependency:
+```toml
+[dev-dependencies]
+wiremock = "0.6"
+```
+
+Integration tests use `wiremock::MockServer` to serve fake HTML pages:
+- A **source page** with `` links to article URLs on the same mock server
+- **Article pages** with ``, `<body>` text content
+
+Source URLs and article URLs point to `http://127.0.0.1:{port}/...`.
+
+Note: The SSRF check will reject `127.0.0.1` as a private IP. The wiremock mock server needs to be on a non-private address, OR the SSRF check needs to be bypassed for tests. Options:
+- Use the scraper HTTP client which already has a redirect policy but doesn't check the initial request IP for `source_scraper` (it resolves the hostname) — if the hostname is `127.0.0.1`, `check_ssrf` will reject it
+- **Simplest approach**: for focused pipeline tests, use the mock LLM provider to return article URLs directly (via search results or link extraction), bypassing source page scraping entirely. For scraping tests, call `scrape_single_article` directly with the wiremock URL (this function doesn't do SSRF checks — those are in `source_scraper`)
+
+---
+
+## 4. Test Scenarios
+
+### Focused pipeline tests (`backend/tests/pipeline_test.rs`)
+
+These call `run_generation_inner` directly with a mock provider. They need a real Postgres DB (for settings, article_history, syntheses tables).
+
+**Test setup helper**:
+- Create test DB (reuse existing `TestApp` infrastructure from `tests/common/mod.rs`)
+- Create user with settings (categories, max_items, etc.)
+- Create a `watch::channel` for progress events
+- Build a `MockLlmProvider` with desired configuration
+
+**`phase1_classifies_scraped_articles`**:
+- Add sources pointing to wiremock article pages
+- Mock LLM classify returns articles in the configured category
+- Call `run_generation_inner` with mock provider
+- Verify synthesis saved with correct sections and articles
+
+**`phase2_search_fills_gaps`**:
+- No sources configured → Phase 1 produces nothing
+- Mock LLM search returns structured articles
+- Verify synthesis saved with search results
+
+**`all_articles_filtered_returns_error`**:
+- Pre-populate `article_history` with hashes of all candidate URLs
+- Trigger pipeline → everything filtered → verify error result
+
+**`category_overflow_spills_to_autre`**:
+- Set `max_items_per_category=1`, provide multiple articles classified to same category
+- Verify overflow articles land in "Autre"
+
+### End-to-end test (`backend/tests/api_generation_test.rs`)
+
+This is harder because the HTTP handler passes `None` for provider_override. Two approaches:
+
+**(a)** Skip the true end-to-end test for now — the focused pipeline tests cover the critical logic. The existing trigger test (`generate_returns_202_with_job_id`) already verifies the HTTP handler wiring.
+
+**(b)** Add a `mock_provider` field to `AppState` (`Option<Arc<dyn LlmProvider>>`) that `run_generation_inner` checks before creating a real provider. This is more invasive but enables full HTTP-path testing.
+
+**Recommendation**: Start with **(a)** — focused pipeline tests are the highest value. The end-to-end HTTP test can be added later if needed.
+
+---
+
+## 5. Files Summary
+
+- **Create:** `backend/src/services/llm/mock.rs` — MockLlmProvider
+- **Modify:** `backend/src/services/llm/mod.rs` — register mock module
+- **Modify:** `backend/src/services/synthesis.rs` — add `provider_override` parameter, make `run_generation_inner` public
+- **Modify:** `backend/src/handlers/generation.rs` — pass `None` for override
+- **Create:** `backend/tests/pipeline_test.rs` — focused pipeline tests
+- **Modify:** `backend/Cargo.toml` — add wiremock dev-dependency
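
The prompt-keyword dispatch described in section 1 of the spec could be sketched roughly as follows. This is only an illustration under stated assumptions: the spec does not show the real `LlmProvider` trait, so the synchronous `complete(system_prompt, user_prompt)` method here is a hypothetical stand-in (the real trait is presumably async). The sketch also assumes the classify response carries the configured default category, that link extraction reuses the search URLs, and that `"mock summary"` and `"Mock article N"` are acceptable placeholder values.

```rust
/// Hypothetical, simplified stand-in for the project's `LlmProvider` trait.
/// The real trait is not shown in the spec and is presumably async.
pub trait LlmProvider {
    fn complete(&self, system_prompt: &str, user_prompt: &str) -> String;
}

/// Mock provider that returns canned JSON, configured through builder methods.
#[derive(Debug, Clone, Default)]
pub struct MockLlmProvider {
    default_category: String,
    search_urls: Vec<String>,
}

impl MockLlmProvider {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn with_default_category(mut self, category: &str) -> Self {
        self.default_category = category.to_string();
        self
    }

    pub fn with_search_urls(mut self, urls: Vec<&str>) -> Self {
        self.search_urls = urls.into_iter().map(String::from).collect();
        self
    }
}

/// Escape backslashes and double quotes so a value can sit inside a JSON string.
/// Control characters are not handled, which is enough for simple test fixtures.
fn json_escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

impl LlmProvider for MockLlmProvider {
    fn complete(&self, system_prompt: &str, user_prompt: &str) -> String {
        if system_prompt.contains("classer") {
            // Classify call: treat the whole user prompt as the article title
            // (assumption) and answer with the configured default category.
            format!(
                r#"{{"title":"{}","summary":"mock summary","category":"{}"}}"#,
                json_escape(user_prompt),
                json_escape(&self.default_category)
            )
        } else if system_prompt.contains("precis") {
            // Search call: one structured article per configured URL.
            let items: Vec<String> = self
                .search_urls
                .iter()
                .enumerate()
                .map(|(i, url)| {
                    format!(
                        r#"{{"title":"Mock article {}","url":"{}","summary":"mock summary"}}"#,
                        i + 1,
                        json_escape(url)
                    )
                })
                .collect();
            format!(r#"{{"category_0":[{}]}}"#, items.join(","))
        } else if system_prompt.contains("liens") {
            // Link extraction call: reuse the search URLs (assumption).
            let urls: Vec<String> = self
                .search_urls
                .iter()
                .map(|u| format!("\"{}\"", json_escape(u)))
                .collect();
            format!(r#"{{"urls":[{}]}}"#, urls.join(","))
        } else {
            // Unrecognized prompt: return an empty JSON object.
            "{}".to_string()
        }
    }
}
```

In the actual crate, the mock would presumably be wrapped in an `Arc` and passed as `Some(...)` for `provider_override`, while production callers keep passing `None`.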