docs: add spec for integration tests with mock LLM provider + update algorithm.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
master
oabrivard 3 months ago
parent 0b71215ddc
commit f678275f59

@ -0,0 +1,132 @@
# Design: Integration Tests with Mock LLM Provider
**Date**: 2026-03-26
**Scope**: Add integration tests for the generation pipeline using a mock LLM provider and wiremock for HTTP mocking
---
## Context
The generation pipeline (`run_generation_inner`) has no integration tests. Existing tests only verify HTTP handler responses (202 Accepted) and model resolution, but never exercise the actual pipeline logic (scraping, filtering, classification, category assignment, saving).
The pipeline creates its LLM provider internally via `create_provider()`, making it impossible to inject a mock. This needs a small refactoring to enable dependency injection.
---
## 1. MockLlmProvider
### New file: `backend/src/services/llm/mock.rs`
A `MockLlmProvider` implementing `LlmProvider` that returns canned JSON responses:
- **Classify calls** (detected by system prompt containing "classer"): returns `{title: "...", summary: "...", category: "<configured category>"}` using the article title from the user prompt
- **Search calls** (detected by system prompt containing "precis"): returns `{category_0: [{title, url, summary}]}` with configurable test URLs
- **Link extraction calls** (detected by system prompt containing "liens"): returns `{urls: [...]}` with configurable URLs
The mock is configurable via constructor:
```rust
MockLlmProvider::new()
.with_default_category("Test Category")
.with_search_urls(vec!["http://mock-server/article-1", ...])
```
Registered in `llm/mod.rs` as `pub mod mock;` (always available, not `#[cfg(test)]`, since integration tests are in a separate crate).
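
The prompt-based dispatch described above could look roughly like the following sketch. It makes several assumptions that are not in the spec: the `LlmProvider` trait shown here is a hypothetical, synchronous stand-in with a single `complete` method (the real trait is presumably async and may differ), the whole user prompt is used as the article title, the same URL list feeds both search and link-extraction responses, and JSON is built with `format!` without escaping, which is only acceptable for fixed test fixtures.

```rust
/// Hypothetical, synchronous stand-in for the project's `LlmProvider` trait.
/// The real trait is presumably async and may have a different signature.
pub trait LlmProvider {
    fn complete(&self, system_prompt: &str, user_prompt: &str) -> String;
}

pub struct MockLlmProvider {
    default_category: String,
    search_urls: Vec<String>,
}

impl MockLlmProvider {
    pub fn new() -> Self {
        Self {
            // Arbitrary default chosen for this sketch only.
            default_category: String::from("Autre"),
            search_urls: Vec::new(),
        }
    }

    pub fn with_default_category(mut self, category: &str) -> Self {
        self.default_category = category.to_string();
        self
    }

    pub fn with_search_urls(mut self, urls: Vec<&str>) -> Self {
        self.search_urls = urls.into_iter().map(String::from).collect();
        self
    }
}

impl LlmProvider for MockLlmProvider {
    fn complete(&self, system_prompt: &str, user_prompt: &str) -> String {
        if system_prompt.contains("classer") {
            // Classify call: echo the user prompt as the title (simplification).
            format!(
                r#"{{"title": "{}", "summary": "mock summary", "category": "{}"}}"#,
                user_prompt, self.default_category
            )
        } else if system_prompt.contains("precis") {
            // Search call: one canned article per configured URL.
            let items: Vec<String> = self
                .search_urls
                .iter()
                .enumerate()
                .map(|(i, url)| {
                    format!(
                        r#"{{"title": "Mock article {}", "url": "{}", "summary": "mock summary"}}"#,
                        i, url
                    )
                })
                .collect();
            format!(r#"{{"category_0": [{}]}}"#, items.join(", "))
        } else if system_prompt.contains("liens") {
            // Link extraction call: return the configured URLs as-is.
            let urls: Vec<String> = self
                .search_urls
                .iter()
                .map(|url| format!("\"{}\"", url))
                .collect();
            format!(r#"{{"urls": [{}]}}"#, urls.join(", "))
        } else {
            // Unknown call type: empty JSON object.
            String::from("{}")
        }
    }
}
```

Matching on prompt keywords keeps the mock independent of pipeline internals, at the cost of breaking silently if the prompts are reworded; a test asserting that each call type is recognised would catch that.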
---
## 2. Dependency Injection Refactoring
### Modify: `backend/src/services/synthesis.rs`
The current signatures are:
- `pub async fn run_generation(job_id: Uuid, state: AppState, user_id: Uuid, tx: Arc<watch::Sender<ProgressEvent>>)`
- `async fn run_generation_inner(job_id: Uuid, state: &AppState, user_id: Uuid, tx: &watch::Sender<ProgressEvent>) -> Result<Uuid, AppError>`
Changes:
- Add `provider_override: Option<Arc<dyn LlmProvider>>` as the last parameter to both functions
- Make `run_generation_inner` public: `pub async fn run_generation_inner(...)` — needed so focused pipeline tests can call it directly from the integration test crate
- Inside `run_generation_inner`, when `provider_override` is `Some(provider)`:
  - Use the provider directly (skip `resolve_provider_and_key` + `create_provider`)
  - Use `"mock"` as `provider_name` (for rate limiter key)
  - For model resolution: use settings `ai_model` / `ai_model_websearch` directly if non-empty, otherwise fall back to hardcoded defaults like `"mock-model"`. Skip `resolve_model()` which queries `admin_providers`.
- When `None`: current behavior unchanged (production path)
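
As a rough illustration of the branching above, the sketch below resolves the provider, provider name, and models from an optional override. Everything except the `provider_override` type, the `"mock"` name, and the `"mock-model"` fallback is an assumption: `Settings`, `ResolvedLlm`, `resolve_llm`, and the simplified `LlmProvider` trait are hypothetical, and the production path (`resolve_provider_and_key`, `create_provider`, `resolve_model`) is represented by a closure so the example stays self-contained.

```rust
use std::sync::Arc;

/// Hypothetical, simplified stand-in for the project's `LlmProvider` trait.
pub trait LlmProvider: Send + Sync {
    fn name(&self) -> &str;
}

/// Trivial provider used only to exercise the sketch.
pub struct MockProvider;

impl LlmProvider for MockProvider {
    fn name(&self) -> &str {
        "mock"
    }
}

/// Hypothetical subset of the user settings read by the pipeline.
pub struct Settings {
    pub ai_model: String,
    pub ai_model_websearch: String,
}

/// Hypothetical bundle of everything the pipeline needs to talk to an LLM.
pub struct ResolvedLlm {
    pub provider: Arc<dyn LlmProvider>,
    pub provider_name: String,
    pub model: String,
    pub websearch_model: String,
}

fn non_empty_or(value: &str, fallback: &str) -> String {
    if value.trim().is_empty() {
        fallback.to_string()
    } else {
        value.to_string()
    }
}

/// `production_path` stands in for the existing resolution logic; it is
/// only invoked when no override is supplied.
pub fn resolve_llm(
    settings: &Settings,
    provider_override: Option<Arc<dyn LlmProvider>>,
    production_path: impl FnOnce() -> ResolvedLlm,
) -> ResolvedLlm {
    match provider_override {
        Some(provider) => ResolvedLlm {
            provider,
            // Fixed name, used as the rate limiter key.
            provider_name: "mock".to_string(),
            // Settings win if set; no lookup in `admin_providers`.
            model: non_empty_or(&settings.ai_model, "mock-model"),
            websearch_model: non_empty_or(&settings.ai_model_websearch, "mock-model"),
        },
        None => production_path(),
    }
}
```

Keeping the override branch free of database lookups means a test only has to seed the tables the pipeline itself reads, not the provider configuration.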
### Modify: `backend/src/handlers/generation.rs`
Pass `None` as `provider_override` in the `tokio::spawn` call. Production behavior unchanged.
---
## 3. Wiremock for HTTP Mocking
### Modify: `backend/Cargo.toml`
Add `wiremock` as dev-dependency:
```toml
[dev-dependencies]
wiremock = "0.6"
```
Integration tests use `wiremock::MockServer` to serve fake HTML pages:
- A **source page** with `<a href>` links to article URLs on the same mock server
- **Article pages** with `<title>`, `<body>` text content
Source URLs and article URLs point to `http://127.0.0.1:{port}/...`.
Note: The SSRF check rejects `127.0.0.1` as a non-public address, so source URLs served by the wiremock server cannot pass through it. Either the mock server must be reachable at a non-private address, or the tests must avoid the SSRF-checked code path. Options:
- Use the scraper HTTP client as-is. Its redirect policy does not check the IP of the initial request, but `source_scraper` resolves the hostname and calls `check_ssrf`, which rejects `127.0.0.1`, so this does not work for source pages.
- **Simplest approach**: for focused pipeline tests, use the mock LLM provider to return article URLs directly (via search results or link extraction), bypassing source page scraping entirely. For scraping tests, call `scrape_single_article` directly with the wiremock URL (this function doesn't do SSRF checks; those are in `source_scraper`).
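
To make the constraint concrete, here is a minimal sketch of the kind of address check that would reject the mock server. It is not the project's `check_ssrf`; the function name `is_blocked_for_ssrf` and the exact set of rejected ranges are assumptions, using only classification methods from the Rust standard library.

```rust
use std::net::IpAddr;

/// Hypothetical approximation of an SSRF guard: reject loopback, private,
/// link-local, and unspecified addresses. The real `check_ssrf` may cover
/// more or fewer ranges.
pub fn is_blocked_for_ssrf(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}

/// Returns `None` when `host` is not a literal IP address (a real check
/// would resolve the hostname first).
pub fn host_is_blocked(host: &str) -> Option<bool> {
    host.parse::<IpAddr>().ok().map(is_blocked_for_ssrf)
}
```

Under these assumptions any `http://127.0.0.1:{port}/...` URL is rejected before a request is made, which is why the focused tests feed article URLs through the mock provider instead of through source scraping.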
---
## 4. Test Scenarios
### Focused pipeline tests (`backend/tests/pipeline_test.rs`)
These call `run_generation_inner` directly with a mock provider. They need a real Postgres DB (for settings, article_history, syntheses tables).
**Test setup helper**:
- Create test DB (reuse existing `TestApp` infrastructure from `tests/common/mod.rs`)
- Create user with settings (categories, max_items, etc.)
- Create a `watch::channel` for progress events
- Build a `MockLlmProvider` with desired configuration
**`phase1_classifies_scraped_articles`**:
- Add sources pointing to wiremock article pages
- Mock LLM classify returns articles in the configured category
- Call `run_generation_inner` with mock provider
- Verify synthesis saved with correct sections and articles
**`phase2_search_fills_gaps`**:
- No sources configured → Phase 1 produces nothing
- Mock LLM search returns structured articles
- Verify synthesis saved with search results
**`all_articles_filtered_returns_error`**:
- Pre-populate `article_history` with hashes of all candidate URLs
- Trigger pipeline → everything filtered → verify error result
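
The behavior this test targets can be sketched as a history filter that fails when nothing survives. This is an illustration only: the spec does not say how URLs are hashed or which error variant is returned, so `url_hash` (built on the standard library's `DefaultHasher`), `filter_unseen`, and the `String` error are all hypothetical stand-ins for the real hashing scheme and `AppError`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical URL hash; the real pipeline may use a different algorithm.
pub fn url_hash(url: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    url.hash(&mut hasher);
    hasher.finish()
}

/// Drops candidates whose hash is already in `history`. Returns an error
/// when no candidate remains (including when `candidates` is empty).
pub fn filter_unseen<'a>(
    candidates: &[&'a str],
    history: &HashSet<u64>,
) -> Result<Vec<&'a str>, String> {
    let fresh: Vec<&'a str> = candidates
        .iter()
        .copied()
        .filter(|url| !history.contains(&url_hash(url)))
        .collect();
    if fresh.is_empty() {
        Err("all candidate articles were already seen".to_string())
    } else {
        Ok(fresh)
    }
}
```

In the actual test, the history rows would be inserted into `article_history` using whatever hashing helper the pipeline already exposes, so that the test and the pipeline cannot drift apart.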
**`category_overflow_spills_to_autre`**:
- Set `max_items_per_category=1`, provide multiple articles classified to same category
- Verify overflow articles land in "Autre"
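
The overflow rule being verified could be sketched as follows. The function name `assign_with_overflow`, the `(title, category)` tuple input, and the choice to leave "Autre" itself uncapped are assumptions made for illustration; the real category assignment may order or cap articles differently.

```rust
use std::collections::BTreeMap;

/// Name of the catch-all section, as used in the spec.
pub const OVERFLOW_CATEGORY: &str = "Autre";

/// Assigns each `(title, category)` pair to its category until that category
/// holds `max_items_per_category` articles; later articles spill into the
/// overflow category. The overflow category is not capped in this sketch.
pub fn assign_with_overflow(
    articles: &[(&str, &str)],
    max_items_per_category: usize,
) -> BTreeMap<String, Vec<String>> {
    let mut sections: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for &(title, category) in articles {
        let current_len = sections.get(category).map_or(0, |items| items.len());
        let is_full = current_len >= max_items_per_category;
        let target = if is_full && category != OVERFLOW_CATEGORY {
            OVERFLOW_CATEGORY
        } else {
            category
        };
        sections
            .entry(target.to_string())
            .or_default()
            .push(title.to_string());
    }
    sections
}
```

With `max_items_per_category=1` and three articles classified to the same category, the first stays in place and the other two end up under "Autre", which is the shape the test asserts on the saved synthesis.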
### End-to-end test (`backend/tests/api_generation_test.rs`)
This is harder because the HTTP handler passes `None` for provider_override. Two approaches:
**(a)** Skip the true end-to-end test for now — the focused pipeline tests cover the critical logic. The existing trigger test (`generate_returns_202_with_job_id`) already verifies the HTTP handler wiring.
**(b)** Add a `mock_provider` field to `AppState` (`Option<Arc<dyn LlmProvider>>`) that `run_generation_inner` checks before creating a real provider. This is more invasive but enables full HTTP-path testing.
**Recommendation**: Start with **(a)** — focused pipeline tests are the highest value. The end-to-end HTTP test can be added later if needed.
---
## 5. Files Summary
- **Create:** `backend/src/services/llm/mock.rs` — MockLlmProvider
- **Modify:** `backend/src/services/llm/mod.rs` — register mock module
- **Modify:** `backend/src/services/synthesis.rs` — add `provider_override` parameter, make `run_generation_inner` public
- **Modify:** `backend/src/handlers/generation.rs` — pass `None` for override
- **Create:** `backend/tests/pipeline_test.rs` — focused pipeline tests
- **Modify:** `backend/Cargo.toml` — add wiremock dev-dependency