docs: add spec for integration tests with mock LLM provider + update algorithm.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

branch: master
parent: 0b71215ddc
commit: f678275f59

@@ -0,0 +1,132 @@
# Design: Integration Tests with Mock LLM Provider

**Date**: 2026-03-26
**Scope**: Add integration tests for the generation pipeline using a mock LLM provider and wiremock for HTTP mocking

---

## Context

The generation pipeline (`run_generation_inner`) has no integration tests. Existing tests only verify HTTP handler responses (202 Accepted) and model resolution, but never exercise the actual pipeline logic (scraping, filtering, classification, category assignment, saving).

The pipeline creates its LLM provider internally via `create_provider()`, making it impossible to inject a mock. This needs a small refactoring to enable dependency injection.

---

## 1. MockLlmProvider

### New file: `backend/src/services/llm/mock.rs`

A `MockLlmProvider` implementing `LlmProvider` that returns canned JSON responses:

- **Classify calls** (detected by system prompt containing "classer"): returns `{title: "...", summary: "...", category: "<configured category>"}` using the article title from the user prompt
- **Search calls** (detected by system prompt containing "precis"): returns `{category_0: [{title, url, summary}]}` with configurable test URLs
- **Link extraction calls** (detected by system prompt containing "liens"): returns `{urls: [...]}` with configurable URLs

The mock is configurable via constructor:
```rust
MockLlmProvider::new()
    .with_default_category("Test Category")
    .with_search_urls(vec!["http://mock-server/article-1", ...])
```

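The prompt-keyword dispatch and builder described above can be sketched as follows. This is a minimal illustration and not the actual `mock.rs`: the `LlmProvider` trait here is a hypothetical synchronous stand-in (the real trait is not shown in this spec and is probably async), and the `complete` method, the `esc` helper, the placeholder titles and summaries, and the reuse of the search URLs for link extraction are all assumptions.

```rust
/// Hypothetical, simplified stand-in for the real `LlmProvider` trait,
/// whose actual signature is not shown in this spec.
pub trait LlmProvider: Send + Sync {
    fn complete(&self, system_prompt: &str, user_prompt: &str) -> String;
}

#[derive(Debug, Clone, Default)]
pub struct MockLlmProvider {
    default_category: String,
    search_urls: Vec<String>,
}

impl MockLlmProvider {
    pub fn new() -> Self {
        Self::default()
    }

    /// Category returned for every classify call.
    pub fn with_default_category(mut self, category: &str) -> Self {
        self.default_category = category.to_string();
        self
    }

    /// URLs returned by search calls (and, in this sketch, by link extraction).
    pub fn with_search_urls(mut self, urls: Vec<&str>) -> Self {
        self.search_urls = urls.into_iter().map(String::from).collect();
        self
    }
}

/// Minimal JSON string escaping (backslash and double quote only).
fn esc(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

impl LlmProvider for MockLlmProvider {
    fn complete(&self, system_prompt: &str, user_prompt: &str) -> String {
        if system_prompt.contains("classer") {
            // Classify call. Assumption: the whole user prompt stands in for
            // the article title; the real mock would extract it.
            format!(
                r#"{{"title": "{}", "summary": "mock summary", "category": "{}"}}"#,
                esc(user_prompt),
                esc(&self.default_category)
            )
        } else if system_prompt.contains("precis") {
            // Search call: one canned article per configured URL.
            let items: Vec<String> = self
                .search_urls
                .iter()
                .enumerate()
                .map(|(i, url)| {
                    format!(
                        r#"{{"title": "Mock article {}", "url": "{}", "summary": "mock summary"}}"#,
                        i + 1,
                        esc(url)
                    )
                })
                .collect();
            format!(r#"{{"category_0": [{}]}}"#, items.join(", "))
        } else if system_prompt.contains("liens") {
            // Link extraction call: return the configured URLs as a JSON array.
            let urls: Vec<String> = self
                .search_urls
                .iter()
                .map(|u| format!("\"{}\"", esc(u)))
                .collect();
            format!(r#"{{"urls": [{}]}}"#, urls.join(", "))
        } else {
            // Unknown prompt: empty JSON object.
            "{}".to_string()
        }
    }
}
```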
The module is registered in `llm/mod.rs` as `pub mod mock;`. It is always compiled rather than gated behind `#[cfg(test)]`, because integration tests build as a separate crate and cannot see `#[cfg(test)]` items.

---

## 2. Dependency Injection Refactoring

### Modify: `backend/src/services/synthesis.rs`

The current signatures are:
- `pub async fn run_generation(job_id: Uuid, state: AppState, user_id: Uuid, tx: Arc<watch::Sender<ProgressEvent>>)`
- `async fn run_generation_inner(job_id: Uuid, state: &AppState, user_id: Uuid, tx: &watch::Sender<ProgressEvent>) -> Result<Uuid, AppError>`

Changes:
- Add `provider_override: Option<Arc<dyn LlmProvider>>` as the last parameter to both functions
- Make `run_generation_inner` public (`pub async fn run_generation_inner(...)`) so that focused pipeline tests can call it directly from the integration test crate
- Inside `run_generation_inner`, when `provider_override` is `Some(provider)`:
  - Use the provider directly (skip `resolve_provider_and_key` + `create_provider`)
  - Use `"mock"` as `provider_name` (for rate limiter key)
  - For model resolution: use settings `ai_model` / `ai_model_websearch` directly if non-empty, otherwise fall back to hardcoded defaults like `"mock-model"`. Skip `resolve_model()`, which queries `admin_providers`.
- When `None`: current behavior unchanged (production path)

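The branching on `provider_override` listed above could look roughly like this. It is a synchronous sketch under stated assumptions, not the real `run_generation_inner`: the `Settings` and `ResolvedProvider` structs, the `resolve_provider` and `non_empty_or` functions, the `name` method on the trait, and the `production_path` closure (standing in for `resolve_provider_and_key`, `create_provider` and `resolve_model`) are all hypothetical, and errors are plain strings instead of `AppError`.

```rust
use std::sync::Arc;

/// Hypothetical, simplified stand-in for the real `LlmProvider` trait.
pub trait LlmProvider: Send + Sync {
    fn name(&self) -> &str;
}

/// Hypothetical subset of the user settings the pipeline reads.
pub struct Settings {
    pub ai_model: String,
    pub ai_model_websearch: String,
}

/// Everything the pipeline needs once a provider has been chosen.
pub struct ResolvedProvider {
    pub provider: Arc<dyn LlmProvider>,
    pub provider_name: String,
    pub model: String,
    pub model_websearch: String,
}

/// Returns `value` unless it is empty or whitespace, else `fallback`.
fn non_empty_or(value: &str, fallback: &str) -> String {
    if value.trim().is_empty() {
        fallback.to_string()
    } else {
        value.to_string()
    }
}

/// Chooses between the injected provider (test path) and the production path.
pub fn resolve_provider(
    provider_override: Option<Arc<dyn LlmProvider>>,
    settings: &Settings,
    production_path: impl FnOnce() -> Result<ResolvedProvider, String>,
) -> Result<ResolvedProvider, String> {
    match provider_override {
        // Test path: use the injected provider, a fixed rate limiter key,
        // and models taken from settings with a hardcoded fallback.
        Some(provider) => Ok(ResolvedProvider {
            provider,
            provider_name: "mock".to_string(),
            model: non_empty_or(&settings.ai_model, "mock-model"),
            model_websearch: non_empty_or(&settings.ai_model_websearch, "mock-model"),
        }),
        // Production path: behavior unchanged.
        None => production_path(),
    }
}
```

Keeping the production path behind the `None` arm means callers that pass `None` see no behavioral change, which is the property the handler change relies on.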
### Modify: `backend/src/handlers/generation.rs`

Pass `None` as `provider_override` in the `tokio::spawn` call. Production behavior unchanged.

---

## 3. Wiremock for HTTP Mocking

### Modify: `backend/Cargo.toml`

Add `wiremock` as dev-dependency:
```toml
[dev-dependencies]
wiremock = "0.6"
```

Integration tests use `wiremock::MockServer` to serve fake HTML pages:
- A **source page** with `<a href>` links to article URLs on the same mock server
- **Article pages** with `<title>` and `<body>` text content

Source URLs and article URLs point to `http://127.0.0.1:{port}/...`.

Note: The SSRF check rejects `127.0.0.1` (a loopback address) as a private IP. Either the wiremock server must listen on a non-private address, or the SSRF check must be bypassed for tests. Options:
- Use the scraper HTTP client, which already has a redirect policy but doesn't check the initial request IP for `source_scraper` (it resolves the hostname). However, if the hostname is `127.0.0.1`, `check_ssrf` will still reject it
- **Simplest approach**: for focused pipeline tests, use the mock LLM provider to return article URLs directly (via search results or link extraction), bypassing source page scraping entirely. For scraping tests, call `scrape_single_article` directly with the wiremock URL (this function doesn't do SSRF checks; those are in `source_scraper`)

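To make the SSRF constraint concrete, here is a sketch of the kind of address test that would reject a wiremock server bound to `127.0.0.1`. The function `is_blocked_ip` is a hypothetical stand-in; the real rules in `check_ssrf` are not shown in this spec and may differ.

```rust
use std::net::IpAddr;

/// Hypothetical stand-in for the address test inside `check_ssrf`.
/// Returns true for addresses that should never be fetched on behalf of a user.
pub fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            // Loopback (127.0.0.0/8), RFC 1918 private ranges,
            // link-local (169.254.0.0/16) and 0.0.0.0.
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        // IPv6 loopback (::1) and unspecified (::).
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}
```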
---

## 4. Test Scenarios

### Focused pipeline tests (`backend/tests/pipeline_test.rs`)

These call `run_generation_inner` directly with a mock provider. They need a real Postgres DB (for the `settings`, `article_history`, and `syntheses` tables).

**Test setup helper**:
- Create test DB (reuse existing `TestApp` infrastructure from `tests/common/mod.rs`)
- Create user with settings (categories, max_items, etc.)
- Create a `watch::channel` for progress events
- Build a `MockLlmProvider` with desired configuration

**`phase1_classifies_scraped_articles`**:
- Add sources pointing to wiremock article pages
- Mock LLM classify returns articles in the configured category
- Call `run_generation_inner` with mock provider
- Verify synthesis saved with correct sections and articles

**`phase2_search_fills_gaps`**:
- No sources configured → Phase 1 produces nothing
- Mock LLM search returns structured articles
- Verify synthesis saved with search results

**`all_articles_filtered_returns_error`**:
- Pre-populate `article_history` with hashes of all candidate URLs
- Trigger pipeline → everything filtered → verify error result

**`category_overflow_spills_to_autre`**:
- Set `max_items_per_category=1`, provide multiple articles classified to same category
- Verify overflow articles land in "Autre"

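The behavior that the last two scenarios assert can be illustrated with a small in-memory model. This is a sketch of the expected outcomes only, not the pipeline's code: `Article`, `filter_seen` and `assign_categories` are hypothetical names, the history is modeled as a set of plain URLs instead of the hashes stored in `article_history`, and errors are plain strings.

```rust
use std::collections::{BTreeMap, HashSet};

#[derive(Debug, Clone, PartialEq)]
pub struct Article {
    pub url: String,
    pub category: String,
}

/// Drops candidates already present in the history.
/// Returns an error when nothing is left, mirroring the "everything filtered" case.
pub fn filter_seen(
    candidates: Vec<Article>,
    seen_urls: &HashSet<String>,
) -> Result<Vec<Article>, String> {
    let fresh: Vec<Article> = candidates
        .into_iter()
        .filter(|a| !seen_urls.contains(&a.url))
        .collect();
    if fresh.is_empty() {
        Err("all articles filtered".to_string())
    } else {
        Ok(fresh)
    }
}

/// Keeps at most `max_items_per_category` articles per category
/// and places the overflow in the "Autre" section.
pub fn assign_categories(
    articles: Vec<Article>,
    max_items_per_category: usize,
) -> BTreeMap<String, Vec<Article>> {
    let mut sections: BTreeMap<String, Vec<Article>> = BTreeMap::new();
    for article in articles {
        let count = sections.get(&article.category).map_or(0, |v| v.len());
        let target = if count < max_items_per_category {
            article.category.clone()
        } else {
            "Autre".to_string()
        };
        sections.entry(target).or_default().push(article);
    }
    sections
}
```

With `max_items_per_category` set to 1 and three articles classified into the same category, this model keeps one article in that category and moves two to "Autre", which is the shape of assertion `category_overflow_spills_to_autre` would make against the saved synthesis.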
### End-to-end test (`backend/tests/api_generation_test.rs`)

This is harder because the HTTP handler passes `None` for `provider_override`. Two approaches:

**(a)** Skip the true end-to-end test for now. The focused pipeline tests cover the critical logic, and the existing trigger test (`generate_returns_202_with_job_id`) already verifies the HTTP handler wiring.

**(b)** Add a `mock_provider` field to `AppState` (`Option<Arc<dyn LlmProvider>>`) that `run_generation_inner` checks before creating a real provider. This is more invasive but enables full HTTP-path testing.

**Recommendation**: Start with **(a)**, since focused pipeline tests are the highest value. The end-to-end HTTP test can be added later if needed.

---

## 5. Files Summary

- **Create:** `backend/src/services/llm/mock.rs`: MockLlmProvider
- **Modify:** `backend/src/services/llm/mod.rs`: register mock module
- **Modify:** `backend/src/services/synthesis.rs`: add `provider_override` parameter, make `run_generation_inner` public
- **Modify:** `backend/src/handlers/generation.rs`: pass `None` for override
- **Create:** `backend/tests/pipeline_test.rs`: focused pipeline tests
- **Modify:** `backend/Cargo.toml`: add wiremock dev-dependency