# Test Coverage Gaps — v2 (2026-03-27)
## Summary
| Tier | Count |
|------|-------|
| Unit tests (`cargo test --lib`) | **358** |
| Integration tests (backend/tests/*.rs) | **175** across 15 files |
| E2E tests (e2e/tests/*.spec.ts) | **7** (1 per spec file, some multi-step) |
### Integration test breakdown by file
| File | Tests |
|------|-------|
| api_admin_test.rs | 30 |
| api_auth_test.rs | 16 |
| api_sources_test.rs | 36 |
| api_syntheses_test.rs | 17 |
| api_keys_test.rs | 18 |
| api_export_test.rs | 13 |
| api_themes_test.rs | 10 |
| api_schedules_test.rs | 9 |
| api_settings_test.rs | 7 |
| pipeline_test.rs | 5 |
| api_article_history_test.rs | 4 |
| api_csrf_test.rs | 4 |
| api_sources_preferred_test.rs | 3 |
| api_health_test.rs | 1 |
| minimal_test.rs | 2 |
---
## Gaps Found
### GAP-01 — `stop_generate` endpoint has zero test coverage
**Priority: High**
`POST /api/v1/syntheses/generate/{job_id}/stop` is implemented in
`backend/src/handlers/generation.rs` and registered in `router.rs`, but there is
no integration test (no call to this route appears anywhere in `backend/tests/`),
and no E2E test exercises the stop/cancel flow.
**What to add:**
- Integration tests in `api_syntheses_test.rs`:
  - `stop_generate_without_auth_returns_401`
  - `stop_generate_unknown_job_returns_404`
  - `stop_generate_owned_job_returns_200` (trigger generation, then immediately stop it)
  - `stop_generate_other_users_job_returns_404`
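
The four test names above imply a small status-code contract. The sketch below models that contract with the standard library only, so the expected outcomes can be read in one place. `JobRegistry`, `StopOutcome`, and the numeric user ids are hypothetical stand-ins for illustration; they are not the types used in `backend/src/handlers/generation.rs`, and the real tests would go through the HTTP route instead.

```rust
use std::collections::HashMap;

/// Outcome of a stop request, mirroring the status codes in the test names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopOutcome {
    Unauthorized,
    NotFound,
    Stopped,
}

impl StopOutcome {
    /// HTTP status each outcome is expected to map to.
    pub fn status(self) -> u16 {
        match self {
            StopOutcome::Unauthorized => 401,
            StopOutcome::NotFound => 404,
            StopOutcome::Stopped => 200,
        }
    }
}

/// Hypothetical in-memory model of job ownership (job id -> owner user id).
#[derive(Default)]
pub struct JobRegistry {
    owners: HashMap<String, u64>,
}

impl JobRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records that `owner` started the job `job_id`.
    pub fn register(&mut self, job_id: &str, owner: u64) {
        self.owners.insert(job_id.to_string(), owner);
    }

    /// Models the expected contract: unauthenticated callers get 401, and a
    /// job owned by someone else is indistinguishable from an unknown job.
    pub fn stop_generate(&self, caller: Option<u64>, job_id: &str) -> StopOutcome {
        let user = match caller {
            Some(user) => user,
            None => return StopOutcome::Unauthorized,
        };
        match self.owners.get(job_id) {
            Some(&owner) if owner == user => StopOutcome::Stopped,
            _ => StopOutcome::NotFound,
        }
    }
}

#[test]
fn stop_generate_other_users_job_returns_404() {
    let mut jobs = JobRegistry::new();
    jobs.register("job-1", 1);
    assert_eq!(jobs.stop_generate(Some(2), "job-1").status(), 404);
}
```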
---
### GAP-02 — `GET /api/v1/llm-logs/{job_id}` has zero integration test coverage
**Priority: Medium**
The handler exists (`backend/src/handlers/llm_logs.rs`) and the route is registered
(`router.rs:71`). It is exercised only by the live E2E test
(`generation-live.spec.ts`), which is gated on `OPENAI_TEST_API_KEY` and therefore
does not run in CI.
**What to add:**
- New `backend/tests/api_llm_logs_test.rs`:
  - `get_llm_logs_without_auth_returns_401`
  - `get_llm_logs_unknown_job_returns_empty_array` (or 404; the contract needs to be clarified first)
  - `get_llm_logs_returns_entries_for_known_job` (requires seeding a job_id in `llm_call_log`)
---
### GAP-03 — `is_preferred` ordering not covered in pipeline tests
**Priority: Medium**
`backend/src/services/synthesis.rs` implements preferred-first URL ordering
(lines 320-422). `api_sources_preferred_test.rs` verifies the CRUD side but
neither `pipeline_test.rs` nor any other test asserts that preferred sources
are actually processed before non-preferred ones during a generation run.
**What to add:**
- Pipeline test: construct sources with mixed `is_preferred` values, run the
pipeline with the mock provider, and assert preferred-source URLs appear
before non-preferred ones in the scrape wave.
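
The ordering assertion described above can be written as a small helper that a pipeline test calls on the recorded scrape wave. This is a minimal sketch: `ScrapedUrl` and `first_ordering_violation` are hypothetical names, and it assumes the test can observe the wave as an ordered list of URLs with their `is_preferred` flag, which the real mock provider may expose differently.

```rust
/// One entry of a scrape wave as a test might record it.
/// Hypothetical type: the pipeline's own representation may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct ScrapedUrl {
    pub url: String,
    pub is_preferred: bool,
}

/// Returns the index of the first preferred URL that appears after a
/// non-preferred one, or `None` when the wave is ordered preferred-first.
pub fn first_ordering_violation(wave: &[ScrapedUrl]) -> Option<usize> {
    let mut seen_non_preferred = false;
    for (index, entry) in wave.iter().enumerate() {
        if !entry.is_preferred {
            seen_non_preferred = true;
        } else if seen_non_preferred {
            return Some(index);
        }
    }
    None
}

#[test]
fn preferred_sources_are_scraped_first() {
    // In a real pipeline test this wave would come from the mock provider.
    let wave = vec![
        ScrapedUrl { url: "https://preferred.example/a".to_string(), is_preferred: true },
        ScrapedUrl { url: "https://other.example/b".to_string(), is_preferred: false },
    ];
    assert_eq!(first_ordering_violation(&wave), None);
}
```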
---
### GAP-04 — `source_diversity_window` feature is unimplemented (plans only)
**Priority: Low / Tracking**
Migration `20260323000013_add_source_diversity_window.sql` exists, but the
corresponding Rust field is absent from `backend/src/models/settings.rs` and
the pipeline does not yet use it. No tests exist because there is nothing to
test yet.
**What to add (when feature is implemented):**
- Settings round-trip test: store and retrieve `source_diversity_window`
- Pipeline test: verify that domains from recent syntheses are injected into
the search prompt when `source_diversity_window > 0`
- Settings boundary test: `source_diversity_window = 0` disables the feature
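
Since the feature does not exist yet, the following is only a sketch of the behaviour the tests above would pin down, under stated assumptions: the window counts the most recent syntheses, history is ordered oldest to newest, and each synthesis is represented by the list of domains it used. `recent_domains` is a hypothetical name, not an existing function in the pipeline.

```rust
/// Collects the distinct domains used by the `window` most recent syntheses.
///
/// `history` is assumed to be ordered oldest to newest. A window of 0
/// yields no domains, which corresponds to the feature being disabled.
pub fn recent_domains(history: &[Vec<String>], window: usize) -> Vec<String> {
    // saturating_sub keeps the start index at 0 when the window is larger
    // than the available history.
    let start = history.len().saturating_sub(window);
    let mut seen: Vec<String> = Vec::new();
    for domains in &history[start..] {
        for domain in domains {
            if !seen.contains(domain) {
                seen.push(domain.clone());
            }
        }
    }
    seen
}

#[test]
fn zero_window_disables_the_feature() {
    let history = vec![vec!["a.example".to_string()]];
    assert!(recent_domains(&history, 0).is_empty());
}
```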
---
### GAP-05 — Settings tests do not cover `max_articles_per_source` boundary enforcement
**Priority: Low**
`api_settings_test.rs` includes `put_settings_boundary_values_succeed` but does
not assert that values outside `[1, 10]` are rejected with 422. The validation
logic exists in `models/settings.rs:53-54`.
**What to add:**
- `put_settings_max_articles_per_source_zero_returns_422`
- `put_settings_max_articles_per_source_eleven_returns_422`
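
The boundary the two tests target can be illustrated with a standalone validator. This is a sketch, not the code in `models/settings.rs`: the function name, the constants, and the `i32` type are assumptions, and only the inclusive `[1, 10]` range comes from this audit. In the integration tests, an `Err` here corresponds to the 422 response.

```rust
/// Inclusive bounds from the `[1, 10]` range stated in this gap.
pub const MIN_ARTICLES_PER_SOURCE: i32 = 1;
pub const MAX_ARTICLES_PER_SOURCE: i32 = 10;

/// Hypothetical stand-in for the settings validation: accepts values in
/// the inclusive range and rejects everything else with a message.
pub fn validate_max_articles_per_source(value: i32) -> Result<i32, String> {
    if (MIN_ARTICLES_PER_SOURCE..=MAX_ARTICLES_PER_SOURCE).contains(&value) {
        Ok(value)
    } else {
        Err(format!(
            "max_articles_per_source must be between {} and {}, got {}",
            MIN_ARTICLES_PER_SOURCE, MAX_ARTICLES_PER_SOURCE, value
        ))
    }
}

#[test]
fn out_of_range_values_are_rejected() {
    // Just outside each edge of the range.
    assert!(validate_max_articles_per_source(0).is_err());
    assert!(validate_max_articles_per_source(11).is_err());
    // The edges themselves are valid.
    assert_eq!(validate_max_articles_per_source(1), Ok(1));
    assert_eq!(validate_max_articles_per_source(10), Ok(10));
}
```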
---
### GAP-06 — E2E tests use `test(` count of 1 per file; multi-scenario coverage is thin
**Priority: Low**
Every E2E spec registers exactly one Playwright `test()` (several use
`test.describe` internally). The `generation-live.spec.ts` test is gated on
`OPENAI_TEST_API_KEY` and does not run in normal CI. The remaining six specs
cover: registration, settings/export, sources, themes (including schedules and
preferred), admin providers. The stop-generation and llm-logs flows have no
E2E counterpart.
**What to add:**
- Stop-generation E2E scenario inside `generation-live.spec.ts` (trigger then
cancel before completion; assert SSE emits a cancelled/error event)
---
## Coverage by Feature
| Feature | Unit | Integration | E2E | Notes |
|---------|------|-------------|-----|-------|
| Authentication (register / login / verify / logout) | — | 16 tests | 1 spec | Full coverage |
| CSRF middleware | — | 4 tests | — | Good |
| Settings CRUD | — | 7 tests | 1 spec | Missing out-of-range rejection tests |
| Sources CRUD + bulk import + CSV | — | 36 tests | 1 spec | Strong |
| Preferred sources (CRUD) | — | 3 tests | 1 spec (shared) | CRUD covered; pipeline ordering not tested |
| Themes CRUD | — | 10 tests | 1 spec | Good |
| Schedules CRUD | — | 9 tests (in api_schedules_test.rs) | 1 spec (shared) | Good |
| API keys (CRUD + encrypt/decrypt) | — | 18 tests | — | Good |
| Admin (providers / rate-limits / users / audit) | — | 30 tests | 1 spec | Good |
| Syntheses (CRUD + generation trigger) | — | 17 tests | 1 spec (live) | Good |
| Stop generation | — | **0 tests** | **0 tests** | **GAP-01** |
| Export (email / PDF / Markdown) | — | 13 tests | 1 spec | Good |
| LLM logs | — | **0 tests** | live only (gated) | **GAP-02** |
| Article history + provenance (CRUD) | — | 4 tests | live only (gated) | Thin; provenance success path missing |
| Pipeline (heuristic / search / overflow / diversity / dedup) | 358 unit | 5 integration | live only (gated) | Preferred ordering not tested (GAP-03) |
| Source diversity via history | — | **0 tests** | **0 tests** | Feature not yet implemented (GAP-04) |
| `max_articles_per_source` validation | — | partial | — | GAP-05 |
| Health check | — | 1 test | — | OK |