QA Guidelines

Test Inventory

| Type | Count | Status | Location |
| --- | --- | --- | --- |
| Backend unit tests | 358 | All passing | backend/src/**/*.rs (inline #[cfg(test)]) |
| Backend integration tests | 183 | All passing | backend/tests/*.rs |
| Frontend unit tests | 141 | 131 passing, 10 failing | frontend/src/**/*.test.{ts,tsx} |
| E2E tests (Playwright) | 7 | All passing | e2e/tests/*.spec.ts |
| Total | 689 | | |

Release Gate Policy

  • Releases are blocked unless critical flows have deterministic CI coverage.
  • Mandatory deterministic CI coverage includes:
    • Scheduler execution path (due schedule selection, run/skip behavior, last_run_at handling, email side effects).
    • SSE generation progress contract.
  • Tests requiring external providers (for example generation-live.spec.ts) are non-blocking supplemental checks and must not be the only coverage for critical flows.
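
As a rough illustration of what deterministic coverage of the scheduler path could look like, the sketch below models due-schedule selection, run/skip behavior, and last_run_at handling as pure functions with an injected clock. The Schedule struct, its interval_secs field, and the is_due and run_due functions are hypothetical names invented for this example; the real scheduler.rs may be structured quite differently, and email side effects are left out.

```rust
/// Hypothetical schedule record. Only `last_run_at` is named in this document;
/// the other fields are assumptions made for the sketch.
#[derive(Debug, Clone, PartialEq)]
struct Schedule {
    id: u32,
    interval_secs: u64,
    /// Unix seconds of the last run; `None` means the schedule has never run.
    last_run_at: Option<u64>,
}

/// Returns true when the schedule should run at `now` (Unix seconds).
fn is_due(schedule: &Schedule, now: u64) -> bool {
    match schedule.last_run_at {
        None => true,
        Some(last) => now.saturating_sub(last) >= schedule.interval_secs,
    }
}

/// Runs every due schedule, stamps `last_run_at`, and returns the ids that ran.
/// Schedules that are not due are skipped and left untouched.
fn run_due(schedules: &mut [Schedule], now: u64) -> Vec<u32> {
    let mut ran = Vec::new();
    for schedule in schedules.iter_mut() {
        if is_due(schedule, now) {
            schedule.last_run_at = Some(now);
            ran.push(schedule.id);
        }
    }
    ran
}
```

Passing `now` as a parameter rather than reading the system clock is what keeps a test like this deterministic: the same inputs always select the same schedules.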

Backend Unit Test Breakdown

| Source file | Tests | Coverage area |
| --- | --- | --- |
| services/scraper.rs | 74 | SSRF IP checks, soft-404, redirect, HTML parsing |
| services/synthesis.rs | 36 | Pipeline logic, schema building, category overflow |
| services/llm/anthropic.rs | 20 | Response parsing, error handling |
| services/prompts.rs | 18 | Prompt template generation |
| services/csv.rs | 18 | CSV parsing, serialization |
| models/synthesis.rs | 16 | Model validation, serialization |
| services/rate_limiter.rs | 15 | Token bucket, concurrency |
| services/llm/openai.rs | 13 | Response parsing, error handling |
| models/source.rs | 12 | URL / title validation |
| models/settings.rs | 12 | Settings validation, defaults |
| services/export.rs | 12 | Markdown / PDF rendering |
| services/llm/gemini.rs | 10 | Response parsing, error handling |
| models/provider.rs | 10 | Provider / model validation |
| services/email.rs | 9 | Email rendering, bypass mode |
| services/encryption.rs | 8 | AES-256-GCM encrypt/decrypt |
| services/source_scraper.rs | 8 | Link extraction, is_article filter |
| services/llm/schema.rs | 8 | JSON schema generation |
| util/token.rs | 8 | Token generation, hashing |
| models/api_key.rs | 8 | API key validation |
| middleware/csrf.rs | 7 | CSRF header check |
| models/rate_limit.rs | 6 | Rate limit model validation |
| config.rs | 6 | Config parsing |
| middleware/auth.rs | 5 | Session extraction |
| services/llm/factory.rs | 5 | Provider factory |
| handlers/admin.rs | 4 | Admin handler validation |

Backend Integration Test Breakdown

| File | Tests | Coverage area |
| --- | --- | --- |
| api_sources_test.rs | 36 | Sources CRUD, validation, CSV, bulk import, max limit |
| api_admin_test.rs | 30 | Provider CRUD, rate limits, user management, audit log |
| api_keys_test.rs | 18 | API key CRUD, encryption, ownership, test endpoint |
| api_syntheses_test.rs | 17 | Synthesis CRUD, pagination, ownership, generation trigger |
| api_auth_test.rs | 16 | Register, login, verify, logout, session |
| api_export_test.rs | 13 | Email send, Markdown export, PDF export |
| api_themes_test.rs | 10 | Theme CRUD, validation, ownership |
| api_schedules_test.rs | 9 | Schedule CRUD, validation, ownership |
| api_settings_test.rs | 7 | Settings CRUD, defaults, boundary values |
| pipeline_test.rs | 6 | Phase 1 extraction, Phase 2 search, overflow, diversity, dedup, preferred |
| api_article_history_test.rs | 4 | History list, clear, provenance |
| api_csrf_test.rs | 4 | CSRF header enforcement |
| api_stop_generation_test.rs | 4 | Stop job, ownership, 404 |
| api_llm_logs_test.rs | 3 | LLM logs auth, 404, happy path |
| api_sources_preferred_test.rs | 3 | Preferred sources set/clear/auth |
| minimal_test.rs | 2 | Infrastructure sanity |
| api_health_test.rs | 1 | Health check |

E2E Test Breakdown

| File | Coverage area |
| --- | --- |
| registration.spec.ts | Full magic link registration flow |
| settings.spec.ts | Settings persistence across reloads |
| settings-export.spec.ts | Settings export/import roundtrip |
| sources.spec.ts | Source CRUD + preferred sources via API |
| themes.spec.ts | Theme CRUD + schedule CRUD via API |
| admin-providers.spec.ts | Admin provider management, settings dropdown |
| generation-live.spec.ts | Full pipeline with real OpenAI key (gated on OPENAI_TEST_API_KEY) |

Running Tests

Backend Unit Tests

No database required:

cd backend && cargo test --lib

Backend Integration Tests

Requires a running Postgres instance. Use the helper script:

./scripts/run-integration-tests.sh                          # all tests
./scripts/run-integration-tests.sh --test pipeline_test      # one test file
./scripts/run-integration-tests.sh --test api_admin_test config_providers  # one test by name
./scripts/run-integration-tests.sh --lib                     # unit tests only
./scripts/run-integration-tests.sh --db-check                # just check DB connectivity

The script automatically:

  • Starts the test Postgres container on port 5433 (via e2e/docker-compose.test.yml)
  • Sets TEST_DATABASE_URL and SKIP_SSRF_CHECK=1
  • Runs cargo test with the specified arguments

Manual equivalent:

cd e2e && docker compose -f docker-compose.test.yml up -d db
cd ../backend
export TEST_DATABASE_URL=postgres://ai_synth_test:testpassword@127.0.0.1:5433/ai_synth_test
export SKIP_SSRF_CHECK=1
cargo test

Frontend Unit Tests

cd frontend && npx vitest run

Type checking (no tests, just compiler verification):

cd frontend && npx tsc --noEmit

E2E Tests (Playwright)

Use the helper script, which builds the Docker image, starts the full stack, seeds the database, and runs Playwright:

./scripts/run-e2e-tests.sh                     # all E2E tests
./scripts/run-e2e-tests.sh --headed            # with browser visible
./scripts/run-e2e-tests.sh generation-live      # specific test file

The script:

  1. Builds the test Docker image (docker compose -f docker-compose.test.yml build)
  2. Starts the full stack (app + Postgres)
  3. Waits for the app health check to pass
  4. Installs npm dependencies and Playwright browsers
  5. Seeds the test database (npx tsx seed.ts)
  6. Runs Playwright tests
  7. Cleans up on exit (stops containers, removes volumes)

The generation-live.spec.ts test requires OPENAI_TEST_API_KEY to be set (in e2e/.env.test or environment). It is a supplemental non-blocking check and does not replace deterministic CI coverage.


Test Infrastructure

TestApp (Backend Integration Tests)

backend/tests/common/mod.rs provides the TestApp struct, which is the foundation for all integration tests.

What it does:

  • Creates a unique temporary Postgres database per test (named ai_synth_test_{uuid})
  • Runs all migrations
  • Builds the full Axum router with test configuration (bypassed Turnstile and Resend)
  • Provides request helpers: get, post, get_with_session, post_with_session, put_with_session, delete_with_session, raw_request_text, raw_request_bytes
  • Provides auth helpers: create_test_user, create_authenticated_user, create_admin_user, register_user_via_api, create_magic_link_for_email
  • Provides insert_test_synthesis for creating test data without running the pipeline
  • Handles cleanup via Drop (fire-and-forget) or explicit cleanup().await

Request helpers automatically:

  • Set Content-Type: application/json for requests with a body
  • Set X-Requested-With: XMLHttpRequest (CSRF header) for mutating methods (POST, PUT, DELETE, PATCH)
  • Set the session cookie when session_cookie is provided
  • Parse the response body as JSON (or return {} for empty bodies)
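
The header rules above can be summarized in a few lines. This is only a sketch of the decision logic, not the actual TestApp code: the default_headers function and the SESSION_COOKIE_NAME constant are hypothetical, and the real cookie name is not stated in this document.

```rust
/// Placeholder cookie name (assumption); the real name lives in the backend.
const SESSION_COOKIE_NAME: &str = "session";

/// Returns the headers a request helper would attach by default.
fn default_headers(
    method: &str,
    has_body: bool,
    session_cookie: Option<&str>,
) -> Vec<(String, String)> {
    let mut headers = Vec::new();

    // JSON content type only when a body is sent.
    if has_body {
        headers.push(("Content-Type".to_string(), "application/json".to_string()));
    }

    // The CSRF header is attached to mutating methods only.
    let mutating = matches!(
        method.to_ascii_uppercase().as_str(),
        "POST" | "PUT" | "DELETE" | "PATCH"
    );
    if mutating {
        headers.push(("X-Requested-With".to_string(), "XMLHttpRequest".to_string()));
    }

    // Session cookie only when the caller supplies one.
    if let Some(token) = session_cookie {
        headers.push((
            "Cookie".to_string(),
            format!("{}={}", SESSION_COOKIE_NAME, token),
        ));
    }

    headers
}
```

Under these assumptions, a GET without a session produces no default headers, while an authenticated POST with a body produces all three.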

Usage pattern:

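#[tokio::test]
async fn my_test() {
    // Each TestApp owns a fresh, fully migrated database.
    let app = TestApp::new().await;
    let (user_id, session) = app.create_authenticated_user("user@test.com").await;

    // The *_with_session helpers attach the session cookie to the request.
    let (status, body) = app.get_with_session("/api/v1/settings", &session).await;
    assert_eq!(status, StatusCode::OK);
    // ...assertions...

    // Explicit cleanup drops the database deterministically.
    app.cleanup().await;
}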

Wiremock (Pipeline Tests)

Pipeline integration tests use wiremock to mock HTTP responses from source websites. The mock server runs on localhost, which is why SKIP_SSRF_CHECK=1 is required (otherwise the SSRF protection would block requests to localhost).

MockLlmProvider

backend/src/services/llm/mock.rs provides a deterministic mock LLM provider for pipeline tests:

  • Returns classify/summarize responses when the system prompt contains "classer" (French for "classify")
  • Returns search responses with configurable URLs via with_search_urls()
  • Uses a configurable default category via with_default_category()
  • Identifies call types by inspecting French keywords in the system prompt

Usage:

let mock = MockLlmProvider::new()
    .with_default_category("IA")
    .with_search_urls(vec!["https://example.com/article".into()])
    .into_arc();
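
The keyword-based dispatch described in the bullets above might look roughly like the following. This is a minimal sketch and not the contents of mock.rs: the CallType enum and detect_call_type function are hypothetical names, and treating every prompt without "classer" as a search call is an assumption.

```rust
/// Kinds of LLM call the mock distinguishes (hypothetical enum).
#[derive(Debug, PartialEq)]
enum CallType {
    ClassifyOrSummarize,
    Search,
}

/// Picks the call type by looking for a French keyword in the system prompt.
fn detect_call_type(system_prompt: &str) -> CallType {
    if system_prompt.contains("classer") {
        CallType::ClassifyOrSummarize
    } else {
        // Assumption: anything that is not a classify prompt is a search call.
        CallType::Search
    }
}
```

A mock built this way is coupled to the prompt wording: if the keyword is removed from the classify prompt, the mock would start returning the wrong response type.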

E2E Seed Data (seed.ts)

e2e/seed.ts creates known test users and sessions in the database. It is idempotent (it uses ON CONFLICT DO NOTHING) and seeds the following:

  • Admin user: admin@test.local with a known session token
  • Regular user: user@test.local with a known session token
  • Gemini provider: Enabled for the test environment

Session tokens are SHA-256 hashed before insertion (matching the backend's hashing strategy).

E2E Auth Helpers (auth.ts)

e2e/helpers/auth.ts provides:

  • loginAsAdmin(page): Injects the admin session cookie.
  • loginAsUser(page): Injects the regular user session cookie.
  • registerAndVerify(page, email): Runs the full registration flow. It calls the API to register, inserts a magic link token directly into the database, and navigates to the verify URL.
  • createDbClient(): Returns a pg.Client connected to the test database.

Writing Integration Tests

Patterns

  1. Each test gets its own TestApp (and therefore its own database). Tests are fully isolated.

  2. Create users via helpers, not via the registration API (unless testing registration):

    let (user_id, session) = app.create_authenticated_user("user@test.com").await;
    
  3. Test all access control paths for every endpoint:

    • 401 without authentication
    • 403 for admin-only endpoints with a regular user
    • 404 for accessing another user's resources (ownership isolation)
  4. Settings payload must be complete. The PUT /settings endpoint requires every field. When sending a settings update in tests, include all fields:

    let settings = serde_json::json!({
        "max_articles_per_source": 3,
        "max_links_per_source": 10,
        "use_brave_search": false,
        "article_history_days": 30,
        "batch_size": 5,
        "source_extraction_window": 5,
        "search_agent_behavior": "",
        "ai_provider": "gemini",
        "ai_model": "gemini-2.5-flash",
        "ai_model_websearch": "gemini-2.5-flash",
        "rate_limit_max_requests": null,
        "rate_limit_time_window_seconds": null
    });
    
  5. Use post_without_csrf to test CSRF rejection.

  6. Use raw_request_text / raw_request_bytes for non-JSON responses (CSV exports, PDF exports).

  7. Always call app.cleanup().await at the end of the test for deterministic cleanup.
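
One way to keep the access control checks from pattern 3 systematic is to write the expected status codes down as a small table-driven model and iterate over it in tests. The sketch below is illustrative only: Caller, Target, and expected_status are hypothetical names, and the precedence (401 before 403 before 404) as well as the treatment of admins on other users' resources are assumptions, not documented backend behavior.

```rust
/// Who is making the request.
#[derive(Clone, Copy)]
enum Caller {
    Anonymous,
    User,
    Admin,
}

/// Properties of the endpoint or resource under test (hypothetical model).
#[derive(Clone, Copy)]
struct Target {
    admin_only: bool,
    owned_by_caller: bool,
}

/// Expected HTTP status for a caller/target pair under the assumed precedence.
fn expected_status(caller: Caller, target: Target) -> u16 {
    match caller {
        // No session at all: always 401.
        Caller::Anonymous => 401,
        // Regular user on an admin-only endpoint: 403.
        Caller::User if target.admin_only => 403,
        // Resource belongs to someone else: 404 (ownership isolation).
        _ if !target.owned_by_caller => 404,
        _ => 200,
    }
}
```

Each row of such a matrix then maps onto one request made through the TestApp helpers, so a missing access control test shows up as a missing row.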

Pipeline Tests

Pipeline integration tests in pipeline_test.rs use wiremock + MockLlmProvider:

  1. Set up wiremock to serve a mock source page with article links
  2. Set up wiremock to serve mock article pages
  3. Configure user settings and sources pointing to wiremock URLs
  4. Run the pipeline with MockLlmProvider via the provider_override parameter
  5. Assert the resulting synthesis contains the expected categories and articles

Writing E2E Tests

Playwright Configuration

  • Tests run against the Docker-composed stack on http://localhost:8080
  • Single worker to avoid parallel DB state mutations
  • Timeout: 30 seconds per test, 2 retries
  • Screenshots on failure, traces on first retry
  • Chromium browser only

Patterns

  1. Use loginAsAdmin / loginAsUser from e2e/helpers/auth.ts for authentication:

    import { loginAsUser } from '../helpers/auth';
    
    test('my test', async ({ page }) => {
      await loginAsUser(page);
      await page.goto('/', { waitUntil: 'domcontentloaded' });
      // ...
    });
    
  2. Use waitUntil: 'domcontentloaded' instead of the default load for page.goto(). This avoids waiting for external resources (Turnstile scripts, fonts) that may not load in the test environment.

  3. Prefer API-based setup over UI interactions for test data. Use page.evaluate() to call the API directly:

    await page.evaluate(async () => {
      await fetch('/api/v1/sources', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json', 'X-Requested-With': 'XMLHttpRequest' },
        body: JSON.stringify({ title: 'Test', url: 'https://example.com', theme_id: '...' }),
      });
    });
    
  4. Use createDbClient() from e2e/helpers/auth.ts when you need to verify database state directly.

  5. The generation-live.spec.ts test is gated on OPENAI_TEST_API_KEY. Treat it as supplemental coverage only.


Known Limitations

Drop Deadlock in TestApp

The Drop implementation for TestApp spawns a background thread to drop the test database. Do not call .join() on this thread: doing so deadlocks, because the spawned thread creates a new tokio runtime whose block_on conflicts with the existing runtime's connection pool. The thread runs independently and cleans up asynchronously. For deterministic cleanup, use app.cleanup().await.
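
The fire-and-forget pattern can be illustrated with the standard library alone. This is a simplified, synchronous sketch under stated assumptions, not the TestApp code: TempDb is a hypothetical stand-in, a channel message replaces the real database drop, and cleanup here is a plain method rather than an async one.

```rust
use std::sync::mpsc::Sender;
use std::thread;

/// Hypothetical stand-in for a per-test database handle.
struct TempDb {
    name: String,
    /// Receives the database name when it is "dropped" (stands in for DROP DATABASE).
    dropped: Sender<String>,
    cleaned_up: bool,
}

impl TempDb {
    fn new(name: &str, dropped: Sender<String>) -> Self {
        TempDb { name: name.to_string(), dropped, cleaned_up: false }
    }

    /// Deterministic path: the work is finished before this method returns.
    fn cleanup(mut self) {
        self.dropped.send(self.name.clone()).ok();
        self.cleaned_up = true;
        // `self` is dropped here; Drop sees `cleaned_up` and does nothing.
    }
}

impl Drop for TempDb {
    fn drop(&mut self) {
        if self.cleaned_up {
            return;
        }
        let name = self.name.clone();
        let dropped = self.dropped.clone();
        // Fire-and-forget: the JoinHandle is discarded and never joined,
        // so the thread is detached and finishes on its own schedule.
        let _handle = thread::spawn(move || {
            dropped.send(name).ok();
        });
    }
}
```

The trade-off is visible in the sketch: the Drop path never blocks the caller but gives no guarantee about when the work completes, which is why the explicit cleanup path is preferred at the end of a test.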

SSRF Bypass for Integration Tests

SKIP_SSRF_CHECK=1 is set during integration tests so that wiremock (running on localhost) is not blocked by the SSRF protection. The variable is read at runtime, not at compile time, so any build honors it wherever it is set. Ensure it is never set in production.
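
To make the runtime nature of the bypass concrete, here is a minimal sketch of an IP check gated by the environment variable. It is not the code in services/scraper.rs: the function names are hypothetical, the set of blocked ranges is a reduced assumption (the real checks cover more), and only the SKIP_SSRF_CHECK name and the value 1 come from this document.

```rust
use std::net::IpAddr;

/// Returns true for addresses an SSRF guard would typically refuse.
/// The list of ranges here is deliberately short and is an assumption.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => match v6.to_ipv4_mapped() {
            // IPv4-mapped IPv6 (::ffff:a.b.c.d) is judged by its IPv4 address.
            Some(v4) => is_blocked_ip(IpAddr::V4(v4)),
            None => v6.is_loopback() || v6.is_unspecified(),
        },
    }
}

/// Pure decision function: the flag value is passed in so it can be tested.
fn ssrf_allows(ip: IpAddr, skip_flag: Option<&str>) -> bool {
    if skip_flag == Some("1") {
        return true; // bypass: every address is allowed
    }
    !is_blocked_ip(ip)
}

/// Reads the flag from the environment on every call, i.e. at runtime.
fn ssrf_allows_from_env(ip: IpAddr) -> bool {
    let flag = std::env::var("SKIP_SSRF_CHECK").ok();
    ssrf_allows(ip, flag.as_deref())
}
```

Because the lookup happens on each call, the same binary behaves differently depending on its environment, which is why the variable must stay out of production configuration.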

Flaky generation-live Test

The generation-live.spec.ts test depends on a real OpenAI API call. It may fail due to:

  • API rate limits
  • Slow responses exceeding the 30-second timeout
  • Changes in model behavior affecting output format

It is configured with 2 retries to mitigate transient failures.

Frontend Failing Tests

As of the last audit, 10 of 141 frontend unit tests are failing. Investigate with cd frontend && npx vitest run before adding new frontend tests.


Coverage Targets and Gaps

Well-Covered Areas

  • SSRF protection: the 74 unit tests in services/scraper.rs cover all private IP ranges, IPv4-mapped IPv6, and redirect blocking (alongside soft-404 and HTML parsing checks)
  • Sources CRUD: 36 integration tests including CSV, bulk import, max limits
  • Admin module: 30 integration tests with access control verification
  • Encryption: Tests verify API keys are not stored in plaintext by querying the database directly
  • Pipeline: Uses wiremock + MockLlmProvider for deterministic end-to-end pipeline testing

Critical Gaps

The following gaps must be addressed to satisfy the release gate policy.

| Gap | Priority | Description |
| --- | --- | --- |
| Scheduled execution | Critical | scheduler.rs has zero tests. Autonomous process that generates syntheses and sends emails. |
| Brave Search pipeline | High | Only 1 unit test. The Brave Search code path in the pipeline is untested in integration. |
| Date filtering | High | No tests verify that max_age_days actually filters old articles. |
| Rate limiting integration | High | 15 unit tests but no integration test verifying rate limits are applied during pipeline runs. |
| SSE progress stream | High | No integration test for the SSE endpoint. Only tested in the gated E2E test. |
| Settings validation (negative) | Medium | No tests for rejection of out-of-range values (e.g., max_articles_per_source: 0). |
| Article history ownership | Medium | No test verifying User B cannot see User A's article history. |
| Frontend failing tests | Medium | 10 tests need investigation and fixing. |