31 Commits (2c3c6008a39dd7836eb85d155d5487f1938a4e34)

Author SHA1 Message Date
oabrivard d234fa9b24 feat: add is_article LLM check + remove use_llm_for_source_links option
The LLM now determines if scraped content is a real article during
classify (zero extra cost). The separate LLM link extraction option
is removed — heuristic extraction is sufficient.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 5b67ef2e51 fix: update integration and E2E test fixtures for summary_length, source_extraction_window, and NewsItem.date
Add missing summary_length and source_extraction_window fields to all
settings JSON payloads in api_settings_test.rs. The pipeline_test.rs,
generation-live.spec.ts, and api_syntheses_test.rs already had correct
fixtures or use JSON literals that are unaffected by the optional date field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 0f1b0306e4 feat: add source_extraction_window setting (default 3)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 1b63afd12a feat: add summary_length setting (1=court, 2=moyen, 3=detaille)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 0874650a7f fix: pipeline tests use wiremock URLs + skip SSRF for localhost
- Add SKIP_SSRF_CHECK env var to bypass SSRF in test environments
- Use wiremock server as source URL (same domain as article URLs)
- Add source page mock to wiremock setup
- Set SKIP_SSRF_CHECK=1 in integration test script
- Fix unused import warning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard a158f14311 fix: don't poll SSE stream in model resolution test
The SSE stream blocks until the generation completes or times out
(15 min). With a fake API key, the LLM call hangs for 120s before
failing. Just verify the 202 trigger succeeded — that confirms
model resolution and provider creation worked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard eadfbc000b fix: expect 201 Created for source creation in syntheses test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 54c637647b fix: add missing fields to syntheses test settings payload
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 7ec491b6ac fix: update settings test payloads for new required fields + fix unused var warning
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard a0a8f72caa fix: don't join Drop cleanup thread — prevents deadlock in tokio tests
The Drop impl spawned a thread with a new tokio runtime and called
.join(), which blocked the test thread. The spawned thread's block_on
deadlocked when pool.close() tried to communicate with connections
owned by the outer tokio runtime. Removing .join() makes cleanup
fire-and-forget, avoiding the deadlock.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 243558b950 debug: add full_app_health_check test with logging 3 months ago
oabrivard aee70b37d4 fix: use docker-compose.test.yml for integration test DB
Rewrite run-integration-tests.sh to use the e2e docker-compose config
(Postgres on port 5433). Add --db-check flag for connectivity debugging.
Remove build_test_router (reverted to build_router). Keep minimal_test
for oneshot debugging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 5fa060fadc fix: use invalid session token for admin auth rejection tests
Same fix as other test files — avoids oneshot() hang with no cookie.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard bd4f101d16 fix: use invalid session token instead of no cookie for auth rejection tests
Unauthenticated requests (no Cookie header) hang with oneshot() in tests.
Using an invalid session token achieves the same 401 result without hanging.
3 months ago
oabrivard 53813007c6 fix: use lightweight test router without SPA fallback and TraceLayer
Unauthenticated requests were hanging in integration tests due to
tower middleware layers interacting with oneshot(). Add build_test_router()
that only includes API routes + CSRF middleware.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 7cbafdfb31 fix: create test static dir to prevent ServeDir/ServeFile hang
The SPA fallback uses ServeDir/ServeFile which can hang when the
directory doesn't exist. Create it in TestApp::new() with a minimal
index.html.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard bb2209e425 fix: update admin tests for models_scraping/models_websearch split
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 370e033506 feat: add pipeline integration tests with MockLlmProvider and wiremock
Add three integration tests that exercise the synthesis generation
pipeline end-to-end using MockLlmProvider and wiremock for HTTP mocking:
- phase1_with_llm_link_extraction_classifies_articles
- phase2_search_fills_gaps_when_no_sources
- category_overflow_spills_to_autre

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard ecf95ffe35 fix: update test config for session_secret removal and Arc master key
Remove session_secret field (no longer in AppConfig), wrap
master_encryption_key in Arc<String>, and pass a generated job_id
to db::syntheses::create which now requires it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard f414ff0f58 feat: add use_brave_search setting
Add use_brave_search boolean field to all settings structs, DB layer,
SQL queries, frontend types, i18n labels, and test fixtures following
the same pattern as use_llm_for_source_links.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 4c6381b09a feat: add batch_size setting for Phase 1 parallelism
Add a user-configurable batch_size setting (default 5, range 1-20)
that controls how many articles are processed in parallel during
Phase 1 scrape+classify. Previously hardcoded to 5.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 8d232c1ade feat: split model selection — scraping vs websearch with GPT-5 models
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard cea723f7d7 test: update E2E and integration tests with article_history_days setting 3 months ago
oabrivard 8e06357b47 test: update integration test with LLM scraping settings 3 months ago
oabrivard 004f08f385 fix: runtime bugs found during first Docker run + integration tests
Bugs fixed:
- resolve_model queried non-existent admin_provider_models table (use JSONB query on admin_providers)
- key_prefix VARCHAR(10) too short for 11-char prefix (migration to VARCHAR(12))
- API key test schema missing additionalProperties: false (OpenAI strict mode)
- CSP missing font-src data: directive (PDF font embedding blocked)
- Magic link URL not logged in test mode (can't verify without real email)
- Rust 1.85 Docker image too old for dependencies (bumped to 1.88)

Tests added to prevent recurrence:
- schema_meets_openai_strict_mode_requirements: validates additionalProperties on all objects
- key_prefix_full_length_stored_in_db: verifies 11-char prefix survives DB round-trip
- generate_pipeline_resolves_model_from_admin_config: exercises full generation pipeline

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 1f9f7f39d7 Phase 7: Email sending via Resend + Markdown/PDF export
Backend:
- Synthesis email sending via Resend API with HTML template (inline CSS,
  tables-based for email client compatibility) + plain-text fallback
- XSS prevention via html_escape() on all user content in email templates
- Markdown export: clean format with headers, links, summaries
- PDF export: printpdf with built-in Helvetica fonts, indigo color scheme,
  automatic page breaks, word wrapping
- 3 new endpoints: send-email, export/markdown, export/pdf
- All endpoints enforce ownership checks
- Email validation using email_address crate
- 24 new unit tests, 13 integration tests

Frontend:
- Email section on SynthesisDetail: input pre-filled with user email,
  send button with loading state, success/error feedback
- Export buttons: Markdown + PDF with per-button loading states
- File download via Blob + temporary anchor with Content-Disposition parsing
- 6 new export tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard aa6f1ba76b Phase 5: Generation pipeline with SSE progress, syntheses CRUD
Backend:
- Full 2-pass generation pipeline: LLM search -> URL scraping -> LLM rewrite
- Async generation with tokio::spawn, JobStore with per-user concurrency limit
- SSE progress streaming via axum::response::Sse + tokio::sync::watch
- Syntheses CRUD: list (paginated), get (ownership check), delete
- Prompt construction ported from original geminiService.ts
- Parallel URL scraping with bounded concurrency (max 10)
- Graceful partial failure handling (some URLs fail -> continue)
- 36 new unit tests, 16 integration tests

Frontend:
- Home dashboard: synthesis card grid, week badges, delete with confirmation
- Generate page: SSE-driven progress bar, step checklist, auto-redirect
- Synthesis detail: section-by-section display, external links, delete
- SSE client helper with auto-reconnect (exponential backoff)
- Date utilities with French locale formatting

Critical fixes applied:
- SSE EventSource now sends credentials (withCredentials: true)
- Gemini error logging sanitized to prevent API key leak in logs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 439e547367 Phase 4: LLM provider abstraction with Gemini, user API key encryption
Backend:
- LlmProvider async trait with generate_search_pass/generate_rewrite_pass
- GeminiProvider: googleSearch grounding (pass 1), structured JSON output (pass 2)
- AES-256-GCM encryption for user API keys at rest (per-key random nonces)
- MasterKey with zeroize-on-drop (no Clone to prevent unzeroized copies)
- User API key endpoints: list (prefix only), create/update, delete, test
- Dynamic category schema builder for structured LLM output
- Provider factory (Gemini implemented, OpenAI/Anthropic stubbed for Phase 6)
- 37 new unit tests (encryption, schema, Gemini serialization, factory)
- 17 integration tests (CRUD, encryption verification, ownership isolation)

Frontend:
- ApiKeyManager component: per-provider key management in Settings
- Password input with show/hide toggle, key prefix display (monospace)
- Test button validates key with minimal LLM call
- Status badges (configured/not configured)
- 11 new tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 5abbf9b9ad Phase 3: Admin module with provider/model curation, rate limits, user management
Backend:
- Admin API: CRUD for providers, rate limits, user role management
- Public config endpoint for enabled providers/models
- AdminUser extractor enforces RBAC on all admin endpoints
- Per-provider rate limiter with hot-reload from DB
- Audit logging for all admin mutations
- Seed data: Gemini, OpenAI, Anthropic providers with default models
- Self-demotion prevention on role changes
- 30 integration tests, 27 new unit tests

Frontend:
- Admin layout with sidebar navigation (providers, rate limits, users)
- Provider management: enable/disable, model CRUD, default model selection
- Rate limit configuration with effective rate display
- User management with role badges and promote/demote
- Admin link in navbar/mobile menu (visible only to admins)
- Settings page: dynamic provider/model selection from admin config
- 10 new tests (admin guard, config API)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 2b75dc7049 Finished phase 2 3 months ago
oabrivard 355dbf6a5a Finished phase 1 3 months ago