6 Commits (102f6d3fe76c5681312f3e45cfb86be414f1a13e)

Author SHA1 Message Date
oabrivard e24236a069 feat: add max_links_per_source setting (default 8, was hardcoded 15)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard d234fa9b24 feat: add is_article LLM check + remove use_llm_for_source_links option
The LLM now determines if scraped content is a real article during
classify (zero extra cost). The separate LLM link extraction option
is removed — heuristic extraction is sufficient.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 0f1b0306e4 feat: add source_extraction_window setting (default 3)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 1b63afd12a feat: add summary_length setting (1=court, 2=moyen, 3=detaille)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 0874650a7f fix: pipeline tests use wiremock URLs + skip SSRF for localhost
- Add SKIP_SSRF_CHECK env var to bypass SSRF in test environments
- Use wiremock server as source URL (same domain as article URLs)
- Add source page mock to wiremock setup
- Set SKIP_SSRF_CHECK=1 in integration test script
- Fix unused import warning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 370e033506 feat: add pipeline integration tests with MockLlmProvider and wiremock
Add three integration tests that exercise the synthesis generation
pipeline end-to-end using MockLlmProvider and wiremock for HTTP mocking:
- phase1_with_llm_link_extraction_classifies_articles
- phase2_search_fills_gaps_when_no_sources
- category_overflow_spills_to_autre

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago