148 Commits (fa9375233eacbd87cbf8e095c8b9a8b39287c5eb)
 

Author SHA1 Message Date
oabrivard fa9375233e fix: remove 3 compiler warnings (unreachable code, unused variables) 3 months ago
oabrivard 14b0a0b7e8 refactor: LLM link extraction uses body only (no head), increased to 12000 chars 3 months ago
oabrivard 3353e5261f feat: rate limiter waits instead of failing — sleeps until window passes (max 60s) 3 months ago
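The wait-instead-of-fail behaviour in the commit above could look roughly like the following. This is a minimal sketch, not the project's code: the `RateLimiter` struct, its fields, and `acquire_at` are hypothetical names, a fixed-window strategy is assumed, and only the 60-second cap comes from the commit message. What the real limiter does once the cap is reached is not stated; this sketch simply proceeds.

```rust
use std::time::{Duration, Instant};

/// Upper bound on a single wait, taken from the commit message (60s).
pub const MAX_WAIT: Duration = Duration::from_secs(60);

/// Hypothetical fixed-window rate limiter (all names are illustrative).
pub struct RateLimiter {
    max_calls: u32,
    window: Duration,
    window_start: Instant,
    calls_in_window: u32,
}

impl RateLimiter {
    pub fn new(max_calls: u32, window: Duration, now: Instant) -> Self {
        Self { max_calls, window, window_start: now, calls_in_window: 0 }
    }

    /// Returns how long the caller should sleep before proceeding.
    /// `Duration::ZERO` means the call may go ahead immediately.
    pub fn acquire_at(&mut self, now: Instant) -> Duration {
        // Start a fresh window once the current one has fully elapsed.
        if now.saturating_duration_since(self.window_start) >= self.window {
            self.window_start = now;
            self.calls_in_window = 0;
        }
        if self.calls_in_window < self.max_calls {
            self.calls_in_window += 1;
            return Duration::ZERO;
        }
        // Window is full: wait for the remainder instead of returning an error,
        // but never longer than MAX_WAIT.
        let window_end = self.window_start + self.window;
        let wait = window_end.saturating_duration_since(now).min(MAX_WAIT);
        // The call is accounted to the window that begins after the wait.
        self.window_start = now + wait;
        self.calls_in_window = 1;
        wait
    }

    /// Blocking convenience wrapper around `acquire_at`.
    pub fn acquire(&mut self) {
        let wait = self.acquire_at(Instant::now());
        if !wait.is_zero() {
            std::thread::sleep(wait);
        }
    }
}
```

Separating `acquire_at` (pure bookkeeping) from `acquire` (the actual sleep) keeps the waiting logic testable without real delays.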
oabrivard ed399e9a6e feat: parallelize Phase 1 scrape+classify in batches of 5 3 months ago
oabrivard a5f4239157 fix: distinguish filtered_too_old from filtered_empty in article tracing 3 months ago
oabrivard a760220d44 fix: log LLM calls for source link extraction in llm_call_log 3 months ago
oabrivard fb765d6c8f feat: split model dropdowns — scraping vs websearch in frontend
Replace the single `models` array in `ProviderConfig` and `AdminProvider`
with separate `models_scraping` / `models_websearch` lists. Rename
`ai_model_writing` → `ai_model_websearch` in `UserSettings` and all
references (Settings page, admin Providers page, E2E test, fixtures,
and unit tests). Update i18n label for the second dropdown to
"Modele d'IA (Recherche Web)".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 8d232c1ade feat: split model selection — scraping vs websearch with GPT-5 models
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 97e484e03f docs: add model split implementation plan 3 months ago
oabrivard c321cacd78 docs: update OpenAI model lists to GPT-5 generation 3 months ago
oabrivard 03d4cfb773 docs: add spec for model split — scraping vs websearch 3 months ago
oabrivard 37bc849f92 feat: add clear history button with confirmation on ArticleHistory page
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 7f7d584314 feat: parallel source extraction, shuffle candidates, clear history endpoint
- Remove 10-source cap; all sources are now processed
- Increase max links per source from 10 to 15
- Extract article links in parallel (up to 5 concurrent) using JoinSet
- Shuffle candidate URLs after history filtering to interleave sources
- Add DELETE /api/v1/article-history endpoint to clear all history for a user

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
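The bounded parallelism described in the commit above (at most 5 extractions in flight) can be illustrated with the standard library alone. Note the substitution: the commit uses `JoinSet`, whereas this sketch uses `std::thread::scope` and processes the input in batches of 5 so that it stays self-contained. `process_in_batches` and `BATCH_SIZE` are hypothetical names.

```rust
use std::thread;

/// Concurrency limit taken from the commit message.
pub const BATCH_SIZE: usize = 5;

/// Applies `work` to every item, running at most `BATCH_SIZE` calls at a time.
/// Results are returned in input order.
pub fn process_in_batches<T, R, F>(items: &[T], work: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    // Share the closure by reference so each worker can copy the pointer.
    let work = &work;
    let mut results = Vec::with_capacity(items.len());
    for batch in items.chunks(BATCH_SIZE) {
        // Scoped threads may borrow `batch`; all are joined before the scope ends.
        let batch_results: Vec<R> = thread::scope(|scope| {
            let handles: Vec<_> = batch
                .iter()
                .map(|item| scope.spawn(move || work(item)))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("worker thread panicked"))
                .collect()
        });
        results.extend(batch_results);
    }
    results
}
```

A batch only finishes when its slowest item does, so a task-set approach like the one named in the commit would keep all 5 slots busier; the batch form is shown here only because it needs no external crate.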
oabrivard 2d623c6ced docs: add spec for pipeline tweaks (parallel extraction, shuffle, clear history) 3 months ago
oabrivard 48957470ed test: update E2E test for new pipeline (remove deprecated settings) 3 months ago
oabrivard c4a4cd9987 feat: remove deprecated settings from frontend
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 7a8427316c feat: rewrite synthesis pipeline — per-article classify/summarize, no rewrite pass
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 0b180eb75c refactor: remove old classification, rewrite, and article extraction prompts/schemas
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard bb716b5dc2 feat: add get_last_source_url + remove head_html from ScrapedContent
- Add get_last_source_url() to article_history db module for source rotation
- Remove head_html field from ScrapedContent struct and scrape_url function
- Fix synthesis.rs scrape_single_article_with_llm to pass empty string instead of removed field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard b2dbc3847a feat: add per-article classify/summarize prompt and schema
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 825b793387 feat: drop source_diversity_window and use_llm_for_article_extraction settings
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard d3b63295f6 docs: add algorithm rewrite implementation plan (7 tasks) 3 months ago
oabrivard 1d5dc0596c docs: add spec for algorithm rewrite — per-article classify, no rewrite pass 3 months ago
oabrivard a2fe3f3310 feat: simplify LlmProvider trait to single call_llm method
Replace the three-method LlmProvider trait (generate_search_pass,
generate_rewrite_pass, supports_web_search) and ProviderCapabilities
with a single call_llm method. Update all three provider implementations
(Gemini, OpenAI, Anthropic) and all callers in synthesis.rs,
source_scraper.rs, and api_keys.rs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
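A single-method provider trait of the kind described in the commit above might be shaped like this. Only the names `LlmProvider` and `call_llm` come from the commit; the signature, the `LlmError` type, `EchoProvider`, and `summarize` are assumptions for illustration, and the real method is probably async.

```rust
use std::fmt;

/// Hypothetical error type for provider failures.
#[derive(Debug)]
pub struct LlmError(pub String);

impl fmt::Display for LlmError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "LLM call failed: {}", self.0)
    }
}

impl std::error::Error for LlmError {}

/// One entry point for every provider (signature is an assumption).
pub trait LlmProvider {
    fn call_llm(&self, system_prompt: &str, user_prompt: &str) -> Result<String, LlmError>;
}

/// Stand-in provider that echoes its input; useful for tests.
pub struct EchoProvider;

impl LlmProvider for EchoProvider {
    fn call_llm(&self, system_prompt: &str, user_prompt: &str) -> Result<String, LlmError> {
        if user_prompt.is_empty() {
            return Err(LlmError("empty prompt".to_string()));
        }
        Ok(format!("[{}] {}", system_prompt, user_prompt))
    }
}

/// Callers depend only on the trait, not on a concrete provider.
pub fn summarize(provider: &dyn LlmProvider, article: &str) -> Result<String, LlmError> {
    provider.call_llm("Summarize the article.", article)
}
```

With a single method, pipeline steps differ only in the prompts they pass, which is presumably why the capability flags could be dropped.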
oabrivard e4b76fb06a docs: add plan for simplifying LLM provider trait 3 months ago
oabrivard cb0b7620c9 docs: add spec for simplifying LLM provider trait to single call_llm method 3 months ago
oabrivard 0372a96822 docs: add algorithm.md describing full synthesis generation pipeline 3 months ago
oabrivard d9982b467c test: verify LLM call logs endpoint returns data after generation 3 months ago
oabrivard f9023cff7e feat: LLM logs viewer page + log button on Home synthesis list
- Add LlmLogs page with collapsible prompts/response sections, call-type
  colored badges, and duration display
- Wire /llm-logs/:jobId route in App.tsx (lazy-loaded)
- Expose job_id in backend SynthesisListItem and frontend SynthesisListItem
  type; update test fixture accordingly
- Add log-icon link next to delete button on each Home synthesis card

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard cbe1cd6507 feat: LLM logs types, API client, and i18n labels
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard dafec2591b feat: API endpoint for LLM call logs by job_id
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 9fffde8312 feat: log LLM calls with timing at search, classification, and rewrite steps
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard b2b0b286c0 feat: create llm_call_log table + DB module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 88c16c5d67 docs: add LLM call logging implementation plan (6 tasks) 3 months ago
oabrivard 314fb7a037 docs: add spec for LLM call logging per synthesis 3 months ago
oabrivard f7428191ec test: verify provenance endpoint returns tracing data after generation 3 months ago
oabrivard 6fc6fff1f3 feat: article history page + provenance section in synthesis detail
- Add ArticleHistoryEntry/ArticleHistoryResponse types
- Add articleHistoryApi client (list + getProvenance endpoints)
- Add ArticleHistory page with status/source_type filters and pagination
- Add collapsible provenance section to SynthesisDetail
- Register /article-history route in App.tsx
- Add viewHistory link in Settings near articleHistoryDays input
- Add all French i18n strings for article history feature

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 55fe828e58 feat: API endpoints for article history listing and provenance
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard b9003cde54 feat: instrument pipeline with article tracing at every filtering step
Add source_url field to ScrapedNewsItem and a trace_article helper that
inserts into article_history with full provenance metadata.  Instrument
Phase 1 (empty content, history dedup, source diversity) and Phase 2
(homepage filter, cross-phase dedup, history dedup, empty content) so
every dropped article is recorded with its filter reason.  Replace the
old insert_urls call with per-article trace_article calls for used
articles, preserving dedup semantics via url_hash.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 0e2c69edf7 feat: save job_id on syntheses for provenance lookup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard eba721266f feat: article history entry struct + insert/query/cleanup functions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard d7afd08eaf feat: enrich article_history with tracing metadata + syntheses.job_id 3 months ago
oabrivard 5a0495b02a docs: add article tracing implementation plan (7 tasks) 3 months ago
oabrivard 445dad9963 docs: add spec for article tracing — enriched history with provenance views 3 months ago
oabrivard 7cbb2853ce feat: Autre fill-up to 75% synthesis target with source diversity enforcement
Accumulates overflow articles from both classification phases and redistributes
them into the Autre category when total articles fall below 75% of the configured
max, respecting per-source diversity limits.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
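The fill-up rule in the commit above can be sketched as follows. The `Article` struct, `fill_autre`, and the `max_per_source` parameter are hypothetical; `SYNTHESIS_MIN_FILL_RATIO` is named in a neighbouring commit, and its value of 0.75 is inferred from the "75%" in the message. The sketch assumes filling stops as soon as the target is reached.

```rust
use std::collections::HashMap;

/// Minimum fill ratio; the value is inferred from the commit message.
pub const SYNTHESIS_MIN_FILL_RATIO: f64 = 0.75;

/// Hypothetical article record; only the fields needed here are modelled.
#[derive(Debug, Clone, PartialEq)]
pub struct Article {
    pub url: String,
    pub source: String,
}

impl Article {
    pub fn new(url: &str, source: &str) -> Self {
        Self { url: url.to_string(), source: source.to_string() }
    }
}

/// Moves overflow articles into `autre` until the synthesis reaches the
/// minimum fill target, skipping sources that already hit `max_per_source`.
pub fn fill_autre(
    placed: &[Article],
    autre: &mut Vec<Article>,
    overflow: Vec<Article>,
    max_articles: usize,
    max_per_source: usize,
) {
    let target = (max_articles as f64 * SYNTHESIS_MIN_FILL_RATIO).ceil() as usize;
    let mut total = placed.len() + autre.len();

    // Count how many articles each source already contributes.
    let mut per_source: HashMap<String, usize> = HashMap::new();
    for a in placed.iter().chain(autre.iter()) {
        *per_source.entry(a.source.clone()).or_insert(0) += 1;
    }

    for article in overflow {
        if total >= target {
            break; // target reached, leave the rest in overflow
        }
        let count = per_source.entry(article.source.clone()).or_insert(0);
        if *count >= max_per_source {
            continue; // diversity limit: this source is already saturated
        }
        *count += 1;
        total += 1;
        autre.push(article);
    }
}
```

For example, with a maximum of 8 articles the target is 6, so a synthesis holding 3 articles would accept up to 3 overflow articles, subject to the per-source limit.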
oabrivard c3e6103ef1 feat: parse_classification_response collects overflow articles
Returns a (result, overflow) tuple so callers can access articles that
could not fit in any category or Autre. Also adds the
SYNTHESIS_MIN_FILL_RATIO constant for the upcoming fill-up logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard f5f0656604 docs: add Autre fill-up implementation plan 3 months ago
oabrivard aebd436a91 docs: add spec for Autre fill-up to 75% synthesis target 3 months ago
oabrivard cea723f7d7 test: update E2E and integration tests with article_history_days setting 3 months ago
oabrivard 708a641223 feat: add article_history_days setting to frontend
Add article_history_days (defaulting to 90) to UserSettings interface and DEFAULT_SETTINGS, French translation, and Settings page number input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago