270 Commits (7d58b6e019277eb6714ef350792cd72f90dfdbc5)
 

Author SHA1 Message Date
oabrivard 347558a278 fix: atomic job creation, 15min timeout, and panic handling
- Replace iterating DashMap check with atomic DashSet insert in create_job to
  eliminate the race condition where double-click could create two concurrent
  jobs for the same user
- Add release_user method called at end of generation task (normal, timeout,
  and panic paths) so the generating slot is always freed
- Wrap run_generation in tokio::time::timeout(900s) to prevent hung LLM calls
  from blocking the generation slot forever
- Spawn a second task to await the JoinHandle and call release_user + send
  error event if the generation task panics, preventing SSE clients from
  hanging indefinitely
- Update cleanup_expired to also remove users from generating_users set
- Update tests to call release_user after completion/error to match new contract

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
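
The atomic check-and-insert described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes a std `Mutex<HashSet<String>>` as a stand-in for the `DashSet` named above, and the type `GeneratingUsers` and method `try_acquire` are hypothetical names. Only `release_user` comes from the commit message.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Tracks which users currently hold a generation slot.
/// Stand-in for the DashSet mentioned in the commit (assumption: std only).
#[derive(Default)]
pub struct GeneratingUsers {
    inner: Mutex<HashSet<String>>,
}

impl GeneratingUsers {
    /// Check and insert in a single step. `HashSet::insert` returns `false`
    /// when the value was already present, so two racing callers (for
    /// example a double-click) cannot both acquire the slot.
    pub fn try_acquire(&self, user_id: &str) -> bool {
        // Recover the guard even if another thread panicked while holding it.
        let mut set = self.inner.lock().unwrap_or_else(|e| e.into_inner());
        set.insert(user_id.to_string())
    }

    /// Frees the slot. Calling it for a user who holds no slot is a no-op,
    /// so it is safe on the normal, timeout, and panic paths alike.
    pub fn release_user(&self, user_id: &str) {
        let mut set = self.inner.lock().unwrap_or_else(|e| e.into_inner());
        set.remove(user_id);
    }
}
```

The point of the design is that the presence check and the insert happen under one lock acquisition, whereas iterating a map first and inserting later leaves a window in which a second request can slip through.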
oabrivard 59932589cc fix: prevent UTF-8 panic in error message truncation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
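
The commit body does not show the fix itself. One common way to avoid this class of panic is to back off to the nearest character boundary before slicing. The helper name `truncate_utf8` below is hypothetical and only illustrates the technique.

```rust
/// Returns the longest prefix of `s` that is at most `max_bytes` bytes long
/// and ends on a UTF-8 character boundary. Slicing a `&str` at a byte index
/// inside a multi-byte character panics, which this avoids.
pub fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

For instance, `truncate_utf8("héllo", 2)` yields `"h"`, because byte index 2 falls in the middle of the two-byte `é`.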
oabrivard 4123ae38f7 docs: add implementation plan for audit bug fixes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 114f2912dd docs: add spec for audit bug fixes (P0 + P1)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 811c5c87d1 docs: add comprehensive code audit reports from 5-agent team
Architect, Frontend Expert, Rust Expert, Tech Lead, and Devil's Advocate
analyzed the codebase independently. Key findings: UTF-8 panic bug, XSS
risk, synthesis.rs duplication (3x batch loop), Settings.tsx complexity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard e74a1850bf fix: log source URL in link_extraction LLM call logs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard a968fdc308 fix: allow brave_search as valid API key provider
Split VALID_PROVIDERS (LLM only) from VALID_API_KEY_PROVIDERS (includes
brave_search) so Brave keys can be stored without allowing brave_search
as an admin LLM provider.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
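
A rough sketch of the split described in this commit, under stated assumptions: the constant names come from the commit message, but the provider strings `"gemini"`, `"openai"`, and `"anthropic"` and both helper functions are illustrative guesses rather than the repository's actual values.

```rust
/// Providers that may be selected as an LLM backend (strings are assumed).
pub const VALID_PROVIDERS: &[&str] = &["gemini", "openai", "anthropic"];

/// Providers for which an API key may be stored: the LLM providers plus
/// brave_search, which is not an LLM.
pub const VALID_API_KEY_PROVIDERS: &[&str] =
    &["gemini", "openai", "anthropic", "brave_search"];

/// Hypothetical helper: may this name be used as an admin LLM provider?
pub fn is_valid_llm_provider(name: &str) -> bool {
    VALID_PROVIDERS.iter().any(|p| *p == name)
}

/// Hypothetical helper: may an API key be stored under this name?
pub fn is_valid_api_key_provider(name: &str) -> bool {
    VALID_API_KEY_PROVIDERS.iter().any(|p| *p == name)
}
```

Keeping two lists means key storage can accept `brave_search` while provider selection continues to reject it.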
oabrivard c5d23ecd10 feat: add Brave Search section to Settings page
Adds a dedicated Brave Search section after the Advanced Extraction section,
including inline API key management (save/test/delete) and a use_brave_search
toggle that auto-disables when the key is removed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard f124b056fe feat: add Brave Search Phase 2 pipeline path
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard e05c2ae75a feat: handle brave_search in API key test endpoint
Add a branch in test_key to route brave_search provider to
crate::services::brave_search::test_api_key instead of the LLM factory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard f414ff0f58 feat: add use_brave_search setting
Add use_brave_search boolean field to all settings structs, DB layer,
SQL queries, frontend types, i18n labels, and test fixtures following
the same pattern as use_llm_for_source_links.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard fa03c60339 feat: add Brave Search API client module
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 37bb6b4361 docs: add implementation plan for Brave Search integration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 83c0828392 docs: add spec for Brave Search API integration in Phase 2
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 41109b3d93 feat: send structured link pairs to LLM instead of raw HTML body
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard a5332f0996 feat: add article_url to LLM call logs for classify tracing
Adds an optional article_url column to llm_call_log so classify_summarize
entries are traceable back to their source article in the LLM Logs UI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard b062e81218 fix: remove personalized sources from Phase 2 web search prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 2299790986 docs: add implementation plan for pipeline improvements
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 7804235401 docs: add spec for pipeline improvements (web search, logs, link extraction)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 4c6381b09a feat: add batch_size setting for Phase 1 parallelism
Add a user-configurable batch_size setting (default 5, range 1-20)
that controls how many articles are processed in parallel during
Phase 1 scrape+classify. Previously hardcoded to 5.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
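
As an illustration of how such a setting might be applied, the sketch below resolves the effective batch size and splits the work accordingly. The default of 5 and the range 1-20 come from the commit message. Clamping out-of-range values (rather than rejecting them) is an assumption, and every name here is hypothetical.

```rust
pub const DEFAULT_BATCH_SIZE: usize = 5;
pub const MIN_BATCH_SIZE: usize = 1;
pub const MAX_BATCH_SIZE: usize = 20;

/// Falls back to the default when the setting is unset and clamps
/// out-of-range values into 1..=20 (assumed behavior).
pub fn effective_batch_size(setting: Option<usize>) -> usize {
    setting
        .unwrap_or(DEFAULT_BATCH_SIZE)
        .clamp(MIN_BATCH_SIZE, MAX_BATCH_SIZE)
}

/// Splits the candidate URLs into groups; each group would be processed in
/// parallel, one group after another. The clamp above guarantees a non-zero
/// chunk size, which `chunks` requires.
pub fn plan_batches(urls: &[String], setting: Option<usize>) -> Vec<&[String]> {
    urls.chunks(effective_batch_size(setting)).collect()
}
```

With 12 URLs and no setting, this produces batches of 5, 5, and 2.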
oabrivard a5e6cf2ac0 docs: update algorithm docs and generation time estimate
Update algorithm.md to reflect the rewritten per-article classify/summarize
pipeline (no batch classification, no rewrite pass). Update generation time
estimate from 1 minute to 10 minutes in frontend i18n and docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 7cd867c650 fix: resolve all clippy warnings
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard fa9375233e fix: remove 3 compiler warnings (unreachable code, unused variables) 3 months ago
oabrivard 14b0a0b7e8 refactor: LLM link extraction uses body only (no head), increased to 12000 chars 3 months ago
oabrivard 3353e5261f feat: rate limiter waits instead of failing — sleeps until window passes (max 60s) 3 months ago
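
The wait-instead-of-fail behavior can be pictured as a small pure function that computes how long to sleep. This is a sketch assuming a fixed-window limiter. The function and parameter names are hypothetical, and what the real limiter does once the 60s cap is hit is not stated in the commit.

```rust
use std::time::Duration;

/// Upper bound on a single wait, per the commit subject (max 60s).
pub const MAX_WAIT: Duration = Duration::from_secs(60);

/// Returns how long the caller should sleep before making the next call.
/// `calls_in_window` is the number of calls already made in the current
/// window, and `elapsed` is the time since that window started.
pub fn wait_before_call(
    calls_in_window: u32,
    limit: u32,
    elapsed: Duration,
    window: Duration,
) -> Duration {
    // Under the limit, or the window has already rolled over: no wait.
    if calls_in_window < limit || elapsed >= window {
        return Duration::ZERO;
    }
    // Otherwise sleep for the rest of the window, capped at MAX_WAIT.
    (window - elapsed).min(MAX_WAIT)
}
```

The caller would pass the returned value to its sleep primitive and then retry, instead of returning a rate-limit error to the pipeline.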
oabrivard ed399e9a6e feat: parallelize Phase 1 scrape+classify in batches of 5 3 months ago
oabrivard a5f4239157 fix: distinguish filtered_too_old from filtered_empty in article tracing 3 months ago
oabrivard a760220d44 fix: log LLM calls for source link extraction in llm_call_log 3 months ago
oabrivard fb765d6c8f feat: split model dropdowns — scraping vs websearch in frontend
Replace the single `models` array in `ProviderConfig` and `AdminProvider`
with separate `models_scraping` / `models_websearch` lists. Rename
`ai_model_writing` → `ai_model_websearch` in `UserSettings` and all
references (Settings page, admin Providers page, E2E test, fixtures,
and unit tests). Update i18n label for the second dropdown to
"Modele d'IA (Recherche Web)".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 8d232c1ade feat: split model selection — scraping vs websearch with GPT-5 models
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 97e484e03f docs: add model split implementation plan 3 months ago
oabrivard c321cacd78 docs: update OpenAI model lists to GPT-5 generation 3 months ago
oabrivard 03d4cfb773 docs: add spec for model split — scraping vs websearch 3 months ago
oabrivard 37bc849f92 feat: add clear history button with confirmation on ArticleHistory page
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 7f7d584314 feat: parallel source extraction, shuffle candidates, clear history endpoint
- Remove 10-source cap; all sources are now processed
- Increase max links per source from 10 to 15
- Extract article links in parallel (up to 5 concurrent) using JoinSet
- Shuffle candidate URLs after history filtering to interleave sources
- Add DELETE /api/v1/article-history endpoint to clear all history for a user

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 2d623c6ced docs: add spec for pipeline tweaks (parallel extraction, shuffle, clear history) 3 months ago
oabrivard 48957470ed test: update E2E test for new pipeline (remove deprecated settings) 3 months ago
oabrivard c4a4cd9987 feat: remove deprecated settings from frontend
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 7a8427316c feat: rewrite synthesis pipeline — per-article classify/summarize, no rewrite pass
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard 0b180eb75c refactor: remove old classification, rewrite, and article extraction prompts/schemas
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard bb716b5dc2 feat: add get_last_source_url + remove head_html from ScrapedContent
- Add get_last_source_url() to article_history db module for source rotation
- Remove head_html field from ScrapedContent struct and scrape_url function
- Fix synthesis.rs scrape_single_article_with_llm to pass empty string instead of removed field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard b2dbc3847a feat: add per-article classify/summarize prompt and schema
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard 825b793387 feat: drop source_diversity_window and use_llm_for_article_extraction settings
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
oabrivard d3b63295f6 docs: add algorithm rewrite implementation plan (7 tasks) 3 months ago
oabrivard 1d5dc0596c docs: add spec for algorithm rewrite — per-article classify, no rewrite pass 3 months ago
oabrivard a2fe3f3310 feat: simplify LlmProvider trait to single call_llm method
Replace the three-method LlmProvider trait (generate_search_pass,
generate_rewrite_pass, supports_web_search) and ProviderCapabilities
with a single call_llm method. Update all three provider implementations
(Gemini, OpenAI, Anthropic) and all callers in synthesis.rs,
source_scraper.rs, and api_keys.rs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 months ago
oabrivard e4b76fb06a docs: add plan for simplifying LLM provider trait 3 months ago
oabrivard cb0b7620c9 docs: add spec for simplifying LLM provider trait to single call_llm method 3 months ago
oabrivard 0372a96822 docs: add algorithm.md describing full synthesis generation pipeline 3 months ago
oabrivard d9982b467c test: verify LLM call logs endpoint returns data after generation 3 months ago