Design: Pipeline Tweaks — Parallel Extraction, Shuffle, Clear History
Date: 2026-03-25
Scope: 4 focused improvements to the synthesis pipeline and article history UI
1. Remove max source limit + extract 15 links
Currently max_sources = rotated_sources.len().min(10) caps processing at 10 sources, and max_links = 10 caps extraction at 10 links per source.
Change to:
- Process ALL user sources (no .min(10) cap)
- Extract 15 links per source (max_links = 15)
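To make the effect of the new limits concrete, here is a minimal sketch comparing the old and new values. The function and constant names are hypothetical and only the numbers come from the text above; the real code in synthesis.rs may be organized differently.

```rust
// Limits quoted in the design text.
const OLD_MAX_SOURCES: usize = 10;
const OLD_MAX_LINKS: usize = 10;
const NEW_MAX_LINKS: usize = 15;

/// (sources processed, links per source) under the current limits.
fn old_limits(rotated_sources: &[&str]) -> (usize, usize) {
    (rotated_sources.len().min(OLD_MAX_SOURCES), OLD_MAX_LINKS)
}

/// Same pair after the change: no source cap, 15 links per source.
fn new_limits(rotated_sources: &[&str]) -> (usize, usize) {
    (rotated_sources.len(), NEW_MAX_LINKS)
}

/// Upper bound on candidate URLs before dedup and history filtering.
fn max_candidates((sources, links): (usize, usize)) -> usize {
    sources * links
}
```

For a user with 12 sources, the upper bound on candidate URLs goes from 10 × 10 = 100 to 12 × 15 = 180, and it now grows with the number of sources.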
2. Parallel source extraction (concurrency 5)
Currently source pages are scraped sequentially in a for loop. Change this to a JoinSet with at most 5 concurrent extractions, following the same pattern as article scraping.
Both extract_article_links and extract_article_links_with_llm are async and return Result<Vec<String>>. The parallel loop spawns tasks and collects results.
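The bounded fan-out can be sketched as follows. This is not the JoinSet implementation: to stay free of external crates it swaps in std scoped threads as a stand-in for tokio tasks, and extract_links_stub is a hypothetical synchronous placeholder for extract_article_links. Skipping failed sources and returning results in source order are both assumptions, not stated in the design.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Concurrency limit from the design text.
const MAX_CONCURRENT: usize = 5;

/// Hypothetical stand-in for `extract_article_links`.
fn extract_links_stub(source: &str) -> Result<Vec<String>, String> {
    if source.is_empty() {
        return Err("empty source URL".to_string());
    }
    Ok(vec![format!("{source}/a"), format!("{source}/b")])
}

/// Runs the extractor over every source with at most MAX_CONCURRENT
/// extractions in flight. Failed sources are logged and skipped
/// (an assumption); results are returned in source order.
fn extract_all(sources: &[String]) -> Vec<(String, Vec<String>)> {
    let next_index = AtomicUsize::new(0);
    let results = Mutex::new(Vec::new());

    thread::scope(|scope| {
        for _ in 0..MAX_CONCURRENT.min(sources.len()) {
            scope.spawn(|| loop {
                // Claim the next unprocessed source, or stop when none remain.
                let i = next_index.fetch_add(1, Ordering::Relaxed);
                let Some(source) = sources.get(i) else { break };
                match extract_links_stub(source) {
                    Ok(links) => {
                        results.lock().unwrap().push((i, source.clone(), links));
                    }
                    Err(err) => {
                        eprintln!("skipping {source}: {err}");
                    }
                }
            });
        }
    });

    // Workers finish in arbitrary order, so restore source order here.
    let mut collected = results.into_inner().unwrap();
    collected.sort_by_key(|(i, _, _)| *i);
    collected.into_iter().map(|(_, s, links)| (s, links)).collect()
}
```

In the tokio version the workers would be tasks spawned onto a JoinSet, with a new task spawned each time join_next yields a result while sources remain. The property shown here is the same: never more than 5 extractions run at once.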
3. Shuffle candidates after dedup/history filter
After deduplication and history filtering, before url→source tracking, shuffle candidate_urls using rand::thread_rng(). This ensures articles from different sources are interleaved rather than processed source-by-source.
The rand crate is already a dependency.
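With the rand crate, this step would likely reduce to candidate_urls.shuffle(&mut rand::thread_rng()) via the rand::seq::SliceRandom trait. The sketch below shows what that call does, using a hand-written Fisher-Yates shuffle and a small xorshift generator in place of rand so that it compiles without dependencies. XorShift64 and shuffle are illustrative names, not part of the codebase, and this generator is not a substitute for thread_rng in production.

```rust
/// Tiny xorshift PRNG, used here only as a dependency-free stand-in
/// for `rand::thread_rng()`.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The xorshift state must never be zero.
        XorShift64(if seed == 0 { 1 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// In-place Fisher-Yates shuffle: walk from the last index down and swap
/// each element with a randomly chosen element at or before it.
fn shuffle<T>(items: &mut [T], rng: &mut XorShift64) {
    for i in (1..items.len()).rev() {
        // The modulo introduces a slight bias, acceptable for a sketch.
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        items.swap(i, j);
    }
}
```

The shuffle only reorders candidate_urls; it never adds or drops entries, so the dedup and history filtering done beforehand remain valid.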
4. Clear history button
New API endpoint: DELETE /api/v1/article-history — deletes ALL article_history entries for the authenticated user.
New DB function: delete_all_for_user(pool, user_id).
Frontend: "Effacer l'historique" ("Clear history") button on the ArticleHistory page, with a confirmation dialog.
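The intended semantics of delete_all_for_user can be modeled in memory as below. This is only a sketch: a Vec stands in for the article_history table and the database pool, the HistoryEntry fields are hypothetical, and returning the number of deleted rows is an assumption. The real function would presumably issue a single DELETE scoped to the user's ID.

```rust
/// Hypothetical, simplified shape of an article_history row.
#[derive(Debug, Clone, PartialEq)]
struct HistoryEntry {
    user_id: u64,
    url: String,
}

/// In-memory model of `delete_all_for_user`: removes every entry that
/// belongs to `user_id` and returns how many were removed. Entries of
/// other users are left untouched.
fn delete_all_for_user(entries: &mut Vec<HistoryEntry>, user_id: u64) -> usize {
    let before = entries.len();
    entries.retain(|entry| entry.user_id != user_id);
    before - entries.len()
}
```

Calling it twice for the same user is harmless: the second call finds nothing to remove and returns 0, which suits a DELETE endpoint that clients may retry.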
Files to Modify
- Modify: backend/src/services/synthesis.rs (remove source cap, change to 15 links, parallel extraction with JoinSet, add shuffle)
- Modify: backend/src/db/article_history.rs (add delete_all_for_user)
- Create: handler for DELETE endpoint (add to existing article_history.rs handler)
- Modify: backend/src/router.rs (add DELETE route)
- Modify: frontend/src/api/articleHistory.ts (add clearAll method)
- Modify: frontend/src/pages/ArticleHistory.tsx (add clear button with confirmation)
- Modify: frontend/src/i18n/fr.ts (add labels)