From 2d623c6ced53f8bebd33bed3054f73da4c926baf Mon Sep 17 00:00:00 2001
From: oabrivard
Date: Wed, 25 Mar 2026 07:51:44 +0100
Subject: [PATCH] docs: add spec for pipeline tweaks (parallel extraction, shuffle, clear history)

---
 .../2026-03-25-pipeline-tweaks-design.md | 44 +++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-03-25-pipeline-tweaks-design.md

diff --git a/docs/superpowers/specs/2026-03-25-pipeline-tweaks-design.md b/docs/superpowers/specs/2026-03-25-pipeline-tweaks-design.md
new file mode 100644
index 0000000..1226a4a
--- /dev/null
+++ b/docs/superpowers/specs/2026-03-25-pipeline-tweaks-design.md
@@ -0,0 +1,44 @@
+# Design: Pipeline Tweaks — Parallel Extraction, Shuffle, Clear History
+
+**Date**: 2026-03-25
+**Scope**: 4 focused improvements to the synthesis pipeline and article history UI
+
+---
+
+## 1. Remove max source limit + extract 15 links
+
+Currently `max_sources = rotated_sources.len().min(10)` caps processing at 10 sources, and `max_links = 10` caps extraction at 10 links per source.
+
+Change to:
+- Process ALL user sources (no `.min(10)` cap)
+- Extract 15 links per source (`max_links = 15`)
+
+## 2. Parallel source extraction (concurrency 5)
+
+Currently source pages are scraped sequentially in a `for` loop. Change this to use a `JoinSet` with at most 5 concurrent extractions, following the same pattern as article scraping.
+
+Both `extract_article_links` and `extract_article_links_with_llm` are async and return a `Result`. The parallel loop spawns the tasks and collects their results.
+
+## 3. Shuffle candidates after dedup/history filter
+
+After deduplication and history filtering, and before url→source tracking, shuffle `candidate_urls` using `rand::thread_rng()`. This interleaves articles from different sources instead of processing them source-by-source.
+
+The `rand` crate is already a dependency.
+
+## 4. Clear history button
+
+New API endpoint: `DELETE /api/v1/article-history` — deletes ALL `article_history` entries for the authenticated user.
+
+New DB function: `delete_all_for_user(pool, user_id)`.
+
+Frontend: "Effacer l'historique" ("Clear history") button on the ArticleHistory page, with a confirmation dialog.
+
+## Files to Modify
+
+- **Modify:** `backend/src/services/synthesis.rs` — remove source cap, change to 15 links, parallel extraction with JoinSet, add shuffle
+- **Modify:** `backend/src/db/article_history.rs` — add `delete_all_for_user`
+- **Create:** handler for DELETE endpoint (add to existing `article_history.rs` handler)
+- **Modify:** `backend/src/router.rs` — add DELETE route
+- **Modify:** `frontend/src/api/articleHistory.ts` — add `clearAll` method
+- **Modify:** `frontend/src/pages/ArticleHistory.tsx` — add clear button with confirmation
+- **Modify:** `frontend/src/i18n/fr.ts` — add labels
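
As a rough illustration of sections 1 and 2 of the spec above, the sketch below processes every source, keeps at most 15 links per source, and never runs more than 5 extractions at once. It is not the project's code. To stay runnable with the standard library alone, it swaps the tokio `JoinSet` named in the spec for `std::thread::scope` with a shared work queue. The names `extract_links_stub`, `extract_all`, `MAX_LINKS`, and `MAX_CONCURRENCY` are hypothetical, and the stub fabricates links instead of scraping anything.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

const MAX_LINKS: usize = 15; // per-source link cap (section 1)
const MAX_CONCURRENCY: usize = 5; // concurrent extraction cap (section 2)

/// Hypothetical stand-in for the real extractor: fabricates 20 links per source.
fn extract_links_stub(source: &str) -> Result<Vec<String>, String> {
    if source.is_empty() {
        return Err("empty source URL".to_string());
    }
    Ok((0..20).map(|i| format!("{source}/article-{i}")).collect())
}

/// Runs the extractor over ALL sources with at most MAX_CONCURRENCY in flight.
/// Returns the collected links and the peak number of simultaneous extractions.
fn extract_all(sources: &[String]) -> (Vec<String>, usize) {
    let queue: Mutex<VecDeque<&String>> = Mutex::new(sources.iter().collect());
    let links: Mutex<Vec<String>> = Mutex::new(Vec::new());
    let in_flight = AtomicUsize::new(0);
    let peak = AtomicUsize::new(0);

    thread::scope(|scope| {
        // A fixed pool of workers bounds concurrency; no source is dropped.
        for _ in 0..MAX_CONCURRENCY.min(sources.len()) {
            scope.spawn(|| loop {
                // Take the next source; the lock is released before extracting.
                let next = queue.lock().unwrap().pop_front();
                let Some(source) = next else { break };

                let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                let result = extract_links_stub(source);
                in_flight.fetch_sub(1, Ordering::SeqCst);

                match result {
                    // Keep only the first MAX_LINKS links of each source.
                    Ok(found) => links
                        .lock()
                        .unwrap()
                        .extend(found.into_iter().take(MAX_LINKS)),
                    // A failing source is logged and skipped, not fatal.
                    Err(err) => eprintln!("skipping {source}: {err}"),
                }
            });
        }
    });

    (links.into_inner().unwrap(), peak.load(Ordering::SeqCst))
}

fn main() {
    let sources: Vec<String> = (0..12)
        .map(|i| format!("https://example.com/s{i}"))
        .collect();
    let (links, peak) = extract_all(&sources);
    println!("{} links, peak concurrency {}", links.len(), peak);
}
```

In the async pipeline the spec describes, the worker threads would presumably become tasks in the `JoinSet`. Skipping a failed source rather than aborting the run is an assumption of this sketch, since the spec does not say how extraction errors are handled.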