docs: add Autre fill-up implementation plan
parent
aebd436a91
commit
f5f0656604
@ -0,0 +1,256 @@
|
||||
# "Autre" Fill-Up — Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** When total synthesis articles are below 75% capacity, expand "Autre" with overflow articles from classification.
|
||||
|
||||
**Architecture:** Modify `parse_classification_response` to return overflow. Add fill-up logic between Phase 2 and rewrite pass. Single-file change.
|
||||
|
||||
**Tech Stack:** Rust
|
||||
|
||||
**Spec:** `docs/superpowers/specs/2026-03-24-autre-fillup-design.md`
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Modify `parse_classification_response` to collect overflow
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/src/services/synthesis.rs`
|
||||
|
||||
- [ ] **Step 1: Add constant**
|
||||
|
||||
At the top of the helper functions section, add:
|
||||
```rust
|
||||
/// Minimum fill ratio for synthesis. If total articles are below this percentage
|
||||
/// of the maximum capacity, overflow articles are added to "Autre" to compensate.
|
||||
const SYNTHESIS_MIN_FILL_RATIO: f64 = 0.75;
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Change `parse_classification_response` return type and body**
|
||||
|
||||
Change return type from `HashMap<String, Vec<ScrapedNewsItem>>` to `(HashMap<String, Vec<ScrapedNewsItem>>, Vec<ScrapedNewsItem>)`.
|
||||
|
||||
Add `let mut overflow: Vec<ScrapedNewsItem> = Vec::new();` at the start of the function.
|
||||
|
||||
At the two drop points where articles are silently discarded, push to overflow instead:
|
||||
|
||||
**Drop point 1 (line ~1735-1743)** — category full AND Autre full:
|
||||
```rust
|
||||
if filled >= max {
|
||||
let autre_filled = filled_counts.get("Autre").copied().unwrap_or(0);
|
||||
if autre_filled < max {
|
||||
result.entry("category_autre".to_string()).or_default().push(articles[index].clone());
|
||||
*filled_counts.entry("Autre".to_string()).or_insert(0) += 1;
|
||||
assigned_indices.insert(index);
|
||||
} else {
|
||||
// Both category and Autre full — collect as overflow
|
||||
overflow.push(articles[index].clone());
|
||||
assigned_indices.insert(index);
|
||||
}
|
||||
continue;
|
||||
}
|
||||
```
|
||||
|
||||
**Drop point 2 (line ~1752-1759)** — unclassified, Autre full:
|
||||
```rust
|
||||
for (i, article) in articles.iter().enumerate() {
|
||||
if !assigned_indices.contains(&i) {
|
||||
let autre_filled = filled_counts.get("Autre").copied().unwrap_or(0);
|
||||
if autre_filled < max {
|
||||
result.entry("category_autre".to_string()).or_default().push(article.clone());
|
||||
*filled_counts.entry("Autre".to_string()).or_insert(0) += 1;
|
||||
} else {
|
||||
// Autre full — collect as overflow
|
||||
overflow.push(article.clone());
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Change the return from `result` to `(result, overflow)`.
|
||||
|
||||
- [ ] **Step 3: Update 2 production call sites**
|
||||
|
||||
**Phase 1 (line ~494):**
|
||||
```rust
|
||||
// Before:
|
||||
let phase1_classified = parse_classification_response(...);
|
||||
|
||||
// After:
|
||||
let (phase1_classified, phase1_overflow) = parse_classification_response(...);
|
||||
```
|
||||
|
||||
**Phase 2 (line ~701):**
|
||||
```rust
|
||||
// Before:
|
||||
let phase2_classified = parse_classification_response(...);
|
||||
|
||||
// After:
|
||||
let (phase2_classified, phase2_overflow) = parse_classification_response(...);
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Update 5 existing tests**
|
||||
|
||||
All 5 tests that call `parse_classification_response` need to destructure the tuple. Change each from:
|
||||
```rust
|
||||
let result = parse_classification_response(...);
|
||||
```
|
||||
To:
|
||||
```rust
|
||||
let (result, _overflow) = parse_classification_response(...);
|
||||
```
|
||||
|
||||
For the test `classification_respects_max_per_category`, also verify overflow is captured:
|
||||
```rust
|
||||
let (result, overflow) = parse_classification_response(&response, &articles, &categories, 2, &mut filled);
|
||||
assert_eq!(result.get("category_0").map(|v| v.len()), Some(2));
|
||||
assert!(result.get("category_autre").map(|v| v.len()).unwrap_or(0) > 0);
|
||||
// Articles that overflowed both AI News AND Autre should be in overflow
|
||||
assert!(!overflow.is_empty() || filled.get("Autre").copied().unwrap_or(0) <= 2);
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Run tests**
|
||||
|
||||
Run: `cd backend && cargo test --lib`
|
||||
Expected: all tests pass
|
||||
|
||||
- [ ] **Step 6: Commit**
|
||||
|
||||
```bash
|
||||
git add backend/src/services/synthesis.rs
|
||||
git commit -m "feat: parse_classification_response collects overflow articles"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Add fill-up logic and accumulate overflow in pipeline
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/src/services/synthesis.rs`
|
||||
|
||||
- [ ] **Step 1: Accumulate overflow in `run_generation_inner`**
|
||||
|
||||
Add `let mut all_overflow: Vec<ScrapedNewsItem> = Vec::new();` near the other `all_scraped` initialization (around line 305).
|
||||
|
||||
After Phase 1 classification (where `phase1_overflow` is captured), add:
|
||||
```rust
|
||||
all_overflow.extend(phase1_overflow);
|
||||
```
|
||||
|
||||
After Phase 2 classification (where `phase2_overflow` is captured), add:
|
||||
```rust
|
||||
all_overflow.extend(phase2_overflow);
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Add fill-up logic before rewrite pass**
|
||||
|
||||
Between the "COMBINED REWRITE PASS" header (line ~719) and the empty-check (line ~722), insert:
|
||||
|
||||
```rust
|
||||
// Fill-up: if total articles are below 75% of max, expand "Autre" with overflow
|
||||
let total_articles: usize = all_scraped.values().map(|v| v.len()).sum();
|
||||
let max_articles = settings.categories.len() * settings.max_items_per_category as usize;
|
||||
let target = (SYNTHESIS_MIN_FILL_RATIO * max_articles as f64).ceil() as usize;
|
||||
let shortfall = target.saturating_sub(total_articles);
|
||||
|
||||
if shortfall > 0 && !all_overflow.is_empty() {
|
||||
tracing::info!(
|
||||
total = total_articles,
|
||||
target = target,
|
||||
shortfall = shortfall,
|
||||
overflow_available = all_overflow.len(),
|
||||
"Synthesis under-filled, adding overflow to Autre"
|
||||
);
|
||||
|
||||
// Count domain occurrences across all categories for source diversity enforcement
|
||||
let mut domain_counts: HashMap<String, usize> = HashMap::new();
|
||||
for items in all_scraped.values() {
|
||||
for item in items {
|
||||
if let Some(domain) = extract_domain(&item.url) {
|
||||
*domain_counts.entry(domain).or_insert(0) += 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
let max_per_source = settings.max_articles_per_source as usize;
|
||||
let mut added = 0usize;
|
||||
|
||||
for article in all_overflow {
|
||||
if added >= shortfall {
|
||||
break;
|
||||
}
|
||||
// Enforce source diversity on overflow articles
|
||||
if let Some(domain) = extract_domain(&article.url) {
|
||||
let count = domain_counts.get(&domain).copied().unwrap_or(0);
|
||||
if count >= max_per_source {
|
||||
continue;
|
||||
}
|
||||
*domain_counts.entry(domain).or_insert(0) += 1;
|
||||
}
|
||||
all_scraped
|
||||
.entry("category_autre".to_string())
|
||||
.or_default()
|
||||
.push(article);
|
||||
added += 1;
|
||||
}
|
||||
|
||||
if added > 0 {
|
||||
tracing::info!(added = added, "Added overflow articles to Autre");
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Add unit tests for fill-up calculation**
|
||||
|
||||
```rust
|
||||
// ── fill-up calculation tests ───────────────────────────────
|
||||
|
||||
#[test]
|
||||
fn fillup_target_calculation() {
|
||||
// 4 categories x 4 items = 16 max, 75% = 12
|
||||
let max = 4 * 4;
|
||||
let target = (0.75_f64 * max as f64).ceil() as usize;
|
||||
assert_eq!(target, 12);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn fillup_shortfall_saturating() {
|
||||
// If total exceeds target, shortfall should be 0 (not panic)
|
||||
let target: usize = 12;
|
||||
let total: usize = 15;
|
||||
let shortfall = target.saturating_sub(total);
|
||||
assert_eq!(shortfall, 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn classification_overflow_collected_when_all_full() {
|
||||
use crate::models::synthesis::ScrapedNewsItem;
|
||||
let articles: Vec<ScrapedNewsItem> = (0..6).map(|i| ScrapedNewsItem {
|
||||
title: format!("Art{}", i), url: format!("https://a.com/{}", i),
|
||||
summary: "s".into(), original_title: "t".into(), scraped_content: "c".into(),
|
||||
}).collect();
|
||||
let categories = vec!["AI News".to_string(), "Autre".to_string()];
|
||||
let response = serde_json::json!({
|
||||
"assignments": (0..6).map(|i| serde_json::json!({"index": i, "category": "AI News"})).collect::<Vec<_>>()
|
||||
});
|
||||
let mut filled = HashMap::new();
|
||||
let (result, overflow) = parse_classification_response(&response, &articles, &categories, 2, &mut filled);
|
||||
|
||||
// AI News capped at 2, Autre gets 2 overflow, remaining 2 go to overflow vec
|
||||
assert_eq!(result.get("category_0").map(|v| v.len()), Some(2));
|
||||
assert_eq!(result.get("category_autre").map(|v| v.len()), Some(2));
|
||||
assert_eq!(overflow.len(), 2, "2 articles should overflow when both AI News and Autre are full");
|
||||
}
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Run tests**
|
||||
|
||||
Run: `cd backend && cargo test --lib`
|
||||
Expected: all tests pass
|
||||
|
||||
- [ ] **Step 5: Commit**
|
||||
|
||||
```bash
|
||||
git add backend/src/services/synthesis.rs
|
||||
git commit -m "feat: Autre fill-up to 75% synthesis target with source diversity enforcement"
|
||||
```
|
||||
Loading…
Reference in New Issue