Latest commit (3 months ago):

Add source_url field to ScrapedNewsItem and a trace_article helper that inserts into article_history with full provenance metadata.

Instrument Phase 1 (empty content, history dedup, source diversity) and Phase 2 (homepage filter, cross-phase dedup, history dedup, empty content) so every dropped article is recorded with its filter reason. Replace the old insert_urls call with per-article trace_article calls for used articles, preserving dedup semantics via url_hash.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

| Name | Last commit |
|---|---|
| migrations | 3 months ago |
| src | 3 months ago |
| tests | 3 months ago |
| Cargo.lock | |
| Cargo.toml | |
| Dockerfile | |
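
The commit message above describes recording every article, used or dropped, together with the reason it was filtered. The following is a minimal sketch of that idea, not the repository's actual code. Only the names `ScrapedNewsItem`, `source_url`, `trace_article`, `article_history`, and `url_hash` come from the commit message; the field layout, the `FilterReason` and `Outcome` enums, the in-memory `ArticleHistory` stand-in for the `article_history` table, and the `phase1_filter` function are all assumptions made for illustration. The source diversity filter and all of Phase 2 are omitted for brevity.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical filter reasons, named after the filters listed in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilterReason {
    EmptyContent,
    HistoryDedup,
    SourceDiversity,
    HomepageFilter,
    CrossPhaseDedup,
}

/// Whether an article was used or dropped (assumed representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Used,
    Dropped(FilterReason),
}

/// Assumed shape; only `source_url` is confirmed by the commit message.
#[derive(Debug, Clone)]
pub struct ScrapedNewsItem {
    pub url: String,
    pub source_url: String,
    pub content: String,
}

/// One provenance record (assumed columns).
#[derive(Debug, Clone)]
pub struct HistoryRow {
    pub url_hash: u64,
    pub url: String,
    pub source_url: String,
    pub phase: u8,
    pub outcome: Outcome,
}

/// In-memory stand-in for the article_history table.
#[derive(Default)]
pub struct ArticleHistory {
    rows: Vec<HistoryRow>,
}

/// Illustrative hash only: DefaultHasher output is not guaranteed stable
/// across Rust releases, so a persisted url_hash would need a fixed algorithm.
pub fn url_hash(url: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    url.hash(&mut hasher);
    hasher.finish()
}

impl ArticleHistory {
    /// Dedup check keyed on url_hash. Assumption: any earlier row counts,
    /// whether the article was used or dropped.
    pub fn contains(&self, url: &str) -> bool {
        let hash = url_hash(url);
        self.rows.iter().any(|row| row.url_hash == hash)
    }

    /// Record one article together with its phase and outcome.
    pub fn trace_article(&mut self, item: &ScrapedNewsItem, phase: u8, outcome: Outcome) {
        self.rows.push(HistoryRow {
            url_hash: url_hash(&item.url),
            url: item.url.clone(),
            source_url: item.source_url.clone(),
            phase,
            outcome,
        });
    }

    pub fn rows(&self) -> &[HistoryRow] {
        &self.rows
    }
}

/// Simplified Phase 1: every item is traced exactly once, kept or not.
pub fn phase1_filter(
    items: Vec<ScrapedNewsItem>,
    history: &mut ArticleHistory,
) -> Vec<ScrapedNewsItem> {
    let mut kept = Vec::new();
    for item in items {
        if item.content.trim().is_empty() {
            history.trace_article(&item, 1, Outcome::Dropped(FilterReason::EmptyContent));
        } else if history.contains(&item.url) {
            history.trace_article(&item, 1, Outcome::Dropped(FilterReason::HistoryDedup));
        } else {
            history.trace_article(&item, 1, Outcome::Used);
            kept.push(item);
        }
    }
    kept
}
```

In this sketch a used article is traced as soon as it passes the filters, so a second copy of the same URL later in the batch is caught by the history check. The real pipeline may trace used articles at a different point; the commit message only states that `trace_article` replaces the old `insert_urls` call.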