oabrivard 3d790e7ce7 feat: extract article URLs from JSON-LD structured data in source pages
Many modern sites (Hugo, WordPress, Next.js) load articles via JavaScript
but include full article URLs in JSON-LD schema.org markup in the <head>.
The scraper now extracts these first (highest quality), then falls back
to <a href> heuristic extraction. Supports ItemList, BlogPosting,
NewsArticle, @graph arrays, and mainEntity wrappers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2 months ago
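
The extraction order described in the commit message above can be sketched roughly as follows. This is a minimal illustration in standard-library Python, not the repository's actual backend code, which is not shown here. The names `extract_jsonld_urls`, `_JsonLdCollector`, `_walk`, and `ARTICLE_TYPES` are hypothetical, and the handling of `ListItem` entries (`url` first, then `item`) is an assumption based on common schema.org usage.

```python
import json
from html.parser import HTMLParser

# Types named in the commit message; the real scraper may accept more.
ARTICLE_TYPES = {"BlogPosting", "NewsArticle"}


class _JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def _types(node):
    """Return the node's @type values as a set of strings."""
    raw = node.get("@type", [])
    values = raw if isinstance(raw, list) else [raw]
    return {v for v in values if isinstance(v, str)}


def _list_item_url(element):
    """Assumed ListItem shapes: a bare URL, a 'url' key, or an 'item' value."""
    if isinstance(element, str):
        return element
    if isinstance(element, dict):
        if isinstance(element.get("url"), str):
            return element["url"]
        item = element.get("item")
        if isinstance(item, str):
            return item
        if isinstance(item, dict) and isinstance(item.get("url"), str):
            return item["url"]
    return None


def _walk(node, urls):
    """Recursively collect article URLs from a parsed JSON-LD value."""
    if isinstance(node, list):
        for child in node:
            _walk(child, urls)
        return
    if not isinstance(node, dict):
        return
    types = _types(node)
    if types & ARTICLE_TYPES and isinstance(node.get("url"), str):
        urls.append(node["url"])
    if "ItemList" in types:
        elements = node.get("itemListElement", [])
        if not isinstance(elements, list):
            elements = [elements]
        for element in elements:
            url = _list_item_url(element)
            if url:
                urls.append(url)
    # Descend into the wrappers mentioned in the commit message.
    for key in ("@graph", "mainEntity"):
        if key in node:
            _walk(node[key], urls)


def extract_jsonld_urls(html):
    """Return de-duplicated article URLs found in JSON-LD, in document order."""
    collector = _JsonLdCollector()
    collector.feed(html)
    urls = []
    for raw in collector.blocks:
        try:
            _walk(json.loads(raw), urls)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the page
    return list(dict.fromkeys(urls))
```

A caller would presumably try `extract_jsonld_urls(page_html)` first and run the `<a href>` heuristic only when it returns an empty list, which matches the fallback order in the commit message. Relative URLs are not resolved in this sketch.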
.claude
audits/2026-03-27 Code improvement after a code review with Codex 2 months ago
backend feat: extract article URLs from JSON-LD structured data in source pages 2 months ago
docs fix: P2 audit items — use API client for stop, replace raw buttons, remove deprecated doc refs 2 months ago
e2e fix: return 204 No Content from preferred sources endpoint 2 months ago
frontend fix: swap date and source URL positions in article cards 2 months ago
scripts fix: run seed.ts before E2E tests to create test users and sessions 3 months ago
.env.example chore: remove SESSION_SECRET and wrap master_encryption_key in Arc 3 months ago
.gitignore
AGENTS.md
CLAUDE.md docs: update CLAUDE.md for consolidated documentation and current features 3 months ago
docker-compose.yml