| Many modern sites (Hugo, WordPress, Next.js) load articles via JavaScript but include full article URLs in JSON-LD schema.org markup in the `<head>`. The scraper now extracts these first (highest quality), then falls back to heuristic `<a href>` extraction. Supports ItemList, BlogPosting, NewsArticle, @graph arrays, and mainEntity wrappers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> | 2 months ago | |
|---|---|---|
| migrations | 3 months ago | |
| src | 2 months ago | |
| tests | 2 months ago | |
| Cargo.lock | 3 months ago | |
| Cargo.toml | 3 months ago | |
| Dockerfile | | |
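
The commit message above describes walking JSON-LD markup for article URLs before falling back to link heuristics. The following is a minimal sketch of that traversal, not the repository's actual code. Everything named here is an assumption for illustration: the `Json` enum is a dependency-free stand-in for a parsed JSON value (a real crate would more likely use something like `serde_json::Value`), and `collect_article_urls`, `s`, `obj`, and `example_urls` are hypothetical helpers. It assumes URLs live in a `url` field (or a string `item` on a `ListItem`) and it skips the HTML parsing and the `<a href>` fallback entirely.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a parsed JSON-LD value (numbers, booleans and
/// null are omitted because the traversal below never needs them).
#[derive(Debug, Clone)]
enum Json {
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// schema.org types treated as articles in this sketch.
const ARTICLE_TYPES: [&str; 2] = ["BlogPosting", "NewsArticle"];

/// Recursively collect article URLs, de-duplicated, in document order.
fn collect_article_urls(node: &Json, out: &mut Vec<String>) {
    match node {
        Json::Str(_) => {}
        Json::Arr(items) => {
            for item in items {
                collect_article_urls(item, out);
            }
        }
        Json::Obj(map) => {
            let ty = match map.get("@type") {
                Some(Json::Str(t)) => t.as_str(),
                _ => "",
            };
            // Articles carry their own `url`; a ListItem may carry `url`
            // or a plain-string `item`.
            let mut candidates: Vec<&Json> = Vec::new();
            if ARTICLE_TYPES.contains(&ty) || ty == "ListItem" {
                candidates.extend(map.get("url"));
            }
            if ty == "ListItem" {
                candidates.extend(map.get("item"));
            }
            for candidate in candidates {
                if let Json::Str(url) = candidate {
                    if !out.contains(url) {
                        out.push(url.clone());
                    }
                }
            }
            // Descend into the wrappers named in the commit message.
            for key in ["@graph", "mainEntity", "itemListElement", "item"] {
                if let Some(child) = map.get(key) {
                    collect_article_urls(child, out);
                }
            }
        }
    }
}

/// Small constructors to keep the example readable.
fn s(v: &str) -> Json {
    Json::Str(v.to_string())
}

fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Obj(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Example: a `@graph` array holding a BlogPosting and an ItemList.
fn example_urls() -> Vec<String> {
    let doc = obj(vec![(
        "@graph",
        Json::Arr(vec![
            obj(vec![
                ("@type", s("BlogPosting")),
                ("url", s("https://example.com/a")),
            ]),
            obj(vec![
                ("@type", s("ItemList")),
                (
                    "itemListElement",
                    Json::Arr(vec![obj(vec![
                        ("@type", s("ListItem")),
                        ("url", s("https://example.com/b")),
                    ])]),
                ),
            ]),
        ]),
    )]);
    let mut urls = Vec::new();
    collect_article_urls(&doc, &mut urls);
    urls
}
```

A caller would run this over each JSON-LD script block found in the page and only attempt link-based extraction when the resulting list is empty; how the actual scraper orders or merges the two sources is not shown in the listing above.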