The LLM was returning only 1 article per category despite the user setting 4.
- Added minItems/maxItems to the category array schema (enforced by OpenAI strict mode)
- Changed prompt from "au maximum N actualites" to "exactement N actualites"
- Schema builder now takes max_items_per_category parameter
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LLM output occasionally contains \u0000 null bytes (e.g., "annonc\u0000...")
which PostgreSQL rejects in JSONB columns. Added sanitize_json_null_bytes()
that recursively strips null bytes from all string values before DB insert.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The scraper client (build_scraper_client) has a 15s timeout appropriate for web
scraping, but LLM API calls — especially with web search — take 30-60s. LLM
providers now build their own reqwest client with 120s timeout via build_llm_client().
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bugs fixed:
- resolve_model queried non-existent admin_provider_models table (use JSONB query on admin_providers)
- key_prefix VARCHAR(10) too short for 11-char prefix (migration to VARCHAR(12))
- API key test schema missing additionalProperties: false (OpenAI strict mode)
- CSP missing font-src data: directive (PDF font embedding blocked)
- Magic link URL not logged in test mode (can't verify without real email)
- Rust 1.85 Docker image too old for dependencies (bumped to 1.88)
Tests added to prevent recurrence:
- schema_meets_openai_strict_mode_requirements: validates additionalProperties on all objects
- key_prefix_full_length_stored_in_db: verifies 11-char prefix survives DB round-trip
- generate_pipeline_resolves_model_from_admin_config: exercises full generation pipeline
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract auth::create_and_send_magic_link() to deduplicate token rollback logic
- Replace (i32, i32, RateLimiter) tuple with named UserRateLimitEntry struct
- Use DashMap entry API for atomic rate limiter lookup (fixes TOCTOU race)
- Pre-allocate scraper body Vec from Content-Length when available
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Wire hardened scraper client into runtime (SSRF redirect validation was defined but unused)
- Stream scraper body with per-chunk size limit instead of post-download check (DoS/OOM)
- Persist user rate-limit overrides across generation jobs via AppState DashMap
- Roll back magic-link token on email send failure to prevent quota exhaustion
- Fix API error UX: prefer human message over machine error code in frontend
- Unwrap GET /syntheses { items } wrapper in frontend API layer (contract mismatch)
- Bind Postgres to localhost in docker-compose (was exposed on all interfaces)
- Fix CLAUDE.md: runtime queries not compile-time, 10 migrations not 9
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug fix:
- Per-generation rate limiter was creating a new instance on every check,
making user rate limit overrides non-functional. Fixed by creating the
limiter once at pipeline start and reusing for both passes.
Simplifications:
- Extract spawn_task closure in scrape_articles (deduplicate spawn blocks)
- Use idiomatic if let Ok(...) instead of if let Some(..).ok() in scraper
- Replace manual loop with iterator chain in export_keys handler
- Simplify check_rate_limit to single boolean check
- Simplify handleImport settings merge (spread already provides defaults)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- resolve_provider_and_key() now respects user ai_provider preference
- Dual model resolution: ai_model for search pass, ai_model_writing for rewrite pass
- Per-generation rate limiter with user override support
- Homepage URL filter removes domain-only URLs after search pass
- ScrapedNewsItem gains original_title field populated from page <title>
- SynthesisResponse::try_from handles null sections gracefully (returns empty vec)
- Search prompt warns LLM against returning homepage URLs
- Rewrite prompt instructs LLM to use originalTitle with language preservation rules
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend:
- OpenAiProvider: Responses API with web_search_preview (pass 1),
Chat Completions with json_schema structured output (pass 2)
- AnthropicProvider: Messages API with web_search tool (pass 1),
schema-in-prompt for structured output, code fence stripping (pass 2)
- Pipeline adaptation: skip scrape+rewrite when >70% of search URLs are valid
- Provider factory updated for all three providers
- Error sanitization extended for Anthropic key patterns (sk-ant-)
- 44 new unit tests (OpenAI, Anthropic, factory, pipeline heuristic)
Frontend:
- Provider-specific info text below model selection
- Web search support badges (green/gray)
- Generate page shows selected provider and model
- Warning banner when provider lacks web search
- Provider utility module with 10 tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>