ai_synth

Commit Graph

Author	SHA1	Message	Date
oabrivard	45c9e71589	fix: enforce max_items_per_category in JSON schema and prompt The LLM was returning only 1 article per category despite the user setting 4. - Added minItems/maxItems to the category array schema (enforced by OpenAI strict mode) - Changed prompt from "au maximum N actualites" to "exactement N actualites" - Schema builder now takes max_items_per_category parameter Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	0b0702de39	fix: strip null bytes from LLM output before saving to PostgreSQL JSONB LLM output occasionally contains \u0000 null bytes (e.g., "annonc\u0000...") which PostgreSQL rejects in JSONB columns. Added sanitize_json_null_bytes() that recursively strips null bytes from all string values before DB insert. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	3fe667591d	fix: LLM providers use own HTTP client with 120s timeout (was sharing scraper's 15s) The scraper client (build_scraper_client) has a 15s timeout appropriate for web scraping, but LLM API calls — especially with web search — take 30-60s. LLM providers now build their own reqwest client with 120s timeout via build_llm_client(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	004f08f385	fix: runtime bugs found during first Docker run + integration tests Bugs fixed: - resolve_model queried non-existent admin_provider_models table (use JSONB query on admin_providers) - key_prefix VARCHAR(10) too short for 11-char prefix (migration to VARCHAR(12)) - API key test schema missing additionalProperties: false (OpenAI strict mode) - CSP missing font-src data: directive (PDF font embedding blocked) - Magic link URL not logged in test mode (can't verify without real email) - Rust 1.85 Docker image too old for dependencies (bumped to 1.88) Tests added to prevent recurrence: - schema_meets_openai_strict_mode_requirements: validates additionalProperties on all objects - key_prefix_full_length_stored_in_db: verifies 11-char prefix survives DB round-trip - generate_pipeline_resolves_model_from_admin_config: exercises full generation pipeline Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	b961f82f01	refactor: add UserRateLimitEntry constructor and settings_changed method Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	c1f2f1456f	refactor: simplify recent changes — extract helper, named struct, atomic entry, pre-alloc - Extract auth::create_and_send_magic_link() to deduplicate token rollback logic - Replace (i32, i32, RateLimiter) tuple with named UserRateLimitEntry struct - Use DashMap entry API for atomic rate limiter lookup (fixes TOCTOU race) - Pre-allocate scraper body Vec from Content-Length when available Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	54d54f2a06	fix: architect assessment remediation — 6 issues across backend, frontend, and infra - Wire hardened scraper client into runtime (SSRF redirect validation was defined but unused) - Stream scraper body with per-chunk size limit instead of post-download check (DoS/OOM) - Persist user rate-limit overrides across generation jobs via AppState DashMap - Roll back magic-link token on email send failure to prevent quota exhaustion - Fix API error UX: prefer human message over machine error code in frontend - Unwrap GET /syntheses { items } wrapper in frontend API layer (contract mismatch) - Bind Postgres to localhost in docker-compose (was exposed on all interfaces) - Fix CLAUDE.md: runtime queries not compile-time, 10 migrations not 9 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	98528f51bd	Fix rate limiter bug, simplify v2 code Bug fix: - Per-generation rate limiter was creating a new instance on every check, making user rate limit overrides non-functional. Fixed by creating the limiter once at pipeline start and reusing for both passes. Simplifications: - Extract spawn_task closure in scrape_articles (deduplicate spawn blocks) - Use idiomatic if let Ok(...) instead of if let Some(..).ok() in scraper - Replace manual loop with iterator chain in export_keys handler - Simplify check_rate_limit to single boolean check - Simplify handleImport settings merge (spread already provides defaults) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	9b994e0528	v2: pipeline user model selection, rate limiter, URL filter, original title, null-safe sections - resolve_provider_and_key() now respects user ai_provider preference - Dual model resolution: ai_model for search pass, ai_model_writing for rewrite pass - Per-generation rate limiter with user override support - Homepage URL filter removes domain-only URLs after search pass - ScrapedNewsItem gains original_title field populated from page <title> - SynthesisResponse::try_from handles null sections gracefully (returns empty vec) - Search prompt warns LLM against returning homepage URLs - Rewrite prompt instructs LLM to use originalTitle with language preservation rules Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	631bd43b9f	Phase 6: Multi-provider support with OpenAI and Anthropic Backend: - OpenAiProvider: Responses API with web_search_preview (pass 1), Chat Completions with json_schema structured output (pass 2) - AnthropicProvider: Messages API with web_search tool (pass 1), schema-in-prompt for structured output, code fence stripping (pass 2) - Pipeline adaptation: skip scrape+rewrite when >70% of search URLs are valid - Provider factory updated for all three providers - Error sanitization extended for Anthropic key patterns (sk-ant-) - 44 new unit tests (OpenAI, Anthropic, factory, pipeline heuristic) Frontend: - Provider-specific info text below model selection - Web search support badges (green/gray) - Generate page shows selected provider and model - Warning banner when provider lacks web search - Provider utility module with 10 tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	aa6f1ba76b	Phase 5: Generation pipeline with SSE progress, syntheses CRUD Backend: - Full 2-pass generation pipeline: LLM search -> URL scraping -> LLM rewrite - Async generation with tokio::spawn, JobStore with per-user concurrency limit - SSE progress streaming via axum::response::Sse + tokio::sync::watch - Syntheses CRUD: list (paginated), get (ownership check), delete - Prompt construction ported from original geminiService.ts - Parallel URL scraping with bounded concurrency (max 10) - Graceful partial failure handling (some URLs fail -> continue) - 36 new unit tests, 16 integration tests Frontend: - Home dashboard: synthesis card grid, week badges, delete with confirmation - Generate page: SSE-driven progress bar, step checklist, auto-redirect - Synthesis detail: section-by-section display, external links, delete - SSE client helper with auto-reconnect (exponential backoff) - Date utilities with French locale formatting Critical fixes applied: - SSE EventSource now sends credentials (withCredentials: true) - Gemini error logging sanitized to prevent API key leak in logs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
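
Commit 9b994e0528 mentions a homepage URL filter that removes domain-only URLs after the search pass. The log does not show the implementation, so the following is only a minimal sketch of what such a filter might look like. The names `is_homepage_url` and `filter_homepage_urls` are hypothetical, and treating a URL that carries only a query string or fragment as a homepage is an assumption, not something the commit states.

```rust
/// Returns true when `url` has no path beyond the root,
/// e.g. "https://example.com" or "https://example.com/".
/// Hypothetical helper, not the repository's actual code.
fn is_homepage_url(url: &str) -> bool {
    // Drop the scheme ("https://") if one is present.
    let rest = url.split_once("://").map_or(url, |(_, r)| r);
    // Assumption: ignore any query string or fragment.
    let rest = rest.split(|c| c == '?' || c == '#').next().unwrap_or("");
    // Everything after the first '/' is the path.
    match rest.split_once('/') {
        None => true,
        Some((_, path)) => path.trim_matches('/').is_empty(),
    }
}

/// Keeps only URLs that point below the site root.
fn filter_homepage_urls(urls: Vec<String>) -> Vec<String> {
    urls.into_iter().filter(|u| !is_homepage_url(u)).collect()
}
```

Calling `filter_homepage_urls` on the list returned by the search pass would leave only article-level links for the scraper. A real implementation would more likely parse URLs with a dedicated URL library than with string splitting.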

11 Commits (45c9e715899a507284bb4cc6fdc4aa1418babf92)
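
Commit 0b0702de39 describes `sanitize_json_null_bytes()` as recursively stripping null bytes from all string values before the database insert. The sketch below illustrates that idea under stated assumptions: the `Json` enum is a hypothetical stand-in for whatever JSON value type the project uses, and the in-place `&mut` signature is a guess, since the log gives only the function name.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal JSON value type, used here only to keep
/// the example self-contained.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Removes every U+0000 character from all string values, recursing
/// through arrays and objects. Object keys are left untouched.
fn sanitize_json_null_bytes(value: &mut Json) {
    match value {
        Json::String(s) => {
            // Only rewrite the string when it actually contains a null.
            if s.contains('\0') {
                s.retain(|c| c != '\0');
            }
        }
        Json::Array(items) => {
            for item in items.iter_mut() {
                sanitize_json_null_bytes(item);
            }
        }
        Json::Object(map) => {
            for v in map.values_mut() {
                sanitize_json_null_bytes(v);
            }
        }
        // Scalars other than strings cannot carry null bytes.
        Json::Null | Json::Bool(_) | Json::Number(_) => {}
    }
}
```

Running the function on the parsed LLM output just before the insert means a value such as "annonc\u0000e" is stored as "annonce" instead of being rejected by the JSONB column.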