ai_synth

Commit Graph

Author	SHA1	Message	Date
oabrivard	7f647bc656	refactor: extract JobStore to services/job_store.rs Moves JobEntry, JobStore, ProgressEvent, JOB_TTL, and emit_progress to a dedicated module. Updates imports in synthesis.rs, generation.rs, scheduler.rs, and app_state.rs. synthesis.rs re-exports for backward compatibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	384649b2b6	feat: add theme schedules — model, DB, CRUD handler, routes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	e43a4d2180	feat: add preferred sources — prioritized during synthesis generation Users can mark sources as preferred via star buttons on the theme page. Preferred sources are processed first in the pipeline (ordered before non-preferred in waves, shuffled separately then merged). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	6f3e6883c9	feat: add stop generation button — saves partial synthesis on cancel Adds Arc<AtomicBool> cancellation flag to JobStore/JobEntry. The pipeline checks the flag before each wave and after each batch, then saves whatever articles have been collected. A new POST /syntheses/generate/:job_id/stop endpoint sets the flag. The frontend shows a red stop button during generation and POSTs to the stop endpoint on click. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	196005a27b	feat: multi-theme Phase 1 — settings split, sources/syntheses theme_id, pipeline theme-aware Remove content settings from settings table (moved to themes). Add theme_id to sources and syntheses. Pipeline loads content settings from the selected theme. Generate endpoint requires theme_id. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	10b8d950b9	feat: add themes CRUD endpoints Implements GET/POST/PUT/DELETE /api/v1/themes handlers following the same patterns as sources.rs, registers the module, and wires up routes in the router. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	ccecaa2d13	refactor: add provider_override for pipeline dependency injection Adds an optional LlmProvider override to run_generation and run_generation_inner, allowing tests to inject a mock provider without touching real credentials or the provider-resolution path. Makes run_generation_inner pub so integration tests can call it directly. Production callers pass None and behaviour is unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	2036c12b24	refactor: eliminate SettingsResponse struct, serialize UserSettings directly Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	347558a278	fix: atomic job creation, 15min timeout, and panic handling - Replace iterating DashMap check with atomic DashSet insert in create_job to eliminate the race condition where double-click could create two concurrent jobs for the same user - Add release_user method called at end of generation task (normal, timeout, and panic paths) so the generating slot is always freed - Wrap run_generation in tokio::time::timeout(900s) to prevent hung LLM calls from blocking the generation slot forever - Spawn a second task to await the JoinHandle and call release_user + send error event if the generation task panics, preventing SSE clients from hanging indefinitely - Update cleanup_expired to also remove users from generating_users set - Update tests to call release_user after completion/error to match new contract Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	e05c2ae75a	feat: handle brave_search in API key test endpoint Add a branch in test_key to route brave_search provider to crate::services::brave_search::test_api_key instead of the LLM factory. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	7cd867c650	fix: resolve all clippy warnings Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	8d232c1ade	feat: split model selection — scraping vs websearch with GPT-5 models Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	7f7d584314	feat: parallel source extraction, shuffle candidates, clear history endpoint - Remove 10-source cap; all sources are now processed - Increase max links per source from 10 to 15 - Extract article links in parallel (up to 5 concurrent) using JoinSet - Shuffle candidate URLs after history filtering to interleave sources - Add DELETE /api/v1/article-history endpoint to clear all history for a user Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	a2fe3f3310	feat: simplify LlmProvider trait to single call_llm method Replace the three-method LlmProvider trait (generate_search_pass, generate_rewrite_pass, supports_web_search) and ProviderCapabilities with a single call_llm method. Update all three provider implementations (Gemini, OpenAI, Anthropic) and all callers in synthesis.rs, source_scraper.rs, and api_keys.rs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	dafec2591b	feat: API endpoint for LLM call logs by job_id Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	55fe828e58	feat: API endpoints for article history listing and provenance Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	3fe667591d	fix: LLM providers use own HTTP client with 120s timeout (was sharing scraper's 15s) The scraper client (build_scraper_client) has a 15s timeout appropriate for web scraping, but LLM API calls — especially with web search — take 30-60s. LLM providers now build their own reqwest client with 120s timeout via build_llm_client(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	004f08f385	fix: runtime bugs found during first Docker run + integration tests Bugs fixed: - resolve_model queried non-existent admin_provider_models table (use JSONB query on admin_providers) - key_prefix VARCHAR(10) too short for 11-char prefix (migration to VARCHAR(12)) - API key test schema missing additionalProperties: false (OpenAI strict mode) - CSP missing font-src data: directive (PDF font embedding blocked) - Magic link URL not logged in test mode (can't verify without real email) - Rust 1.85 Docker image too old for dependencies (bumped to 1.88) Tests added to prevent recurrence: - schema_meets_openai_strict_mode_requirements: validates additionalProperties on all objects - key_prefix_full_length_stored_in_db: verifies 11-char prefix survives DB round-trip - generate_pipeline_resolves_model_from_admin_config: exercises full generation pipeline Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	c1f2f1456f	refactor: simplify recent changes — extract helper, named struct, atomic entry, pre-alloc - Extract auth::create_and_send_magic_link() to deduplicate token rollback logic - Replace (i32, i32, RateLimiter) tuple with named UserRateLimitEntry struct - Use DashMap entry API for atomic rate limiter lookup (fixes TOCTOU race) - Pre-allocate scraper body Vec from Content-Length when available Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	54d54f2a06	fix: architect assessment remediation — 6 issues across backend, frontend, and infra - Wire hardened scraper client into runtime (SSRF redirect validation was defined but unused) - Stream scraper body with per-chunk size limit instead of post-download check (DoS/OOM) - Persist user rate-limit overrides across generation jobs via AppState DashMap - Roll back magic-link token on email send failure to prevent quota exhaustion - Fix API error UX: prefer human message over machine error code in frontend - Unwrap GET /syntheses { items } wrapper in frontend API layer (contract mismatch) - Bind Postgres to localhost in docker-compose (was exposed on all interfaces) - Fix CLAUDE.md: runtime queries not compile-time, 10 migrations not 9 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	98528f51bd	Fix rate limiter bug, simplify v2 code Bug fix: - Per-generation rate limiter was creating a new instance on every check, making user rate limit overrides non-functional. Fixed by creating the limiter once at pipeline start and reusing for both passes. Simplifications: - Extract spawn_task closure in scrape_articles (deduplicate spawn blocks) - Use idiomatic if let Ok(...) instead of if let Some(..).ok() in scraper - Replace manual loop with iterator chain in export_keys handler - Simplify check_rate_limit to single boolean check - Simplify handleImport settings merge (spread already provides defaults) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	7eb24cfd9a	v2: API key export endpoint (POST, rate-limited) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	3 months ago
oabrivard	04819aa926	Simplify code: deduplicate patterns, fix captcha field name bug Bug fix: - Fix frontend sending captcha_token instead of turnstile_token in login/register requests (would cause 422 errors on auth) Backend simplifications: - Deduplicate VALID_PROVIDERS constant (provider.rs is now the single source) - Extract validate_display_name/validate_models helpers in provider model - Add From<UserSettings> for SettingsResponse, From<User> for AdminUserResponse - Consolidate Resend API call pattern into shared send_via_resend() - Extract do_bulk_import() for sources bulk/CSV import - Use idiomatic range.contains() for rate limit validation Frontend simplifications: - Consolidate file download logic (exportCsv reuses shared fetchFile/triggerDownload) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	1f9f7f39d7	Phase 7: Email sending via Resend + Markdown/PDF export Backend: - Synthesis email sending via Resend API with HTML template (inline CSS, tables-based for email client compatibility) + plain-text fallback - XSS prevention via html_escape() on all user content in email templates - Markdown export: clean format with headers, links, summaries - PDF export: printpdf with built-in Helvetica fonts, indigo color scheme, automatic page breaks, word wrapping - 3 new endpoints: send-email, export/markdown, export/pdf - All endpoints enforce ownership checks - Email validation using email_address crate - 24 new unit tests, 13 integration tests Frontend: - Email section on SynthesisDetail: input pre-filled with user email, send button with loading state, success/error feedback - Export buttons: Markdown + PDF with per-button loading states - File download via Blob + temporary anchor with Content-Disposition parsing - 6 new export tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	aa6f1ba76b	Phase 5: Generation pipeline with SSE progress, syntheses CRUD Backend: - Full 2-pass generation pipeline: LLM search -> URL scraping -> LLM rewrite - Async generation with tokio::spawn, JobStore with per-user concurrency limit - SSE progress streaming via axum::response::Sse + tokio::sync::watch - Syntheses CRUD: list (paginated), get (ownership check), delete - Prompt construction ported from original geminiService.ts - Parallel URL scraping with bounded concurrency (max 10) - Graceful partial failure handling (some URLs fail -> continue) - 36 new unit tests, 16 integration tests Frontend: - Home dashboard: synthesis card grid, week badges, delete with confirmation - Generate page: SSE-driven progress bar, step checklist, auto-redirect - Synthesis detail: section-by-section display, external links, delete - SSE client helper with auto-reconnect (exponential backoff) - Date utilities with French locale formatting Critical fixes applied: - SSE EventSource now sends credentials (withCredentials: true) - Gemini error logging sanitized to prevent API key leak in logs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	439e547367	Phase 4: LLM provider abstraction with Gemini, user API key encryption Backend: - LlmProvider async trait with generate_search_pass/generate_rewrite_pass - GeminiProvider: googleSearch grounding (pass 1), structured JSON output (pass 2) - AES-256-GCM encryption for user API keys at rest (per-key random nonces) - MasterKey with zeroize-on-drop (no Clone to prevent unzeroized copies) - User API key endpoints: list (prefix only), create/update, delete, test - Dynamic category schema builder for structured LLM output - Provider factory (Gemini implemented, OpenAI/Anthropic stubbed for Phase 6) - 37 new unit tests (encryption, schema, Gemini serialization, factory) - 17 integration tests (CRUD, encryption verification, ownership isolation) Frontend: - ApiKeyManager component: per-provider key management in Settings - Password input with show/hide toggle, key prefix display (monospace) - Test button validates key with minimal LLM call - Status badges (configured/not configured) - 11 new tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	5abbf9b9ad	Phase 3: Admin module with provider/model curation, rate limits, user management Backend: - Admin API: CRUD for providers, rate limits, user role management - Public config endpoint for enabled providers/models - AdminUser extractor enforces RBAC on all admin endpoints - Per-provider rate limiter with hot-reload from DB - Audit logging for all admin mutations - Seed data: Gemini, OpenAI, Anthropic providers with default models - Self-demotion prevention on role changes - 30 integration tests, 27 new unit tests Frontend: - Admin layout with sidebar navigation (providers, rate limits, users) - Provider management: enable/disable, model CRUD, default model selection - Rate limit configuration with effective rate display - User management with role badges and promote/demote - Admin link in navbar/mobile menu (visible only to admins) - Settings page: dynamic provider/model selection from admin config - 10 new tests (admin guard, config API) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	22ff026a4c	Fix Phase 2 critical issues: SSRF IPv6 gaps, body text filtering, CSV validation - Fix body text extraction to actually filter excluded elements (script, nav, footer, aside, etc.) using node ID tracking instead of unused HashSet - Add IPv6 reserved range checks to SSRF prevention: ULA (fc00::/7), documentation (2001:db8::/32), discard prefix (100::/64) - Add errors field to frontend BulkImportResponse type - Validate Content-Type on CSV multipart upload (reject non-text files) - Add 6 new unit tests for the fixes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	3 months ago
oabrivard	2b75dc7049	Finished phase 2	3 months ago
oabrivard	a36e3732bf	Fixed critical problems from phase 1	3 months ago
oabrivard	355dbf6a5a	Finished phase 1	3 months ago
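
Commit 6f3e6883c9 describes a cancellation flag (`Arc<AtomicBool>`) that the pipeline checks before each wave so that a stop request still saves the articles collected so far. The sketch below illustrates that pattern using only the standard library. The function name `collect_until_cancelled`, the `process` callback, and the use of plain strings for articles are assumptions made for illustration and are not taken from the repository.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for the generation pipeline: processes sources wave
/// by wave and returns whatever was collected before cancellation was seen.
pub fn collect_until_cancelled<F>(
    waves: &[Vec<&str>],
    cancel: &AtomicBool,
    mut process: F,
) -> Vec<String>
where
    F: FnMut(&str) -> String,
{
    let mut collected = Vec::new();
    for wave in waves {
        // Check the flag before starting a wave; a wave already in
        // progress is allowed to finish so its results are not lost.
        if cancel.load(Ordering::SeqCst) {
            break;
        }
        for &url in wave {
            collected.push(process(url));
        }
    }
    // The partial result is returned either way, so the caller can save it.
    collected
}
```

In a real service the flag would typically be shared through an `Arc<AtomicBool>` held by both the job entry and the spawned task, with the stop endpoint calling `store(true, ...)` on its clone.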

31 Commits (d3a4d2c577d9bb29f935c5527c12e838927baa94)
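
Commit 22ff026a4c lists three IPv6 reserved ranges added to the SSRF checks: ULA (fc00::/7), documentation (2001:db8::/32), and the discard prefix (100::/64). A minimal sketch of such a check follows, assuming a hypothetical helper named `is_reserved_ipv6`; the repository's actual function name and structure are not shown in the log. It covers only these three ranges, so a complete filter would also need the loopback, link-local, and IPv4 checks.

```rust
use std::net::Ipv6Addr;

/// Hypothetical helper: returns true when `addr` falls in one of the three
/// reserved ranges named in the commit message.
pub fn is_reserved_ipv6(addr: Ipv6Addr) -> bool {
    let seg = addr.segments();

    // fc00::/7 (unique local addresses): the top 7 bits are 1111 110.
    let is_ula = (seg[0] & 0xfe00) == 0xfc00;

    // 2001:db8::/32 (documentation): the first two 16-bit groups are fixed.
    let is_documentation = seg[0] == 0x2001 && seg[1] == 0x0db8;

    // 100::/64 (discard-only prefix): the first four groups are 0100:0:0:0.
    let is_discard = seg[0] == 0x0100 && seg[1] == 0 && seg[2] == 0 && seg[3] == 0;

    is_ula || is_documentation || is_discard
}
```

A caller would resolve the target host first and reject the request if any resolved address makes this function return true.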