# ComparAIson: Product Specification

## 1. Product Overview

**ComparAIson** is a self-hosted web application that enables users to compare two or more items using AI-powered deep research. The system performs multi-source research, generates structured comparison data, and presents results through interactive visualizations. Completed comparisons are saved as posts on user profiles, creating a browsable library of research.

## 2. Problem Statement

Comparing products, technologies, or services requires gathering data from multiple sources, synthesizing findings, and presenting them clearly. This is time-consuming and often produces inconsistent results. ComparAIson automates this process with LLM-powered research that produces structured, visual, and comparable outputs.

## 3. Target Users

- **Developers** comparing frameworks, tools, cloud services
- **Consumers** comparing products before purchase
- **Researchers** comparing methodologies, papers, or approaches
- **Teams** evaluating options for technical decisions

## 4. Core Features

### 4.1 AI Research Engine

- Multi-item comparison (2-10 items)
- Multi-dimensional scoring (5-8 dimensions per comparison)
- Web search integration via Tavily API
- LLM synthesis via OpenAI GPT-4o-mini or Perplexity Sonar
- Automatic provider fallback chain
- Structured JSON output with validation
- Server-Sent Events for real-time progress

### 4.2 Interactive Visualizations

- **Radar/Spider Chart**: Multi-dimensional overlay showing all items
- **Grouped Bar Chart**: Side-by-side metric comparison
- **Comparison Table**: Feature matrix with color-coded cells
- **Score Cards**: Animated progress bars with overall + per-dimension scores
- **Pros/Cons Cards**: Expandable per-item breakdown

### 4.3 User System

- Email + password authentication (Better Auth)
- Session management (7-day expiry)
- Protected routes for compare/profile actions
- Public profile pages with comparison history

### 4.4 Social/Feed Features

- Public comparisons feed (Explore page)
- Per-comparison view count tracking
- Tag-based categorization and filtering
- Search across public comparisons
- Shareable URLs for each comparison

## 5. Technical Constraints

| Constraint | Value |
|---|---|
| Deployment target | Raspberry Pi ARM64, 8GB RAM |
| Concurrent users | Low (homelab, <20) |
| Total RAM budget | ~500MB-1GB (app + DB + reverse proxy) |
| Cost target | Minimal (free tier APIs where possible) |
| Network | Behind Traefik reverse proxy with HTTPS |

## 6. Data Model

### 6.1 Users (Better Auth managed)

```
users: id, name, email, emailVerified, image, createdAt, updatedAt
sessions: id, userId (FK), token, expiresAt, createdAt, updatedAt
```

### 6.2 Comparisons

```
comparisons: id, userId (FK), title, query, slug,
             status (researching|completed|failed), summary,
             overallData (JSONB), tags[], isPublic, viewCount,
             createdAt, updatedAt
```

### 6.3 Comparison Items

```
comparison_items: id, comparisonId (FK), name, description, imageUrl,
                  researchData (JSONB), scores (JSONB), pros[], cons[], order
```

### 6.4 Comparison Dimensions

```
comparison_dimensions: id, comparisonId (FK), name, description, weight, order
```

### 6.5 JSONB Schemas

**overallData (on comparisons):**

```json
{
  "title": "React vs Vue vs Svelte",
  "query": "for modern web development",
  "status": "completed",
  "summary": "...",
  "items": [
    {
      "name": "React",
      "description": "...",
      "overallScore": 8.5,
      "dimensions": {
        "Performance": {
          "score": 8,
          "summary": "...",
          "details": "...",
          "pros": [],
          "cons": []
        }
      },
      "pros": ["..."],
      "cons": ["..."]
    }
  ],
  "dimensions": ["Performance", "Developer Experience", "Ecosystem", ...]
}
```

**researchData (on comparison_items):** Full `ItemResearch` object including dimensions, sources, and scores.

## 7. LLM Research Pipeline

### 7.1 Flow

```
User submits query
  → Parse request (validate items ≥ 2)
  → Detect available providers (Tavily? Perplexity? OpenAI?)
  → If Tavily available: search each item individually
  → Synthesize via best available provider:
      Priority 1: Tavily search + Perplexity synthesis
      Priority 2: Tavily search + OpenAI synthesis
      Priority 3: OpenAI only (no web search)
  → Validate structured JSON output
  → Persist to database
  → Stream results to client
```

### 7.2 Provider Details

| Provider | Role | Model | Cost |
|---|---|---|---|
| Tavily | Web search | Search API | ~$0.005/search |
| Perplexity | Synthesis | Sonar | ~$0.002/query |
| OpenAI | Synthesis | GPT-4o-mini | ~$0.15/1M tokens |

### 7.3 Progress Stages (SSE)

1. `parsing`: Validating query and extracting items
2. `searching`: Running web search for each item (Tavily only)
3. `researching`: Processing research per item
4. `synthesizing`: LLM generating structured comparison
5. `complete`: Final result with all data
6. `error`: Failure with error message

## 8. Security Considerations

- Auth middleware protects `/compare` and `/profile` routes
- Session tokens stored in HTTP-only cookies
- API keys never exposed to client (server-only LLM calls)
- Input validation on all API endpoints (min 2 items, max 10)
- SQL injection prevented via Drizzle ORM parameterized queries
- CSRF protection via Better Auth
- Rate limiting placeholder in compare API route

## 9. Future Considerations

- [ ] OAuth providers (Google, GitHub)
- [ ] Comparison comments/likes
- [ ] Export to PDF/image
- [ ] Embeddable comparison widgets
- [ ] Comparison templates
- [ ] Batch comparison queue for heavy loads
- [ ] Local Ollama fallback for offline operation
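
As an illustration of the provider priority in section 7.1 and the item limits in sections 4.1 and 8, the selection logic could look roughly like the following. This is a minimal TypeScript sketch, not the project's code: `AvailableKeys`, `ResearchPlan`, `validateItems`, and `selectPlan` are hypothetical names, and treating any key combination outside the three listed priorities as an error is an assumption.

```typescript
// Hypothetical types and functions; the spec does not define these names.
type SynthesisProvider = "perplexity" | "openai";

interface AvailableKeys {
  tavily?: string;
  perplexity?: string;
  openai?: string;
}

interface ResearchPlan {
  useWebSearch: boolean; // true when Tavily search runs before synthesis
  synthesizer: SynthesisProvider;
}

// Enforce the 2-10 item limit after trimming and dropping empty entries.
function validateItems(items: string[]): string[] {
  const cleaned = items.map((s) => s.trim()).filter((s) => s.length > 0);
  if (cleaned.length < 2 || cleaned.length > 10) {
    throw new Error("A comparison needs between 2 and 10 items");
  }
  return cleaned;
}

// Walk the priority list from section 7.1 and return the first match.
function selectPlan(keys: AvailableKeys): ResearchPlan {
  const hasTavily = Boolean(keys.tavily);
  if (hasTavily && keys.perplexity) {
    return { useWebSearch: true, synthesizer: "perplexity" }; // Priority 1
  }
  if (hasTavily && keys.openai) {
    return { useWebSearch: true, synthesizer: "openai" }; // Priority 2
  }
  if (keys.openai) {
    return { useWebSearch: false, synthesizer: "openai" }; // Priority 3
  }
  // Assumption: combinations not listed in the spec are rejected.
  throw new Error("No usable provider combination is configured");
}
```

Keeping the selection as a pure function of the configured keys makes the fallback order easy to unit test without calling any external API.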