ComparAIson — Product Specification

1. Product Overview

ComparAIson is a self-hosted web application that enables users to compare two or more items using AI-powered deep research. The system performs multi-source research, generates structured comparison data, and presents results through interactive visualizations. Completed comparisons are saved as posts on user profiles, creating a browsable library of research.

2. Problem Statement

Comparing products, technologies, or services requires gathering data from multiple sources, synthesizing findings, and presenting them clearly. This is time-consuming and often produces inconsistent results. ComparAIson automates this process with LLM-powered research that produces structured, visual, and comparable outputs.

3. Target Users

  • Developers comparing frameworks, tools, or cloud services
  • Consumers comparing products before purchase
  • Researchers comparing methodologies, papers, or approaches
  • Teams evaluating options for technical decisions

4. Core Features

4.1 AI Research Engine

  • Multi-item comparison (2-10 items)
  • Multi-dimensional scoring (5-8 dimensions per comparison)
  • Web search integration via Tavily API
  • LLM synthesis via OpenAI GPT-4o-mini or Perplexity Sonar
  • Automatic provider fallback chain
  • Structured JSON output with validation
  • Server-Sent Events for real-time progress

4.2 Interactive Visualizations

  • Radar/Spider Chart — Multi-dimensional overlay showing all items
  • Grouped Bar Chart — Side-by-side metric comparison
  • Comparison Table — Feature matrix with color-coded cells
  • Score Cards — Animated progress bars with overall + per-dimension scores
  • Pros/Cons Cards — Expandable per-item breakdown

4.3 User System

  • Email + password authentication (Better Auth)
  • Session management (7-day expiry)
  • Protected routes for compare/profile actions
  • Public profile pages with comparison history

4.4 Social/Feed Features

  • Public comparisons feed (Explore page)
  • Per-comparison view count tracking
  • Tag-based categorization and filtering
  • Search across public comparisons
  • Shareable URLs for each comparison

5. Technical Constraints

| Constraint | Value |
|---|---|
| Deployment target | Raspberry Pi ARM64, 8GB RAM |
| Concurrent users | Low (homelab, <20) |
| Total RAM budget | ~500MB-1GB (app + DB + reverse proxy) |
| Cost target | Minimal (free tier APIs where possible) |
| Network | Behind Traefik reverse proxy with HTTPS |

6. Data Model

6.1 Users (Better Auth managed)

users: id, name, email, emailVerified, image, createdAt, updatedAt
sessions: id, userId (FK), token, expiresAt, createdAt, updatedAt

6.2 Comparisons

comparisons: id, userId (FK), title, query, slug, status (researching|completed|failed),
             summary, overallData (JSONB), tags[], isPublic, viewCount, createdAt, updatedAt
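
The slug column backs the shareable URL of each comparison. The spec does not say how slugs are built, so the following is only a minimal sketch, assuming a lowercase, hyphen-separated format with an optional suffix for uniqueness. The `slugify` helper and its fallback value are hypothetical names, not taken from the codebase.

```typescript
// Hypothetical helper: turns a comparison title into a URL-safe slug.
function slugify(title: string, suffix?: string): string {
  const base = title
    .normalize("NFKD") // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse every other run of characters into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens

  // Assumption: fall back to a fixed word when nothing usable remains.
  const safeBase = base.length > 0 ? base : "comparison";
  return suffix ? `${safeBase}-${suffix}` : safeBase;
}
```

For example, `slugify("React vs Vue vs Svelte")` yields `react-vs-vue-vs-svelte`. A short random or id-derived suffix is one way to keep slugs unique when two users pick the same title.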

6.3 Comparison Items

comparison_items: id, comparisonId (FK), name, description, imageUrl,
                  researchData (JSONB), scores (JSONB), pros[], cons[], order

6.4 Comparison Dimensions

comparison_dimensions: id, comparisonId (FK), name, description, weight, order
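
Each dimension carries a weight and each item carries per-dimension scores, but the spec does not define how the two combine into an overall score. The sketch below assumes a plain weighted mean, rounded to one decimal place; the type and function names are illustrative only.

```typescript
interface ComparisonDimension {
  name: string;
  weight: number; // relative importance; assumed to be non-negative
}

// Assumption: overall = sum(score * weight) / sum(weight), taken over the
// dimensions that actually have a score for this item.
function weightedOverallScore(
  dimensions: ComparisonDimension[],
  scores: Record<string, number>,
): number | null {
  let weightedSum = 0;
  let totalWeight = 0;

  for (const dim of dimensions) {
    const score = scores[dim.name];
    if (score === undefined || dim.weight <= 0) continue; // skip unscored or unweighted dimensions
    weightedSum += score * dim.weight;
    totalWeight += dim.weight;
  }

  if (totalWeight === 0) return null; // nothing to average
  // Round to one decimal place, in line with values such as 8.5 in overallData.
  return Math.round((weightedSum / totalWeight) * 10) / 10;
}
```

With weights of 2 for Performance and 1 for Ecosystem, scores of 8 and 9.5 give (16 + 9.5) / 3 = 8.5.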

6.5 JSONB Schemas

overallData (on comparisons):

{
  "title": "React vs Vue vs Svelte",
  "query": "for modern web development",
  "status": "completed",
  "summary": "...",
  "items": [
    {
      "name": "React",
      "description": "...",
      "overallScore": 8.5,
      "dimensions": {
        "Performance": { "score": 8, "summary": "...", "details": "...", "pros": [], "cons": [] }
      },
      "pros": ["..."],
      "cons": ["..."]
    }
  ],
  "dimensions": ["Performance", "Developer Experience", "Ecosystem", ...]
}

researchData (on comparison_items): Full ItemResearch object including dimensions, sources, and scores.
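
The overallData example above can be mirrored as TypeScript types with a runtime check applied to the LLM output before it is persisted. This is a dependency-free sketch rather than the project's actual validator, which may well use a schema library. The 0-10 score range is an assumption inferred from the example values, and the 2-10 item bound is taken from the feature list.

```typescript
interface DimensionResult {
  score: number;
  summary: string;
  details: string;
  pros: string[];
  cons: string[];
}

interface ItemResult {
  name: string;
  description: string;
  overallScore: number;
  dimensions: Record<string, DimensionResult>;
  pros: string[];
  cons: string[];
}

interface OverallData {
  title: string;
  query: string;
  status: "researching" | "completed" | "failed";
  summary: string;
  items: ItemResult[];
  dimensions: string[];
}

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((x) => typeof x === "string");

// Assumption: scores use a 0-10 scale (the example shows 8 and 8.5).
const isScore = (v: unknown): v is number =>
  typeof v === "number" && Number.isFinite(v) && v >= 0 && v <= 10;

function isDimensionResult(v: unknown): v is DimensionResult {
  return (
    isRecord(v) &&
    isScore(v.score) &&
    typeof v.summary === "string" &&
    typeof v.details === "string" &&
    isStringArray(v.pros) &&
    isStringArray(v.cons)
  );
}

function isItemResult(v: unknown): v is ItemResult {
  return (
    isRecord(v) &&
    typeof v.name === "string" &&
    typeof v.description === "string" &&
    isScore(v.overallScore) &&
    isRecord(v.dimensions) &&
    Object.values(v.dimensions).every((d) => isDimensionResult(d)) &&
    isStringArray(v.pros) &&
    isStringArray(v.cons)
  );
}

// Returns the parsed object, or null when the text is not valid JSON
// or does not match the expected shape.
function parseOverallData(raw: string): OverallData | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;

  const statusOk =
    data.status === "researching" || data.status === "completed" || data.status === "failed";
  const itemsOk =
    Array.isArray(data.items) &&
    data.items.length >= 2 &&
    data.items.length <= 10 &&
    data.items.every((item) => isItemResult(item));
  const valid =
    typeof data.title === "string" &&
    typeof data.query === "string" &&
    statusOk &&
    typeof data.summary === "string" &&
    itemsOk &&
    isStringArray(data.dimensions);

  return valid ? (data as unknown as OverallData) : null;
}
```

Returning null instead of throwing lets the caller decide whether to retry with the next provider or mark the comparison as failed.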

7. LLM Research Pipeline

7.1 Flow

User submits query
  → Parse request (validate 2-10 items)
  → Detect available providers (Tavily? Perplexity? OpenAI?)
  → If Tavily available: search each item individually
  → Synthesize via best available provider:
      Priority 1: Tavily search + Perplexity synthesis
      Priority 2: Tavily search + OpenAI synthesis
      Priority 3: OpenAI only (no web search)
  → Validate structured JSON output
  → Persist to database
  → Stream results to client
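
The provider priority in this flow can be sketched as a small fallback chain. The code below is an illustration under stated assumptions, not the actual implementation: `ProviderKeys`, `ResearchStrategy`, `buildFallbackChain`, and `runWithFallback` are hypothetical names, and provider availability is assumed to mean that an API key is configured. Only the three combinations listed above are produced.

```typescript
interface ProviderKeys {
  tavily?: string;
  perplexity?: string;
  openai?: string;
}

type ResearchStrategy =
  | { search: "tavily"; synthesis: "perplexity" }
  | { search: "tavily"; synthesis: "openai" }
  | { search: null; synthesis: "openai" };

// Builds the ordered list of strategies that the configured keys allow.
function buildFallbackChain(keys: ProviderKeys): ResearchStrategy[] {
  const chain: ResearchStrategy[] = [];
  if (keys.tavily && keys.perplexity) chain.push({ search: "tavily", synthesis: "perplexity" }); // Priority 1
  if (keys.tavily && keys.openai) chain.push({ search: "tavily", synthesis: "openai" }); // Priority 2
  if (keys.openai) chain.push({ search: null, synthesis: "openai" }); // Priority 3
  return chain;
}

// Tries each strategy in order and returns the first successful result.
async function runWithFallback<T>(
  chain: ResearchStrategy[],
  attempt: (strategy: ResearchStrategy) => Promise<T>,
): Promise<T> {
  const errors: string[] = [];
  for (const strategy of chain) {
    try {
      return await attempt(strategy);
    } catch (err) {
      errors.push(err instanceof Error ? err.message : String(err));
    }
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

An empty chain (for example, only a Tavily key is present) makes `runWithFallback` throw immediately, which maps naturally onto the error stage.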

7.2 Provider Details

| Provider | Role | Model | Cost |
|---|---|---|---|
| Tavily | Web search | Search API | ~$0.005/search |
| Perplexity | Synthesis | Sonar | ~$0.002/query |
| OpenAI | Synthesis | GPT-4o-mini | ~$0.15/1M tokens |
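
As a rough worked example, the approximate prices above can be combined into a per-comparison estimate. The sketch assumes one Tavily search per item (as in the flow in 7.1) and a single synthesis call; the token count for OpenAI is a caller-supplied guess, and `estimateCostUsd` is a hypothetical helper.

```typescript
// Approximate unit prices copied from the provider table (USD).
const TAVILY_PER_SEARCH = 0.005;
const PERPLEXITY_PER_QUERY = 0.002;
const OPENAI_PER_MILLION_TOKENS = 0.15;

type SynthesisChoice =
  | { provider: "perplexity" }
  | { provider: "openai"; tokens: number }; // tokens is an estimate, not a measured value

function estimateCostUsd(
  itemCount: number,
  useTavily: boolean,
  synthesis: SynthesisChoice,
): number {
  // Assumption: exactly one web search per compared item.
  const searchCost = useTavily ? itemCount * TAVILY_PER_SEARCH : 0;
  const synthesisCost =
    synthesis.provider === "perplexity"
      ? PERPLEXITY_PER_QUERY
      : (synthesis.tokens / 1e6) * OPENAI_PER_MILLION_TOKENS;
  return searchCost + synthesisCost;
}
```

Three items with Tavily and Perplexity come to about 3 × 0.005 + 0.002 = $0.017, while an OpenAI-only run of 10,000 tokens comes to about $0.0015.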

7.3 Progress Stages (SSE)

  1. parsing — Validating query and extracting items
  2. searching — Running web search for each item (Tavily only)
  3. researching — Processing research per item
  4. synthesizing — LLM generating structured comparison
  5. complete — Final result with all data
  6. error — Failure with error message
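
The stages above map directly onto named Server-Sent Events. The following sketch shows one way to serialise a progress update into the SSE wire format (an `event:` line, a `data:` line, and a blank line). The `ResearchProgress` shape and both helper names are assumptions; the spec only fixes the stage names.

```typescript
type Stage =
  | "parsing"
  | "searching"
  | "researching"
  | "synthesizing"
  | "complete"
  | "error";

// Hypothetical payload shape for one progress update.
interface ResearchProgress {
  stage: Stage;
  message: string;
  data?: unknown; // e.g. the final comparison on "complete"
}

// Serialises one update into the SSE wire format.
// JSON.stringify escapes newlines, so the payload always fits on one data line.
function formatSseEvent(progress: ResearchProgress): string {
  return `event: ${progress.stage}\ndata: ${JSON.stringify(progress)}\n\n`;
}

// Assumption: "complete" and "error" both end the stream.
function isTerminalStage(stage: Stage): boolean {
  return stage === "complete" || stage === "error";
}
```

The response carrying these events needs the `Content-Type: text/event-stream` header; on the client, an `EventSource` can subscribe to each stage name with `addEventListener`.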

8. Security Considerations

  • Auth middleware protects /compare and /profile routes
  • Session tokens stored in HTTP-only cookies
  • API keys never exposed to client (server-only LLM calls)
  • Input validation on all API endpoints (min 2 items, max 10)
  • SQL injection prevented via Drizzle ORM parameterized queries
  • CSRF protection via Better Auth
  • Rate limiting is not yet implemented; the compare API route contains only a placeholder
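
The 2-10 item rule from the list above could be enforced at the API boundary roughly as follows. This is a sketch under assumptions: the request body shape (`items` plus an optional `query`), the trimming, and the case-insensitive de-duplication are illustrative choices that the spec does not prescribe.

```typescript
interface CompareRequest {
  items: string[];
  query?: string;
}

type ValidationResult =
  | { ok: true; value: CompareRequest }
  | { ok: false; error: string };

const MIN_ITEMS = 2;
const MAX_ITEMS = 10;

function validateCompareRequest(body: unknown): ValidationResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Body must be a JSON object" };
  }
  const { items, query } = body as { items?: unknown; query?: unknown };

  if (!Array.isArray(items) || !items.every((i) => typeof i === "string")) {
    return { ok: false, error: "items must be an array of strings" };
  }
  if (query !== undefined && typeof query !== "string") {
    return { ok: false, error: "query must be a string" };
  }

  // Assumption: blank names are ignored and duplicates are merged case-insensitively.
  const seen = new Set<string>();
  const cleaned: string[] = [];
  for (const raw of items as string[]) {
    const name = raw.trim();
    const key = name.toLowerCase();
    if (name.length === 0 || seen.has(key)) continue;
    seen.add(key);
    cleaned.push(name);
  }

  if (cleaned.length < MIN_ITEMS || cleaned.length > MAX_ITEMS) {
    return {
      ok: false,
      error: `Expected ${MIN_ITEMS}-${MAX_ITEMS} distinct items, got ${cleaned.length}`,
    };
  }

  const value: CompareRequest = { items: cleaned };
  if (typeof query === "string" && query.trim().length > 0) value.query = query.trim();
  return { ok: true, value };
}
```

Counting items after de-duplication means a request such as `["React", " react "]` is rejected, since it names only one distinct item.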

9. Future Considerations

  • OAuth providers (Google, GitHub)
  • Comparison comments/likes
  • Export to PDF/image
  • Embeddable comparison widgets
  • Comparison templates
  • Batch comparison queue for heavy loads
  • Local Ollama fallback for offline operation