Files
comparaison/docs/specs.md

164 lines
5.6 KiB
Markdown

# ComparAIson — Product Specification
## 1. Product Overview
**ComparAIson** is a self-hosted web application that enables users to compare two or more items using AI-powered deep research. The system performs multi-source research, generates structured comparison data, and presents results through interactive visualizations. Completed comparisons are saved as posts on user profiles, creating a browsable library of research.
## 2. Problem Statement
Comparing products, technologies, or services requires gathering data from multiple sources, synthesizing findings, and presenting them clearly. This is time-consuming and often produces inconsistent results. ComparAIson automates this process with LLM-powered research that produces structured, visual, and comparable outputs.
## 3. Target Users
- **Developers** comparing frameworks, tools, cloud services
- **Consumers** comparing products before purchase
- **Researchers** comparing methodologies, papers, or approaches
- **Teams** evaluating options for technical decisions
## 4. Core Features
### 4.1 AI Research Engine
- Multi-item comparison (2-10 items)
- Multi-dimensional scoring (5-8 dimensions per comparison)
- Web search integration via Tavily API
- LLM synthesis via OpenAI GPT-4o-mini or Perplexity Sonar
- Automatic provider fallback chain
- Structured JSON output with validation
- Server-Sent Events for real-time progress
### 4.2 Interactive Visualizations
- **Radar/Spider Chart** — Multi-dimensional overlay showing all items
- **Grouped Bar Chart** — Side-by-side metric comparison
- **Comparison Table** — Feature matrix with color-coded cells
- **Score Cards** — Animated progress bars with overall + per-dimension scores
- **Pros/Cons Cards** — Expandable per-item breakdown
### 4.3 User System
- Email + password authentication (Better Auth)
- Session management (7-day expiry)
- Protected routes for compare/profile actions
- Public profile pages with comparison history
### 4.4 Social/Feed Features
- Public comparisons feed (Explore page)
- Per-comparison view count tracking
- Tag-based categorization and filtering
- Search across public comparisons
- Shareable URLs for each comparison
## 5. Technical Constraints
| Constraint | Value |
|---|---|
| Deployment target | Raspberry Pi ARM64, 8GB RAM |
| Concurrent users | Low (homelab, <20) |
| Total RAM budget | ~500MB-1GB (app + DB + reverse proxy) |
| Cost target | Minimal (free tier APIs where possible) |
| Network | Behind Traefik reverse proxy with HTTPS |
## 6. Data Model
### 6.1 Users (Better Auth managed)
```
users: id, name, email, emailVerified, image, createdAt, updatedAt
sessions: id, userId (FK), token, expiresAt, createdAt, updatedAt
```
### 6.2 Comparisons
```
comparisons: id, userId (FK), title, query, slug, status (researching|completed|failed),
summary, overallData (JSONB), tags[], isPublic, viewCount, createdAt, updatedAt
```
### 6.3 Comparison Items
```
comparison_items: id, comparisonId (FK), name, description, imageUrl,
researchData (JSONB), scores (JSONB), pros[], cons[], order
```
### 6.4 Comparison Dimensions
```
comparison_dimensions: id, comparisonId (FK), name, description, weight, order
```
### 6.5 JSONB Schemas
**overallData (on comparisons):**
```json
{
"title": "React vs Vue vs Svelte",
"query": "for modern web development",
"status": "completed",
"summary": "...",
"items": [
{
"name": "React",
"description": "...",
"overallScore": 8.5,
"dimensions": {
"Performance": { "score": 8, "summary": "...", "details": "...", "pros": [], "cons": [] }
},
"pros": ["..."],
"cons": ["..."]
}
],
"dimensions": ["Performance", "Developer Experience", "Ecosystem", ...]
}
```
**researchData (on comparison_items):**
Full `ItemResearch` object including dimensions, sources, and scores.
## 7. LLM Research Pipeline
### 7.1 Flow
```
User submits query
→ Parse request (validate items ≥ 2)
→ Detect available providers (Tavily? Perplexity? OpenAI?)
→ If Tavily available: search each item individually
→ Synthesize via best available provider:
Priority 1: Tavily search + Perplexity synthesis
Priority 2: Tavily search + OpenAI synthesis
Priority 3: OpenAI only (no web search)
→ Validate structured JSON output
→ Persist to database
→ Stream results to client
```
### 7.2 Provider Details
| Provider | Role | Model | Cost |
|---|---|---|---|
| Tavily | Web search | Search API | ~$0.005/search |
| Perplexity | Synthesis | Sonar | ~$0.002/query |
| OpenAI | Synthesis | GPT-4o-mini | ~$0.15/1M tokens |
### 7.3 Progress Stages (SSE)
1. `parsing` — Validating query and extracting items
2. `searching` — Running web search for each item (Tavily only)
3. `researching` — Processing research per item
4. `synthesizing` — LLM generating structured comparison
5. `complete` — Final result with all data
6. `error` — Failure with error message
## 8. Security Considerations
- Auth middleware protects `/compare` and `/profile` routes
- Session tokens stored in HTTP-only cookies
- API keys never exposed to client (server-only LLM calls)
- Input validation on all API endpoints (min 2 items, max 10)
- SQL injection prevented via Drizzle ORM parameterized queries
- CSRF protection via Better Auth
- Rate limiting placeholder in compare API route
## 9. Future Considerations
- [ ] OAuth providers (Google, GitHub)
- [ ] Comparison comments/likes
- [ ] Export to PDF/image
- [ ] Embeddable comparison widgets
- [ ] Comparison templates
- [ ] Batch comparison queue for heavy loads
- [ ] Local Ollama fallback for offline operation