| title | created | updated | type | tags | sources |
| --- | --- | --- | --- | --- | --- |
| AI Applications Pipeline | 2026-04-28 | 2026-04-28 | concept | | ../../homelab/architecture.md |

AI Applications Pipeline
Local AI/ML stack running on the ubuntu host with GPU acceleration (GTX 1080, 8 GB), plus AI-powered applications that use LLM inference.
Core AI Infrastructure

| Service | URL | Purpose |
| --- | --- | --- |
| Ollama | localhost:11434 | Local LLM inference (GPU via GTX 1080) |
| Qdrant | ubuntu:6333 | Vector database for OpenCode cluster memory |
| Faster Whisper Server | - | Speech-to-text (Whisper) |

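
As a rough illustration of how a client might talk to the two services that have addresses listed above, the sketch below uses only the Python standard library. The `http://` scheme, the model name `llama3`, and all helper names are assumptions made for this example; `/api/generate` (Ollama) and `/collections` (Qdrant) are the services' documented REST routes.

```python
import json
import urllib.request

# Addresses from the service table; the http:// scheme is an assumption.
OLLAMA_URL = "http://localhost:11434"
QDRANT_URL = "http://ubuntu:6333"


def build_generate_request(prompt, model="llama3"):
    """Build (url, body) for a non-streaming call to Ollama's /api/generate.

    The default model name is hypothetical; use whichever model is pulled locally.
    """
    body = {"model": model, "prompt": prompt, "stream": False}
    return f"{OLLAMA_URL}/api/generate", json.dumps(body).encode("utf-8")


def generate(prompt, model="llama3", timeout=120):
    """Send the prompt to Ollama and return the generated text."""
    url, data = build_generate_request(prompt, model)
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


def list_qdrant_collections(timeout=10):
    """Return the names of all collections known to the Qdrant instance."""
    with urllib.request.urlopen(f"{QDRANT_URL}/collections", timeout=timeout) as resp:
        result = json.loads(resp.read())["result"]
    return [c["name"] for c in result["collections"]]
```

Calling `generate("Summarise today's alerts")` only works on the machine where Ollama listens on localhost, and `list_qdrant_collections()` requires that the hostname `ubuntu` resolves from the caller.
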
AI Applications (7 containers)

| Application | Description |
| --- | --- |
| AI Job Pipeline (backend + frontend) | AI task orchestration |
| AI Alert Aggregator (backend + frontend + postgres) | Alert intelligence |
| AI Media Intelligence (backend) | Media analysis |
| AI Subscriptions | Subscription management |
| Homelab Inventory (backend) | Infrastructure inventory |

Immich ML

| Component | Description |
| --- | --- |
| Immich Server | Photo/video management |
| Immich ML | Machine learning on GPU |
| Immich Postgres | Dedicated PostgreSQL (pgvecto-rs extension) |
| Immich Redis | Caching |

OpenCode Embeddings
OpenCode instances across the cluster use:
- Ollama: generates embeddings for vector memory
- Qdrant: stores the shared vector memory used across the OpenCode cluster
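
The embed-then-store flow described above could look roughly like the following sketch. It is not the OpenCode implementation: the collection name `opencode_memory`, the embedding model `nomic-embed-text`, the payload fields, and every helper name are hypothetical, and the collection is assumed to already exist with a vector size matching the model. The routes used are Ollama's `/api/embed` and Qdrant's point upsert and search endpoints.

```python
import json
import urllib.request
import uuid

OLLAMA_URL = "http://localhost:11434"   # from the service table; scheme assumed
QDRANT_URL = "http://ubuntu:6333"       # from the service table; scheme assumed
COLLECTION = "opencode_memory"          # hypothetical collection name
EMBED_MODEL = "nomic-embed-text"        # hypothetical embedding model


def _send_json(url, body, method="POST", timeout=60):
    """Send a JSON body and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())


def embed(text):
    """Ask Ollama for the embedding vector of a single string."""
    out = _send_json(f"{OLLAMA_URL}/api/embed", {"model": EMBED_MODEL, "input": text})
    return out["embeddings"][0]


def build_point(text, vector, source):
    """Shape one memory entry as a Qdrant point (pure function, no I/O)."""
    # A uuid5 ID is deterministic, so storing the same text from the same
    # source again overwrites the existing point instead of duplicating it.
    point_id = str(uuid.uuid5(uuid.NAMESPACE_URL, f"{source}:{text}"))
    return {"id": point_id, "vector": vector, "payload": {"text": text, "source": source}}


def remember(text, source):
    """Embed the text and upsert it into the shared collection."""
    point = build_point(text, embed(text), source)
    url = f"{QDRANT_URL}/collections/{COLLECTION}/points?wait=true"
    return _send_json(url, {"points": [point]}, method="PUT")


def recall(query, limit=5):
    """Return (score, text) pairs for the stored entries closest to the query."""
    body = {"vector": embed(query), "limit": limit, "with_payload": True}
    url = f"{QDRANT_URL}/collections/{COLLECTION}/points/search"
    hits = _send_json(url, body)["result"]
    return [(h["score"], h["payload"]["text"]) for h in hits]
```

Because every instance writes to the same collection, an entry stored with `remember(...)` on one node becomes visible to `recall(...)` on any other node that can reach Qdrant.
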
Related