Initial commit: homelab infrastructure wiki
- Full Obsidian vault content
- Host configs (ice, grizzley, ubuntu, proxmox, truenas, panda, hyte)
- Media stack documentation
- Traefik HA setup
- Automation scripts
- Bachelor party planning
108
homelab/concepts/traefik-ha.md
Normal file
@@ -0,0 +1,108 @@
---
title: Traefik High Availability
created: 2026-04-28
updated: 2026-05-14
type: concept
tags: [concept, networking, services]
sources: [../../homelab/architecture.md, ../../platform-config/overview.md]
---

# Traefik High Availability

Two Traefik v3.6.7 instances provide ingress: one on ubuntu (primary router) and one on grizzley (edge ACME). Certificates are synced via NFS.

## Architecture

```
Internet → Cloudflare DNS → *.tophermayor.com
                          ↓
    ┌─────────────────────┴─────────────────────┐
    ↓                                           ↓
grizzley Traefik                         ubuntu Traefik
  (edge ACME)                           (primary router)
 192.168.50.84                            192.168.50.61
    │                                           │
    │             TLS certs on NFS              │
    └──→ /mnt/truenas/traefik-certs/grizzley ←──┘
```

## Roles

| Instance | Host | Primary Role |
|----------|------|--------------|
| Traefik Pi | grizzley (192.168.50.84) | Edge ACME: generates wildcard certs via Cloudflare DNS challenge |
| Traefik (ubuntu) | ubuntu (192.168.50.61) | Primary router: handles ~90% of traffic, syncs certs from grizzley |

## Certificate Flow

1. The grizzley Traefik instance completes the Cloudflare DNS challenge and writes the certs to the NFS mount `/mnt/truenas/traefik-certs/grizzley`.
2. The ubuntu Traefik instance reads the same certs from that NFS share.
3. Both instances serve the same wildcard `*.tophermayor.com` cert.
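Step 3 can be checked from any machine on the LAN. The following is a minimal sketch, not part of the repo: it connects to each instance by IP, sends a hostname as SNI, and compares SHA-256 fingerprints of the leaf certificate. The function names are hypothetical, and `gitea.tophermayor.com` is used only as an example hostname that the wildcard should cover.

```python
import hashlib
import socket
import ssl


def cert_fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER-encoded certificate as hex."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_der_cert(ip: str, server_name: str, port: int = 443,
                   timeout: float = 5.0) -> bytes:
    """Connect to one Traefik instance by IP and return the leaf cert it serves.

    server_name is sent as SNI; the default context also verifies the chain
    and that the certificate is valid for server_name.
    """
    context = ssl.create_default_context()
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=server_name) as tls:
            return tls.getpeercert(binary_form=True)


def all_match(fingerprints: dict[str, str]) -> bool:
    """True when at least one instance reported and all fingerprints agree."""
    return len(fingerprints) > 0 and len(set(fingerprints.values())) == 1


# Example (needs network access to both hosts, so it is left as a comment):
# instances = {"grizzley": "192.168.50.84", "ubuntu": "192.168.50.61"}
# fps = {name: cert_fingerprint(fetch_der_cert(ip, "gitea.tophermayor.com"))
#        for name, ip in instances.items()}
# print(fps, all_match(fps))
```

A mismatch would suggest that one instance is serving a stale or default certificate instead of the one on the NFS share.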

## Dynamic Config Files

Located in `homelab/ubuntu/traefik/config/dynamic/`:

| File | Services |
|------|----------|
| `canonical-hosts.yml` | Grizzley ingress proxy, PVE OpenCode |
| `gitea.yml` | gitea.tophermayor.com |
| `immich.yml` | immich.tophermayor.com |
| `jellyfin.yml` | jellyfin.tophermayor.com |
| `media-stack.yml` | Sonarr, Radarr, SABnzbd, Prowlarr, qBittorrent |
| `middlewares.yml` | 30+ middleware definitions |
| `opencode.yml` | opencode.tophermayor.com |
| `proxmox.yml` | proxmox.local.tophermayor.com |
| `homepage-widgets.yml` | Homepage service definitions |
| `audiobookshelf.yml` | Audiobookshelf (CT 108) |
| `jellyseerr.yml` | Jellyseerr (CT 106) |
| `kavita.yml` | Kavita (CT 108) |
| `navidrome.yml` | Navidrome (CT 107) |
| `stremio.yml` | Stremio Server |

## Common Middlewares

| Middleware | Purpose |
|------------|---------|
| `local-only@file` | Restrict to local network IPs |
| `authentik-auth@file` | SSO authentication |
| `security-headers@file` | Add security headers |
| `crowdsec-bouncer@file` | Rate limiting and threat protection |

## Entry Points

- `web`: port 80, HTTP → HTTPS redirect
- `websecure`: port 443, TLS termination
- `metrics`: port 8080, Prometheus metrics
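The `web` redirect is easy to smoke-test after a config change. This is a rough sketch with hypothetical helper names, assuming plain HTTP on port 80 is reachable from where the script runs: `probe_web` sends one request with an explicit `Host` header, and `is_https_redirect` decides whether the response is a redirect to the HTTPS URL for the same host.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 307, 308)


def is_https_redirect(status: int, location: Optional[str], host: str) -> bool:
    """True if the response redirects to an https:// URL on the same host."""
    if status not in REDIRECT_CODES or not location:
        return False
    parts = urlsplit(location)
    return parts.scheme == "https" and parts.hostname == host.lower()


def probe_web(ip: str, host: str, timeout: float = 5.0) -> tuple[int, Optional[str]]:
    """Send GET / to the `web` entry point and return (status, Location header)."""
    conn = http.client.HTTPConnection(ip, 80, timeout=timeout)
    try:
        conn.request("GET", "/", headers={"Host": host})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Example (needs network access, so it is left as a comment):
# status, location = probe_web("192.168.50.61", "gitea.tophermayor.com")
# print(is_https_redirect(status, location, "gitea.tophermayor.com"))
```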

## Outage Postmortem: 2026-05-14

**Severity:** Complete file provider failure. All `@file` routers and dependent `@docker` routers were offline.

**Root Cause:** A media migration wrote 7 dynamic config YAML files with mangled backtick quoting, which caused Traefik's file provider to fail parsing entirely.

**Affected Files:**
- `homepage-widgets.yml`
- `audiobookshelf.yml`
- `jellyseerr.yml`
- `kavita.yml`
- `navidrome.yml`
- `stremio.yml`
- `media-stack.yml`

**Impact:**
- ALL `@file` routers down (no traffic routed to file-defined services)
- ALL `@docker` routers depending on the `local-only@file` middleware also failed
- Homepage, media services, and any service using file-defined middlewares were unreachable

**Fix:** Rewrote all 7 YAML files with correct quoting. Renamed the service names in `homepage-widgets.yml` that were colliding with other provider definitions.
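Name collisions of this kind can be flagged before deploying. The sketch below is one plausible check, not the procedure used during the fix: it takes already-parsed dynamic configs (file name mapped to the parsed YAML as a dict) and reports any router, service, or middleware name defined in more than one file. It assumes the usual `http.routers` / `http.services` / `http.middlewares` layout and only compares files with each other; collisions involving other providers are out of scope. The function name and sample data are hypothetical.

```python
from collections import defaultdict

SECTIONS = ("routers", "services", "middlewares")


def find_duplicate_names(configs: dict[str, dict]) -> dict[str, list[str]]:
    """Map 'http.<section>.<name>' to the files defining it, for duplicates only."""
    seen: dict[str, list[str]] = defaultdict(list)
    for filename, config in configs.items():
        # An empty YAML file parses to None, so fall back to an empty dict.
        http = (config or {}).get("http") or {}
        for section in SECTIONS:
            for name in (http.get(section) or {}):
                seen[f"http.{section}.{name}"].append(filename)
    return {key: files for key, files in seen.items() if len(files) > 1}


# Hypothetical sample data: the same service name defined in two files.
sample = {
    "homepage-widgets.yml": {"http": {"services": {"jellyseerr": {}}}},
    "jellyseerr.yml": {
        "http": {"services": {"jellyseerr": {}}, "routers": {"jellyseerr": {}}}
    },
}
duplicates = find_duplicate_names(sample)
```

Here `duplicates` contains a single entry, `http.services.jellyseerr`, listing both files.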

**Lesson:** In this outage the Traefik file provider behaved as all-or-nothing: one broken YAML file took down the entire provider, including every file-defined router and middleware (even unrelated ones). Validate YAML before deploying.
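A pre-deploy check along these lines could catch this class of error. This is a minimal sketch under stated assumptions: `find_invalid_files` is a hypothetical helper, and by default it relies on PyYAML (`yaml.safe_load`), a third-party package assumed to be installed. It only detects YAML syntax errors and does not validate the content against Traefik's configuration schema.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional


def find_invalid_files(
    paths: Iterable[Path],
    parse: Optional[Callable[[str], object]] = None,
) -> dict[str, str]:
    """Return {path: error message} for every file that fails to parse."""
    if parse is None:
        import yaml  # PyYAML, third-party: assumed to be installed

        parse = yaml.safe_load
    errors: dict[str, str] = {}
    for path in paths:
        try:
            parse(Path(path).read_text(encoding="utf-8"))
        except Exception as exc:  # report any parse failure instead of stopping
            errors[str(path)] = f"{type(exc).__name__}: {exc}"
    return errors


# Example (left as a comment because it depends on the repo checkout):
# dynamic_dir = Path("homelab/ubuntu/traefik/config/dynamic")
# errors = find_invalid_files(sorted(dynamic_dir.glob("*.yml")))
# for name, message in errors.items():
#     print(f"{name}: {message}")
# raise SystemExit(1 if errors else 0)
```

Exiting non-zero on any error makes the check usable as a gate in a deploy script or a pre-commit hook.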

## Related

- [[traefik]]: Traefik entity page
- [[grizzley]]: RPi5 edge node running edge Traefik
- [[ubuntu]]: Primary Docker host running primary Traefik
- [[truenas]]: NFS storage for cert sync
- [[docker-traefik-stack]]: Docker, Traefik, and container orchestration