---
title: Traefik High Availability
created: 2026-04-28
updated: 2026-05-14
type: concept
tags: [concept, networking, services]
sources: [../../homelab/architecture.md, ../../platform-config/overview.md]
---

# Traefik High Availability

Two Traefik v3.6.7 instances provide ingress: one on ubuntu (primary router) and one on grizzley (edge ACME). Certificates are shared between them over an NFS mount.

## Architecture

```
   Internet → Cloudflare DNS → *.tophermayor.com
                         ↓
        ┌────────────────┴────────────────┐
        ↓                                 ↓
grizzley Traefik                   ubuntu Traefik
   (edge ACME)                    (primary router)
  192.168.50.84                     192.168.50.61
        │                                 │
        │        TLS certs on NFS         │
        └──────────→ /mnt/truenas/traefik-certs/grizzley ←─┘
```

## Roles

| Instance | Host | Primary Role |
|----------|------|--------------|
| Traefik Pi | grizzley (192.168.50.84) | Edge ACME: generates wildcard certs via Cloudflare DNS challenge |
| Traefik (ubuntu) | ubuntu (192.168.50.61) | Primary router: handles ~90% of traffic, syncs certs from grizzley |

## Certificate Flow

1. Grizzley Traefik runs the Cloudflare DNS challenge and writes certs to the NFS mount `/mnt/truenas/traefik-certs/grizzley`
2. Ubuntu Traefik reads the same certs from the NFS share
3. Both instances serve the same wildcard `*.tophermayor.com` cert

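The claim in step 3 can be spot-checked from any machine on the LAN. The following is a minimal, stdlib-only sketch and not one of the vault's automation scripts: it connects to each instance by IP, sends a hostname via SNI, and compares SHA-256 fingerprints of the leaf certificates. The function names are hypothetical, and using `gitea.tophermayor.com` as the probe hostname is an assumption; substitute any hostname that both instances route.

```python
import hashlib
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER-encoded certificate as hex."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_leaf_cert(ip: str, sni: str, port: int = 443, timeout: float = 5.0) -> bytes:
    """Connect to `ip`, present `sni` as the TLS server name, return the leaf cert (DER)."""
    # Default context verifies the chain and checks the cert against `sni`,
    # so an untrusted or mismatched certificate raises ssl.SSLError here.
    ctx = ssl.create_default_context()
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=sni) as tls:
            return tls.getpeercert(binary_form=True)


def same_cert(ips: list[str], sni: str) -> tuple[bool, dict[str, str]]:
    """Fetch the cert from every IP and report whether all fingerprints match."""
    prints = {ip: fingerprint(fetch_leaf_cert(ip, sni)) for ip in ips}
    return len(set(prints.values())) == 1, prints


# Example (needs LAN access; IPs are the ones listed in this note):
# ok, prints = same_cert(["192.168.50.84", "192.168.50.61"], "gitea.tophermayor.com")
```

A mismatch would suggest that one instance is serving a stale copy or its own default certificate rather than the shared one on NFS.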
## Dynamic Config Files

Located in `homelab/ubuntu/traefik/config/dynamic/`:

| File | Services |
|------|----------|
| `canonical-hosts.yml` | Grizzley ingress proxy, PVE OpenCode |
| `gitea.yml` | gitea.tophermayor.com |
| `immich.yml` | immich.tophermayor.com |
| `jellyfin.yml` | jellyfin.tophermayor.com |
| `media-stack.yml` | Sonarr, Radarr, SABnzbd, Prowlarr, qBittorrent |
| `middlewares.yml` | 30+ middleware definitions |
| `opencode.yml` | opencode.tophermayor.com |
| `proxmox.yml` | proxmox.local.tophermayor.com |
| `homepage-widgets.yml` | Homepage service definitions |
| `audiobookshelf.yml` | Audiobookshelf (CT 108) |
| `jellyseerr.yml` | Jellyseerr (CT 106) |
| `kavita.yml` | Kavita (CT 108) |
| `navidrome.yml` | Navidrome (CT 107) |
| `stremio.yml` | Stremio Server |

## Common Middlewares

| Middleware | Purpose |
|------------|---------|
| `local-only@file` | Restrict to local network IPs |
| `authentik-auth@file` | SSO authentication |
| `security-headers@file` | Add security headers |
| `crowdsec-bouncer@file` | Rate limiting and threat protection |

## Entry Points

- `web`: port 80, HTTP → HTTPS redirect
- `websecure`: port 443, TLS termination
- `metrics`: port 8080, Prometheus metrics

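The `web` redirect can be confirmed without a browser. This is a rough sketch under stated assumptions rather than an existing script: `is_https_redirect` and `probe` are hypothetical helper names, and the host and IP in the example call are taken from this note purely for illustration. It sends one plain-HTTP request to port 80 and checks for a 3xx response whose `Location` points at `https://`.

```python
import http.client
from typing import Optional


def is_https_redirect(status: int, location: Optional[str]) -> bool:
    """True when the response is a 3xx redirect to an https:// URL."""
    return 300 <= status < 400 and bool(location) and location.lower().startswith("https://")


def probe(host: str, ip: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Send GET / to `ip`:`port` with `host` as the Host header; do not follow redirects."""
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # http.client never follows redirects, so the first response is what we inspect.
        conn.request("GET", "/", headers={"Host": host})
        resp = conn.getresponse()
        return is_https_redirect(resp.status, resp.getheader("Location"))
    finally:
        conn.close()


# Example (needs LAN access):
# probe("gitea.tophermayor.com", "192.168.50.61")
```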
## Outage Postmortem: 2026-05-14

**Severity:** Complete file provider failure. All `@file` routers and dependent `@docker` routers went offline.

**Root Cause:** A media migration wrote 7 dynamic config YAML files with mangled backtick quoting, which caused Traefik's file provider to fail parsing entirely.

**Affected Files:**
- `homepage-widgets.yml`
- `audiobookshelf.yml`
- `jellyseerr.yml`
- `kavita.yml`
- `navidrome.yml`
- `stremio.yml`
- `media-stack.yml`

**Impact:**
- All `@file` routers down (no traffic routed to file-defined services)
- All `@docker` routers that depend on the `local-only@file` middleware also failed
- Homepage, media services, and any service using file-defined middlewares were unreachable

**Fix:** Rewrote all 7 YAML files with correct quoting. Renamed service names in `homepage-widgets.yml` that collided with other provider definitions.

**Lesson:** The Traefik file provider is all-or-nothing: one broken YAML file takes down the entire provider, including every file-defined router and middleware (even unrelated ones). Validate YAML before deploying.

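One way to act on this lesson is a pre-deploy syntax check over the dynamic config directory. The sketch below is an assumption about how that could look, not a script that exists in the vault: it assumes PyYAML is installed, and `load_yaml`, `find_broken_files`, and `main` are hypothetical names. The parser is injectable so the directory walk can be exercised without PyYAML.

```python
from pathlib import Path
from typing import Any, Callable


def load_yaml(text: str) -> Any:
    """Parse YAML with PyYAML (third-party; assumed to be installed)."""
    import yaml  # imported lazily so the walk below works with any parser

    return yaml.safe_load(text)


def find_broken_files(
    config_dir: Path, parse: Callable[[str], Any] = load_yaml
) -> dict[str, str]:
    """Return {filename: error} for every *.yml file that fails to parse."""
    broken: dict[str, str] = {}
    for path in sorted(config_dir.glob("*.yml")):
        try:
            parse(path.read_text(encoding="utf-8"))
        except Exception as exc:  # parser-specific errors, e.g. yaml.YAMLError
            broken[path.name] = f"{type(exc).__name__}: {exc}"
    return broken


def main(argv: list[str]) -> int:
    """Print each broken file and return a shell-style exit code."""
    # Default is the directory named in this note; override with the first argument.
    target = Path(argv[0]) if argv else Path("homelab/ubuntu/traefik/config/dynamic")
    broken = find_broken_files(target)
    for name, err in broken.items():
        print(f"{name}: {err}")
    return 1 if broken else 0
```

A deploy step could call `raise SystemExit(main(sys.argv[1:]))` and abort on a non-zero exit. This only catches syntax errors; it would not have caught the service-name collisions in `homepage-widgets.yml`, which parse as valid YAML.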
## Related

- [[traefik]]: Traefik entity page
- [[grizzley]]: RPi5 edge node running edge Traefik
- [[ubuntu]]: Primary Docker host running primary Traefik
- [[truenas]]: NFS storage for cert sync
- [[docker-traefik-stack]]: Docker, Traefik, and container orchestration