title: Traefik High Availability
created: 2026-04-28
updated: 2026-05-14
type: concept
tags: concept, networking, services
sources: ../../homelab/architecture.md, ../../platform-config/overview.md

Traefik High Availability

Two Traefik v3.6.7 instances provide ingress — one on ubuntu (primary router), one on grizzley (edge ACME). Certificates are synced via NFS.

Architecture

Internet → Cloudflare DNS → *.tophermayor.com
                               ↓
              ┌────────────────┴────────────────┐
              ↓                                  ↓
    grizzley Traefik                    ubuntu Traefik
    (edge ACME)                         (primary router)
    192.168.50.84                      192.168.50.61
              │                                  │
              │  TLS certs on NFS               │
              └──────────→ /mnt/truenas/traefik-certs/grizzley ←─┘

Roles

| Instance | Host | Primary Role |
|----------|------|--------------|
| Traefik Pi | grizzley (192.168.50.84) | Edge ACME: generates wildcard certs via Cloudflare DNS challenge |
| Traefik (ubuntu) | ubuntu (192.168.50.61) | Primary router: handles ~90% of traffic, syncs certs from grizzley |

Certificate Flow

  1. Grizzley Traefik completes the Cloudflare DNS challenge and writes the certs to the NFS mount /mnt/truenas/traefik-certs/grizzley
  2. Ubuntu Traefik reads the same certs from that NFS share
  3. Both instances serve the same wildcard *.tophermayor.com cert
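
One way to confirm that the shared store really holds the wildcard is to read the ACME storage file on the NFS mount. The sketch below assumes the store follows Traefik's usual acme.json layout (a resolver name mapping to a `Certificates` list whose entries carry `domain.main` and `domain.sans`). The file name `acme.json` under the mount and the resolver name in the usage note are assumptions, not details taken from this setup.

```python
import json
from pathlib import Path

# Mount path is from this page; the file name acme.json is an assumption.
ACME_STORE = Path("/mnt/truenas/traefik-certs/grizzley/acme.json")


def list_cert_domains(acme: dict) -> dict[str, list[str]]:
    """Map each resolver name to the domains covered by its stored certificates."""
    domains: dict[str, list[str]] = {}
    for resolver, data in acme.items():
        names: list[str] = []
        for cert in (data or {}).get("Certificates") or []:
            domain = cert.get("domain", {})
            if domain.get("main"):
                names.append(domain["main"])
            names.extend(domain.get("sans") or [])
        domains[resolver] = names
    return domains


def has_wildcard(acme: dict, wildcard: str = "*.tophermayor.com") -> bool:
    """True if any resolver holds a certificate that lists the wildcard name."""
    return any(wildcard in names for names in list_cert_domains(acme).values())


def load_store(path: Path = ACME_STORE) -> dict:
    """Read and parse the ACME store from disk."""
    return json.loads(path.read_text())
```

Calling `has_wildcard(load_store())` on either host should return the same answer if both mounts point at the same share; a resolver named, say, `cloudflare` would appear as a key in the result of `list_cert_domains`.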

Dynamic Config Files

Located in homelab/ubuntu/traefik/config/dynamic/:

| File | Services |
|------|----------|
| canonical-hosts.yml | Grizzley ingress proxy, PVE OpenCode |
| gitea.yml | gitea.tophermayor.com |
| immich.yml | immich.tophermayor.com |
| jellyfin.yml | jellyfin.tophermayor.com |
| media-stack.yml | Sonarr, Radarr, SABnzbd, Prowlarr, qBittorrent |
| middlewares.yml | 30+ middleware definitions |
| opencode.yml | opencode.tophermayor.com |
| proxmox.yml | proxmox.local.tophermayor.com |
| homepage-widgets.yml | Homepage service definitions |
| audiobookshelf.yml | Audiobookshelf (CT 108) |
| jellyseerr.yml | Jellyseerr (CT 106) |
| kavita.yml | Kavita (CT 108) |
| navidrome.yml | Navidrome (CT 107) |
| stremio.yml | Stremio Server |

Common Middlewares

| Middleware | Purpose |
|------------|---------|
| local-only@file | Restrict to local network IPs |
| authentik-auth@file | SSO authentication |
| security-headers@file | Add security headers |
| crowdsec-bouncer@file | Rate limiting and threat protection |
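
To make the local-only restriction concrete, the sketch below shows the kind of CIDR match an IP allow-list performs. The range 192.168.50.0/24 is an assumption inferred from the host addresses on this page; the real source ranges live in middlewares.yml and may differ.

```python
import ipaddress

# Assumed local range, inferred from the host IPs above (not read from middlewares.yml).
LOCAL_RANGES = [ipaddress.ip_network("192.168.50.0/24")]


def is_local(client_ip: str, ranges=LOCAL_RANGES) -> bool:
    """Return True if client_ip falls inside any of the allowed CIDR ranges."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ranges)
```

For example, `is_local("192.168.50.61")` is true for the ubuntu host, while a public address such as `8.8.8.8` is rejected.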

Entry Points

  • web — port 80, HTTP → HTTPS redirect
  • websecure — port 443, TLS termination
  • metrics — port 8080, Prometheus metrics
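
A quick way to verify the web entry point behaves as described is to request a host over plain HTTP and check that the answer is a redirect to HTTPS. This is a minimal sketch using only the standard library; the helper names are hypothetical, and which redirect status Traefik returns depends on its configuration, so several codes are accepted.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit

REDIRECT_CODES = {301, 302, 307, 308}


def is_https_redirect(status: int, location: Optional[str], host: str) -> bool:
    """True if the response redirects to the same host over HTTPS."""
    if status not in REDIRECT_CODES or not location:
        return False
    target = urlsplit(location)
    return target.scheme == "https" and target.hostname == host


def probe_web_entrypoint(address: str, host: str, timeout: float = 5.0) -> bool:
    """Send one GET to port 80 at address with the given Host header (needs network access)."""
    conn = http.client.HTTPConnection(address, 80, timeout=timeout)
    try:
        conn.request("GET", "/", headers={"Host": host})
        resp = conn.getresponse()
        return is_https_redirect(resp.status, resp.getheader("Location"), host)
    finally:
        conn.close()
```

Running `probe_web_entrypoint("192.168.50.61", "gitea.tophermayor.com")` from inside the network would exercise the ubuntu instance directly, bypassing DNS.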

Outage Postmortem: 2026-05-14

Severity: Complete file provider failure — all @file routers and dependent @docker routers offline.

Root Cause: Media migration wrote 7 YAML dynamic config files with mangled backtick quoting, causing Traefik's file provider to fail parsing entirely.

Affected Files:

  • homepage-widgets.yml
  • audiobookshelf.yml
  • jellyseerr.yml
  • kavita.yml
  • navidrome.yml
  • stremio.yml
  • media-stack.yml

Impact:

  • ALL @file routers down (no traffic routed to file-defined services)
  • ALL @docker routers depending on local-only@file middleware also failed
  • Homepage, media services, and any service using file-defined middlewares unreachable

Fix: Rewrote all 7 YAML files with correct quoting. Renamed conflicting service names in homepage-widgets.yml that were colliding with other provider definitions.
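
The name collisions could be caught ahead of time with a small check over the parsed files. The sketch below assumes the collisions were service names defined in more than one dynamic file, and that each file has already been parsed into a dictionary with the usual `http.services` nesting; the function name and the sample names in the usage note are hypothetical.

```python
from collections import defaultdict


def find_duplicate_services(configs: dict[str, dict]) -> dict[str, list[str]]:
    """Return service names defined in more than one file, with the files defining them.

    configs maps a file name to its already-parsed dynamic configuration.
    """
    seen: dict[str, list[str]] = defaultdict(list)
    for filename, config in configs.items():
        # Tolerate empty files and files without an http.services section.
        services = ((config or {}).get("http") or {}).get("services") or {}
        for name in services:
            seen[name].append(filename)
    return {name: files for name, files in seen.items() if len(files) > 1}
```

Given two files that both define a service called `jellyfin`, the result maps `jellyfin` to both file names, which points directly at the definition that needs renaming.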

Lesson: Traefik file provider is all-or-nothing — one broken YAML file crashes the entire provider, taking down all file-defined routers and middlewares (even unrelated ones). Validate YAML before deploying.
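
A pre-deploy check for this specific failure can be done with the standard library alone. YAML reserves the backtick as an indicator, so a plain (unquoted) value cannot start with one. The sketch below is a narrow heuristic that flags such lines; it is not a full YAML parse, which would be the stronger check, and the function names are hypothetical.

```python
from pathlib import Path


def backtick_values(text: str) -> list[int]:
    """Return 1-based line numbers whose YAML value starts with a backtick."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Drop a leading list marker so "- `x`" is checked like a bare value.
        if stripped.startswith("- "):
            stripped = stripped[2:].lstrip()
        # For "key: value" lines, look only at the value part.
        _, sep, value = stripped.partition(": ")
        candidate = value.lstrip() if sep else stripped
        if candidate.startswith("`"):
            hits.append(lineno)
    return hits


def lint_files(paths: list[str]) -> int:
    """Print every offending line and return 1 if any was found, else 0."""
    failed = False
    for path in paths:
        for lineno in backtick_values(Path(path).read_text()):
            print(f"{path}:{lineno}: value starts with a backtick")
            failed = True
    return 1 if failed else 0
```

Passing the files under homelab/ubuntu/traefik/config/dynamic/ to `lint_files` and refusing to deploy on a non-zero result would catch a rule written as a bare backtick string, while a correctly double-quoted rule that merely contains backticks passes.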

  • traefik — Traefik entity page
  • grizzley — RPi5 edge node running edge Traefik
  • ubuntu — Primary Docker host running primary Traefik
  • truenas — NFS storage for cert sync
  • docker-traefik-stack — Docker, Traefik, and container orchestration