Initial commit: homelab configuration and documentation
docs/guides/traefik_fix_guide.md
# Final Traefik v3 Setup and Fix Guide

This guide provides the complete, step-by-step process to cleanly remove any old Traefik configurations and deploy a fresh, working Traefik v3 setup on Docker Swarm.

**Follow these steps in order on your Docker Swarm manager node.**

---

### Step 1: Complete Removal of Old Traefik Components

First, we will ensure the environment is completely clean.

1.  **Remove the Stack:**
    - In Portainer, go to "Stacks", select your `networking-stack`, and click **Remove**. Wait until the stack has been removed.

2.  **Remove the Docker Config:**
    - Run this command in your manager node's terminal:
      ```zsh
      docker config rm traefik.yml
      ```
      *(It's okay if this command says the config doesn't exist.)*

3.  **Remove the Docker Volume:**
    - This will delete your old Let's Encrypt certificates, which is necessary for a clean start.
      ```zsh
      docker volume rm traefik_letsencrypt
      ```
      *(It's okay if this command says the volume doesn't exist.)*

4.  **Remove the Local Config File (if it exists):**
    ```zsh
    # -f keeps rm quiet when the file is already gone
    rm -f ./traefik.yml
    ```
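
The removal commands above print an error when the object is already gone. As a minimal sketch, assuming you would rather have the cleanup be quiet and repeatable, a small helper can check for the object first. The function name `remove_if_present` is made up for this example; it only wraps `docker config|volume inspect` and `docker config|volume rm`.

```shell
# Hypothetical helper: remove a Swarm config or a volume only if it exists.
# $1 is the object kind ("config" or "volume"), $2 is the object name.
remove_if_present() {
  local kind="$1" name="$2"
  if docker "$kind" inspect "$name" >/dev/null 2>&1; then
    docker "$kind" rm "$name"
  else
    echo "skip: $kind $name not found"
  fi
}
```

For the objects in this step, that would be `remove_if_present config traefik.yml` followed by `remove_if_present volume traefik_letsencrypt`.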

---

### Step 2: Create the Correct Traefik v3 Configuration

We will use the `busybox` container method to create the configuration file.

1.  **Create `traefik.yml`:**
    - **IMPORTANT:** Replace `your-email@example.com` with your actual email address in the block below.
    - Copy the entire multi-line block, including the final `EOF` line, and paste it into your Zsh terminal.
    - The heredoc is fed to the container on standard input, so the command finishes on its own at the `EOF` line. If the terminal still shows a `>` prompt, the last line was not pasted: type `EOF` and press Enter.

    ```zsh
    # --- Writes traefik.yml into the current directory through a temporary container ---
    docker run --rm -i -v "$(pwd):/host" busybox sh -c 'cat > /host/traefik.yml' <<'EOF'
    checkNewVersion: true
    sendAnonymousUsage: false

    log:
      level: INFO

    api:
      dashboard: true
      insecure: false

    entryPoints:
      web:
        address: ":80"
        http:
          redirections:
            entryPoint:
              to: websecure
              scheme: https
      websecure:
        address: ":443"
        http:
          tls:
            certResolver: leresolver

    providers:
      swarm: # <-- Use the swarm provider in Traefik v3
        endpoint: "unix:///var/run/docker.sock"
        network: traefik-public
        exposedByDefault: false

      # Optionally keep the docker provider if you run non-swarm local containers.
      # docker:
      #   network: traefik-public
      #   exposedByDefault: false

    certificatesResolvers:
      leresolver:
        acme:
          email: "your-email@example.com"
          storage: "/letsencrypt/acme.json"
          dnsChallenge:
            provider: duckdns
            delayBeforeCheck: 30s
            resolvers:
              - "192.168.1.196:53"
              - "192.168.1.245:53"
              - "192.168.1.62:53"
    EOF
    ```

2.  **Create the Docker Swarm Config:**
    - This command ingests the file you just created into Swarm.
      ```zsh
      docker config create traefik.yml ./traefik.yml
      ```

3.  **Create and Prepare the Let's Encrypt Volume:**
    - Create the volume:
      ```zsh
      docker volume create traefik_letsencrypt
      ```
    - Create the empty `acme.json` file with the correct permissions:
      ```zsh
      docker run --rm -v traefik_letsencrypt:/letsencrypt busybox sh -c "touch /letsencrypt/acme.json && chmod 600 /letsencrypt/acme.json"
      ```
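
Swarm configs cannot be edited in place, so a `traefik.yml` that still contains the placeholder email has to be removed and recreated. As a minimal sketch, assuming you want to catch that before running `docker config create`, the check below greps the file for the placeholder. The function name `check_acme_email` is invented for this example.

```shell
# Hypothetical pre-flight check: fail if the template's placeholder
# e-mail address is still present in the given file.
check_acme_email() {
  local file="$1"
  if grep -qF 'your-email@example.com' "$file"; then
    echo "placeholder e-mail still present in $file" >&2
    return 1
  fi
  echo "ok: $file"
}
```

One way to use it is to gate the config creation: `check_acme_email ./traefik.yml && docker config create traefik.yml ./traefik.yml`.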

---

### Step 3: Deploy the Corrected `networking-stack`

1.  **Deploy via Portainer:**
    - Go to "Stacks" > "Add stack".
    - Name it `networking-stack`.
    - Copy the YAML content below and paste it into the web editor.
    - **IMPORTANT:** Replace `YOUR_DUCKDNS_TOKEN` with your actual DuckDNS token.
    - Click "Deploy the stack".

    ```yaml
    version: '3.9'

    networks:
      traefik-public:
        external: true

    volumes:
      traefik_letsencrypt:
        external: true

    configs:
      traefik_yml:
        external: true
        name: traefik.yml

    services:
      traefik:
        image: traefik:latest # Or pin to traefik:v3.0 for stability
        ports:
          - "80:80"
          - "443:443"
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock
          - traefik_letsencrypt:/letsencrypt
        networks:
          - traefik-public
        environment:
          - "DUCKDNS_TOKEN=YOUR_DUCKDNS_TOKEN"
        configs:
          - source: traefik_yml
            target: /traefik.yml
        deploy:
          labels:
            - "traefik.enable=true"
            - "traefik.http.routers.traefik.rule=Host(`traefik.sj98.duckdns.org`)"
            - "traefik.http.routers.traefik.entrypoints=websecure"
            - "traefik.http.routers.traefik.tls.certresolver=leresolver"
            - "traefik.http.routers.traefik.service=api@internal"
            # Swarm has no port detection, so a labeled service needs a port label
            # even when it routes to api@internal. The value is a placeholder.
            - "traefik.http.services.dummy-svc.loadbalancer.server.port=9999"
          placement:
            constraints:
              - node.role == manager

      whoami:
        image: traefik/whoami
        networks:
          - traefik-public
        deploy:
          labels:
            - "traefik.enable=true"
            - "traefik.http.routers.whoami.rule=Host(`whoami.sj98.duckdns.org`)"
            - "traefik.http.routers.whoami.entrypoints=websecure"
            - "traefik.http.routers.whoami.tls.certresolver=leresolver"
            - "traefik.http.services.whoami.loadbalancer.server.port=80"
    ```
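
The stack declares `traefik-public` as an external network, so the network has to exist before you click "Deploy the stack". This guide does not show how it was created. As a minimal sketch, assuming it should be an attachable overlay network (the `--attachable` flag is an assumption, not something the guide specifies), a helper could create it only when it is missing. The function name `ensure_overlay_network` is made up for this example.

```shell
# Hypothetical helper: create an overlay network unless it already exists.
ensure_overlay_network() {
  local name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
  else
    # --attachable is an assumption; drop it if only Swarm services join.
    docker network create --driver overlay --attachable "$name"
  fi
}
```

On the manager node this would be run as `ensure_overlay_network traefik-public`.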

---

### Step 4: Verify and Redeploy Other Stacks

1.  **Wait and Verify:**
    - Wait 2-3 minutes for the stack to deploy and for the certificate to be issued.
    - Open your browser and navigate to `https://traefik.sj98.duckdns.org`. The Traefik dashboard should load.
    - You should see routers for `traefik` and `whoami`.

2.  **Redeploy Corrected Stacks:**
    - Now that Traefik is working, go to Portainer and redeploy your `full-stack-complete.yml` and `monitoring-stack.yml` to apply the fixes we made earlier.
    - The services from those stacks (Paperless, Prometheus, etc.) should now appear in the Traefik dashboard and be accessible via their URLs.
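
The browser check above can also be done from a terminal. As a rough sketch, assuming `curl` is installed on the machine you test from, the helper below prints only the HTTP status code for a URL. The function name `http_status` is invented for this example, and which status counts as "working" depends on the route (a redirect is plausible for the dashboard root).

```shell
# Hypothetical helper: print the HTTP status code for a URL.
# curl reports 000 when the request itself fails (DNS, TLS, timeout).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}
```

For example, `http_status https://whoami.sj98.duckdns.org` or `http_status https://traefik.sj98.duckdns.org/dashboard/`. A `000` result usually points at DNS or certificate problems rather than at the router labels.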

### ChatGPT Fix

---

Traefik Swarm Stack Fix Instructions

#### 1. Verify Networks

Make sure all web-exposed services are attached to the `traefik-public` network:

    networks:
      - traefik-public

Internal-only services (DB, Redis, etc.) should not be on the Traefik network.

---

#### 2. Assign Unique Router Names

Every service exposed via Traefik must have a unique router name in its labels. In Swarm, these labels belong under `deploy.labels`, as in the stack from Step 3:

    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.<service>-router.rule=Host(`<subdomain>.sj98.duckdns.org`)"
      - "traefik.http.routers.<service>-router.entrypoints=websecure"
      - "traefik.http.routers.<service>-router.tls.certresolver=leresolver"
      - "traefik.http.routers.<service>-router.service=<service>@swarm"
      - "traefik.http.services.<service>.loadbalancer.server.port=<port>"

Replace `<service>`, `<subdomain>`, and `<port>` for each stack.
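
Filling in the placeholders by hand for every stack is error-prone. As a minimal sketch, the function below prints the six label lines from the template above for one service. The function name `traefik_labels` and the example arguments are hypothetical; the domain `sj98.duckdns.org` is the one used throughout this guide.

```shell
# Hypothetical generator for the label template above.
# Arguments: service name, subdomain, container port.
traefik_labels() {
  local svc="$1" sub="$2" port="$3"
  local domain="sj98.duckdns.org"
  printf -- '- "traefik.enable=true"\n'
  printf -- '- "traefik.http.routers.%s-router.rule=Host(`%s.%s`)"\n' "$svc" "$sub" "$domain"
  printf -- '- "traefik.http.routers.%s-router.entrypoints=websecure"\n' "$svc"
  printf -- '- "traefik.http.routers.%s-router.tls.certresolver=leresolver"\n' "$svc"
  printf -- '- "traefik.http.routers.%s-router.service=%s@swarm"\n' "$svc" "$svc"
  printf -- '- "traefik.http.services.%s.loadbalancer.server.port=%s"\n' "$svc" "$port"
}
```

A call such as `traefik_labels paperless docs 8000` (service name, subdomain, and port chosen only for illustration) prints a block that can be pasted under `deploy.labels` and indented to match the stack file.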

---

#### 3. Update Traefik ACME Configuration

In `traefik.yml`, use:

    certificatesResolvers:
      leresolver:
        acme:
          email: "your-email@example.com"
          storage: "/letsencrypt/acme.json"
          dnsChallenge:
            provider: duckdns
            propagation:
              delayBeforeChecks: 60s
            resolvers:
              - "192.168.1.196:53"
              - "192.168.1.245:53"
              - "192.168.1.62:53"

Note: `delayBeforeCheck` is deprecated. Use `propagation.delayBeforeChecks` instead.
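
To find out whether an existing `traefik.yml` still uses the deprecated key, a plain text search is enough. The sketch below is one way to do it; the function name `find_deprecated_acme_keys` is invented for this example, and it only looks for the one key named in the note above.

```shell
# Hypothetical check: report the deprecated dnsChallenge key in a config file.
# Returns 0 when the file is clean and 1 when the key is found.
find_deprecated_acme_keys() {
  local file="$1"
  # Matches "delayBeforeCheck:" only. The new "delayBeforeChecks:" key
  # does not match because an "s" precedes its colon.
  if grep -nE '^[[:space:]]*delayBeforeCheck:' "$file"; then
    echo "deprecated: move the value to propagation.delayBeforeChecks" >&2
    return 1
  fi
  return 0
}
```

Run it as `find_deprecated_acme_keys ./traefik.yml`; matching lines are printed with their line numbers.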

---

#### 4. Internal Services Configuration

- Redis / Postgres / other internal services: do not expose them via Traefik. Attach them to backend networks only:

      networks:
        - homelab-backend

- Only web services should have Traefik labels.

---

#### 5. Deploy Services Correctly

1. Deploy Traefik first.
2. Deploy each routed service one at a time to allow ACME certificate issuance.
3. Check the Traefik logs for any `Router defined multiple times` or `port is missing` errors.
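
The log check in the last step can be narrowed to the two messages it names. As a minimal sketch, the filter below reads log text on standard input and keeps only lines containing either phrase. The function name `scan_traefik_errors` is made up for this example, and the exact log wording may differ between Traefik versions, so treat a clean result as a hint rather than proof.

```shell
# Hypothetical filter: print log lines that mention either routing error.
# Exit status is 0 when none are found and 1 otherwise.
scan_traefik_errors() {
  if grep -iE 'router defined multiple times|port is missing'; then
    return 1
  fi
  return 0
}
```

With the stack and service names from Step 3, the Swarm service is `networking-stack_traefik`, so one possible invocation is `docker service logs networking-stack_traefik 2>&1 | scan_traefik_errors`.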

---

#### 6. Checklist for Each Service

| Service | Hostname | Port | Traefik Router Name | Network | Notes |
|---|---|---|---|---|---|
| example-svc | example.sj98.duckdns.org | 8080 | example-svc-router | traefik-public | Replace placeholders |
| another-svc | another.sj98.duckdns.org | 8000 | another-svc-router | traefik-public | Only if web-exposed |

- Fill in each service's hostname, port, and network.
- Internal services do not need Traefik labels.

---

#### 7. Common Issues

- Duplicate Router Names: Make sure every router has a unique name.
- Missing Ports: In Swarm, each exposed service must declare its port with `traefik.http.services.<service>.loadbalancer.server.port`.
- ACME Failures: Ensure the DuckDNS token is correct and a propagation delay is set.
- Wrong Network: Only services on `traefik-public` are routable; internal services must use backend networks.
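
Duplicate router names are easiest to spot before deploying. As a rough sketch, assuming each stack lives in its own YAML file, the function below lists router names that appear in more than one of the files passed to it. The function name `duplicate_routers` is invented for this example, and it does not detect two services reusing a name inside the same file.

```shell
# Hypothetical check: list router names defined in more than one stack file.
duplicate_routers() {
  local f
  for f in "$@"; do
    # Collect each router name once per file.
    grep -oE 'traefik\.http\.routers\.[A-Za-z0-9_-]+' "$f" | sort -u
  done | sort | uniq -d | sed 's/^traefik\.http\.routers\.//'
}
```

For the stacks mentioned in Step 4, that would be `duplicate_routers full-stack-complete.yml monitoring-stack.yml`. Empty output means no router name is shared between the files.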