Final Traefik v3 Setup and Fix Guide

This guide provides the complete, step-by-step process to cleanly remove any old Traefik configurations and deploy a fresh, working Traefik v3 setup on Docker Swarm.

Follow these steps in order on your Docker Swarm manager node.


Step 1: Complete Removal of Old Traefik Components

First, we will ensure the environment is completely clean.

  1. Remove the Stack:

    • In Portainer, go to "Stacks", select your networking-stack, and click Remove. Wait for it to be successfully removed.
  2. Remove the Docker Config:

    • Run this command in your manager node's terminal:
    docker config rm traefik.yml
    

    (It's okay if this command says the config doesn't exist.)

  3. Remove the Docker Volume:

    • This will delete your old Let's Encrypt certificates, which is necessary for a clean start.
    docker volume rm traefik_letsencrypt
    

    (It's okay if this command says the volume doesn't exist.)

  4. Remove the Local Config File (if it exists):

    rm -f ./traefik.yml
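
Removal steps 2 to 4 above can also be run in one pass from the manager node's shell. This is a minimal sketch rather than part of the original procedure: the function name cleanup_traefik is hypothetical, and it assumes the stack has already been removed in Portainer so that the config and volume are no longer in use.

```shell
# Hypothetical helper: runs removal steps 2-4 and keeps going when an item is already gone.
cleanup_traefik() {
    # Each docker command fails if the object is missing or still in use;
    # report that and continue, since a missing object is fine for a clean start.
    docker config rm traefik.yml || echo "config traefik.yml not removed (missing or in use)"
    docker volume rm traefik_letsencrypt || echo "volume traefik_letsencrypt not removed (missing or in use)"
    # -f keeps rm quiet when the local file does not exist.
    rm -f ./traefik.yml
}
```

Run cleanup_traefik from the directory that holds the old traefik.yml. A "not removed" message for an object that never existed is harmless; one for an object that is still in use means the stack has not finished shutting down.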
    

Step 2: Create the Correct Traefik v3 Configuration

We will use the busybox container method to create the configuration file.

  1. Create traefik.yml:

    • IMPORTANT: Replace your-email@example.com with your actual email address in the block below.
    • Copy the entire multi-line block, including the final EOF line, and paste it into your Zsh terminal.
    • While pasting, the terminal may show a continuation prompt ending in > on each new line. This is normal. If the prompt is still waiting after the paste, type EOF and press Enter to finish the command.
    # --- Writes traefik.yml into the current directory through a temporary container ---
    docker run --rm -i -v "$(pwd):/host" busybox sh -c 'cat > /host/traefik.yml' <<'EOF'
    global:
      checkNewVersion: true
      sendAnonymousUsage: false

    log:
      level: INFO

    api:
      dashboard: true
      insecure: false

    entryPoints:
      web:
        address: ":80"
        http:
          redirections:
            entryPoint:
              to: websecure
              scheme: https
      websecure:
        address: ":443"
        http:
          tls:
            certResolver: leresolver

    providers:
      swarm: # <-- Use the swarm provider in Traefik v3
        endpoint: "unix:///var/run/docker.sock"
        network: traefik-public
        exposedByDefault: false
      # Optionally keep the docker provider if you run non-swarm local containers.
      # docker:
      #   network: traefik-public
      #   exposedByDefault: false

    certificatesResolvers:
      leresolver:
        acme:
          email: "your-email@example.com"
          storage: "/letsencrypt/acme.json"
          dnsChallenge:
            provider: duckdns
            # Deprecated in favor of propagation.delayBeforeChecks; see the ACME section later in this guide.
            delayBeforeCheck: 30s
            resolvers:
              - "192.168.1.196:53"
              - "192.168.1.245:53"
              - "192.168.1.62:53"
    EOF

  2. Create the Docker Swarm Config:

    • This command ingests the file you just created into Swarm.
    docker config create traefik.yml ./traefik.yml
    
  3. Create and Prepare the Let's Encrypt Volume:

    • Create the volume:
    docker volume create traefik_letsencrypt
    
    • Create the empty acme.json file with the correct permissions:
    docker run --rm -v traefik_letsencrypt:/letsencrypt busybox sh -c "touch /letsencrypt/acme.json && chmod 600 /letsencrypt/acme.json"
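
Traefik rejects an acme.json whose permissions are more open than 600, so it can be worth reading the mode back before deploying. The following is a minimal sketch: the function name check_acme_perms is hypothetical, and it assumes the busybox image's stat applet accepts the -c '%a' format option.

```shell
# Hypothetical helper: succeeds only when acme.json exists in the volume with mode 600.
check_acme_perms() {
    # Read the octal mode of acme.json through a temporary container.
    mode="$(docker run --rm -v traefik_letsencrypt:/letsencrypt busybox \
        stat -c '%a' /letsencrypt/acme.json 2>/dev/null)"
    if [ "$mode" = "600" ]; then
        echo "acme.json OK (mode 600)"
    else
        echo "acme.json missing or wrong mode: '${mode}'" >&2
        return 1
    fi
}
```

If check_acme_perms fails, rerunning the touch and chmod command above should bring the file back to the expected state.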
    

Step 3: Deploy the Corrected networking-stack

  1. Deploy via Portainer:

    • Go to "Stacks" > "Add stack".
    • Name it networking-stack.
    • Copy the YAML content below and paste it into the web editor.
    • IMPORTANT: Replace YOUR_DUCKDNS_TOKEN with your actual DuckDNS token.
    • Click "Deploy the stack".
    version: '3.9'
    
    networks:
      traefik-public:
        external: true
    
    volumes:
      traefik_letsencrypt:
        external: true
    
    configs:
      traefik_yml:
        external: true
        name: traefik.yml
    
    services:
      traefik:
        image: traefik:latest # Or pin to traefik:v3.0 for stability
        ports:
          - "80:80"
          - "443:443"
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock
          - traefik_letsencrypt:/letsencrypt
        networks:
          - traefik-public
        environment:
          - "DUCKDNS_TOKEN=YOUR_DUCKDNS_TOKEN"
        configs:
          - source: traefik_yml
            target: /traefik.yml
        deploy:
          labels:
            - "traefik.enable=true"
            - "traefik.http.routers.traefik.rule=Host(`traefik.sj98.duckdns.org`)"
            - "traefik.http.routers.traefik.entrypoints=websecure"
            - "traefik.http.routers.traefik.tls.certresolver=leresolver"
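            - "traefik.http.routers.traefik.service=api@internal"
            # Swarm has no port auto-detection, so the dashboard router still needs a dummy port label.
            - "traefik.http.services.dummy-svc.loadbalancer.server.port=9999"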
          placement:
            constraints:
              - node.role == manager
    
      whoami:
        image: traefik/whoami
        networks:
          - traefik-public
        deploy:
          labels:
            - "traefik.enable=true"
            - "traefik.http.routers.whoami.rule=Host(`whoami.sj98.duckdns.org`)"
            - "traefik.http.routers.whoami.entrypoints=websecure"
            - "traefik.http.routers.whoami.tls.certresolver=leresolver"
            - "traefik.http.services.whoami.loadbalancer.server.port=80"
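
Because the stack declares traefik-public as external, the network has to exist before the stack is deployed; the guide assumes it was created earlier. A minimal sketch for checking this from the manager node follows. The function name ensure_traefik_network is hypothetical, and the choice of an attachable overlay network is an assumption about the setup.

```shell
# Hypothetical helper: creates the external overlay network only if it is missing.
ensure_traefik_network() {
    if docker network inspect traefik-public >/dev/null 2>&1; then
        echo "traefik-public already exists"
    else
        # Overlay networks span Swarm nodes; --attachable also lets standalone containers join.
        docker network create --driver overlay --attachable traefik-public
    fi
}
```

Running ensure_traefik_network before clicking "Deploy the stack" avoids a deployment failure caused by a missing external network.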
    

Step 4: Verify and Redeploy Other Stacks

  1. Wait and Verify:

    • Wait for 2-3 minutes for the stack to deploy and for the certificate to be issued.
    • Open your browser and navigate to https://traefik.sj98.duckdns.org. The Traefik dashboard should load.
    • You should see routers for traefik and whoami.
  2. Redeploy Corrected Stacks:

    • Now that Traefik is working, go to Portainer and redeploy your full-stack-complete.yml and monitoring-stack.yml to apply the fixes we made earlier.
    • The services from those stacks (Paperless, Prometheus, etc.) should now appear in the Traefik dashboard and be accessible via their URLs.
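
The wait-and-verify step can also be done from a terminal instead of a browser. The sketch below polls a URL with curl until it returns the expected HTTP status. The function name wait_for_status and its defaults (18 attempts, 10 seconds apart, roughly the 3 minutes mentioned above) are assumptions. Since curl verifies certificates by default, a 200 response also suggests that a trusted certificate is being served.

```shell
# Hypothetical helper: polls a URL until it answers with the expected status or attempts run out.
# Usage: wait_for_status URL EXPECTED [ATTEMPTS] [DELAY_SECONDS]
wait_for_status() {
    url="$1"; expected="$2"; attempts="${3:-18}"; delay="${4:-10}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s silent, -o discard the body, -w print only the status code.
        code="$(curl -s -o /dev/null -w '%{http_code}' "$url")"
        if [ "$code" = "$expected" ]; then
            echo "$url answered $code after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "$url did not return $expected (last status: $code)" >&2
    return 1
}
```

For example, wait_for_status https://whoami.sj98.duckdns.org 200 checks the whoami test service deployed in Step 3.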

Chat GPT Fix

Traefik Swarm Stack Fix Instructions

  1. Verify Networks

Make sure all web-exposed services are attached to the traefik-public network:

    networks:
      - traefik-public

Internal-only services (DB, Redis, etc.) should not be on the traefik-public network.

  2. Assign Unique Router Names

Every service exposed via Traefik must have a uniquely named router. In Swarm mode these labels belong under deploy.labels, as in the networking-stack above:

    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.<service>-router.rule=Host(`<subdomain>.sj98.duckdns.org`)"
      - "traefik.http.routers.<service>-router.entrypoints=websecure"
      - "traefik.http.routers.<service>-router.tls.certresolver=leresolver"
      - "traefik.http.routers.<service>-router.service=<service>@swarm"
      - "traefik.http.services.<service>.loadbalancer.server.port=<port>"

Replace <service>, <subdomain>, and <port> for each stack.
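
Typing these six labels by hand for every service is an easy way to end up with a duplicated or mistyped router name. As a minimal sketch, the hypothetical helper below prints the label block from a service name, a subdomain, and a port. The function name, argument order, and six-space indentation are assumptions, so adjust the indentation to match where the labels sit in your stack file.

```shell
# Hypothetical helper: prints the Traefik label block for one service.
# Usage: traefik_labels SERVICE SUBDOMAIN PORT
traefik_labels() {
    svc="$1"; sub="$2"; port="$3"
    router="${svc}-router"
    # Single-quoted printf formats keep the backticks around the hostname literal.
    printf '      - "traefik.enable=true"\n'
    printf '      - "traefik.http.routers.%s.rule=Host(`%s.sj98.duckdns.org`)"\n' "$router" "$sub"
    printf '      - "traefik.http.routers.%s.entrypoints=websecure"\n' "$router"
    printf '      - "traefik.http.routers.%s.tls.certresolver=leresolver"\n' "$router"
    printf '      - "traefik.http.routers.%s.service=%s@swarm"\n' "$router" "$svc"
    printf '      - "traefik.http.services.%s.loadbalancer.server.port=%s"\n' "$svc" "$port"
}
```

For example, traefik_labels example-svc example 8080 prints a block that can be pasted under the service's labels key.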

  3. Update Traefik ACME Configuration

In traefik.yml, use:

    certificatesResolvers:
      leresolver:
        acme:
          email: "your-email@example.com"
          storage: "/letsencrypt/acme.json"
          dnsChallenge:
            provider: duckdns
            propagation:
              delayBeforeChecks: 60s
            resolvers:
              - "192.168.1.196:53"
              - "192.168.1.245:53"
              - "192.168.1.62:53"

Note: delayBeforeCheck is deprecated. Use propagation.delayBeforeChecks.
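
A quick way to tell whether a given traefik.yml still uses the deprecated key is to search for it. This is a minimal sketch; the function name uses_deprecated_delay is hypothetical.

```shell
# Hypothetical helper: succeeds when the file still contains the deprecated key.
# Usage: uses_deprecated_delay FILE
uses_deprecated_delay() {
    # The colon right after "Check" keeps "delayBeforeChecks:" from matching.
    grep -Eq '^[[:space:]]*delayBeforeCheck:' "$1"
}
```

For example, uses_deprecated_delay ./traefik.yml && echo "update the dnsChallenge block" flags a file that still needs the change. Keep in mind that a Swarm config is immutable, so after editing the file the config has to be removed and recreated.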

  4. Internal Services Configuration

    • Redis / Postgres / other internal services: do not expose them via Traefik. Attach them to backend networks only:

    networks:
      - homelab-backend

    • Only web services should have Traefik labels.

  5. Deploy Services Correctly
    1. Deploy Traefik first.
    2. Deploy each routed service one at a time to allow ACME certificate issuance.
    3. Check the Traefik logs for "Router defined multiple times" or "port is missing" errors.
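
The log check in the last step can be narrowed to the two errors named there. A minimal sketch, assuming the exact wording of the messages may differ slightly between Traefik versions (hence the case-insensitive match); the function name scan_traefik_logs is hypothetical.

```shell
# Hypothetical helper: reads log text on stdin and prints only lines that mention
# a duplicated router or a missing port. Exits non-zero when nothing matches.
scan_traefik_logs() {
    # -i tolerates capitalisation differences; -E enables the alternation.
    grep -iE 'router defined multiple times|port is missing'
}
```

With the stack from Step 3, Swarm names the service networking-stack_traefik, so the check becomes docker service logs networking-stack_traefik 2>&1 | scan_traefik_logs. No output means neither error was logged.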

  6. Checklist for Each Service

| Service     | Hostname                 | Port | Traefik Router Name | Network        | Notes                |
|-------------|--------------------------|------|---------------------|----------------|----------------------|
| example-svc | example.sj98.duckdns.org | 8080 | example-svc-router  | traefik-public | Replace placeholders |
| another-svc | another.sj98.duckdns.org | 8000 | another-svc-router  | traefik-public | Only if web-exposed  |

  • Fill in each service's hostname, port, and network.
  • Internal services do not need Traefik labels.

  7. Common Issues

    • Duplicate Router Names: Make sure every router name is unique across all stacks.
    • Missing Ports: Each routed service must declare its port with a traefik.http.services.<service>.loadbalancer.server.port label.
    • ACME Failures: Ensure the DuckDNS token is correct and the propagation delay is set.
    • Wrong Network: Only services on traefik-public are routable; internal services must use backend networks.
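
Duplicate router names are easier to catch before deployment than in the logs afterwards. The sketch below scans one or more stack files and prints any router name that has more than one rule label. The function name find_duplicate_routers is hypothetical, and it assumes each router is declared with exactly one traefik.http.routers.<name>.rule label.

```shell
# Hypothetical helper: prints router names whose rule label appears more than once.
# Usage: find_duplicate_routers FILE...
find_duplicate_routers() {
    # Each router has one ".rule=" label, so counting those labels counts routers.
    grep -hoE 'traefik\.http\.routers\.[^.]+\.rule=' "$@" \
        | sed -E 's/^traefik\.http\.routers\.//; s/\.rule=$//' \
        | sort | uniq -d
}
```

For example, find_duplicate_routers full-stack-complete.yml monitoring-stack.yml prints nothing when every router name in those two stacks is unique.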