HOMELAB CONFIGURATION SUMMARY — UPDATED 2025-10-31

NETWORK INFRASTRUCTURE

Main Router: TP-Link BE9300 (2.5 Gb WAN + 4× 2.5 Gb LAN)
Secondary Router: Linksys WRT3200ACM (OpenWRT)
Managed Switch: TP-Link TL-SG608E (1 Gb)
Additional: Apple AirPort Time Capsule (192.168.1.153)
Backbone Speed: 2.5 Gb core / 1 Gb secondary
DNS Architecture: 3× Pi-hole + 3× Unbound (192.168.1.196, .245, .62) with local recursive forwarding
VPN: Tailscale (Pi 4 as exit node)
Reverse Proxy: Traefik (on .196; planned Swarm takeover)
LAN Subnet: 192.168.1.0/24
Notes: Rate-limit prevention on Pi-hole instances, Unbound local caching to accelerate DNS queries
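One way to confirm that all three Pi-hole/Unbound resolvers are answering is a small probe run from any LAN host. The sketch below is not part of the recorded setup: it assumes Python 3, uses example.com as an arbitrary test name, and the function names are illustrative only.

```python
import socket
import struct

# Resolver addresses as listed in the DNS Architecture line.
RESOLVERS = ["192.168.1.196", "192.168.1.245", "192.168.1.62"]


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record (class IN)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.strip(".").split(".")
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels) + b"\x00"
    return header + qname + struct.pack(">HH", 1, 1)  # QTYPE=A, QCLASS=IN


def resolver_answers(server, name="example.com", timeout=2.0):
    """Return True if the server replies to the query with RCODE 0."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    if len(reply) < 4 or reply[:2] != query[:2]:
        return False
    is_response = bool(reply[2] & 0x80)  # QR bit
    rcode = reply[3] & 0x0F
    return is_response and rcode == 0


def check_all(resolvers=RESOLVERS):
    """Map each resolver address to whether it answered."""
    return {server: resolver_answers(server) for server in resolvers}
```

Calling check_all() returns a dictionary keyed by resolver address; any False entry points at an instance that is down or refusing queries.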


NODE OVERVIEW

192.168.1.81 — Ryzen 3700X Node
• CPU: AMD 8C/16T
• RAM: 6480 GB current, 2 of 4 3200; 32 GB 4×8 GB 3600 available
• GPU: RTX 4060 Ti
• Network: 2.5 GbE onboard
• Role: Docker Swarm Worker (label=heavy)
• Function: AI compute (LM Studio, Llama.cpp, OpenWebUI, Ollama planned)
• OS: Windows 11 + WSL2 / Fedora (Dual Boot)
• Notes: Primary compute node for high-performance AI workloads. Both OS installations act as interchangeable swarm nodes with the same label.

192.168.1.57 — Acer Aspire R14 (Proxmox Host)
• CPU: Intel i5-6200U (2C/4T)


NETWORK UPGRADE & VLAN

  • Switch: Install a 2.5 Gb PoE managed switch (e.g., Netgear GS110EMX).
  • VLANs: Create VLAN10 for management and VLAN20 for services. Add router ACLs to isolate traffic.
  • LACP: Bond two NICs on the Ryzen node for a 5 Gb aggregate link; a single connection is still limited to the speed of one member link.

STORAGE ENHANCEMENTS

  • Deploy a dedicated NAS (e.g., Synology DS920+) with RAID6 and SSD cache.
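  • On Proxmox host, create ZFS pool tank on local SSDs (zpool create tank /dev/sda /dev/sdb). As written, this stripes the two disks with no redundancy; add the mirror keyword before the devices for a mirrored pool.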
  • Mount NAS shares on all nodes (/mnt/nas).
  • Add cron job to prune unused AI model caches.
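The cache-pruning cron job could call a script along these lines. This is a minimal sketch under stated assumptions: "unused" is taken to mean "not accessed for 30 days", the cache directory is passed in by the caller, and the function names are hypothetical. It relies on file access times, so it will not behave as intended on volumes mounted with noatime.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def stale_files(cache_dir, max_age_days=30, now=None):
    """Return files under cache_dir last accessed more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        p for p in Path(cache_dir).rglob("*")
        if p.is_file() and p.stat().st_atime < cutoff
    )


def prune(cache_dir, max_age_days=30, dry_run=True):
    """Delete stale files (or only list them when dry_run is True)."""
    selected = stale_files(cache_dir, max_age_days)
    if not dry_run:
        for path in selected:
            path.unlink()
    return selected
```

Running with dry_run=True first shows what would be deleted before the job is trusted with real model files.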

SERVICE CONSOLIDATION & RESILIENCE

  • Convert standalone Traefik on Pi 4 to a Docker Swarm service with 2 replicas.
  • Deploy fallback Caddy on Pi Zero with a static maintenance page.
  • Add healthcheck sidecars to critical containers (Portainer, OpenWebUI).
  • Separate persistent volumes per stack (AI models on SSD, Nextcloud on NAS).

SECURITY HARDENING

  • Enable router firewall ACLs for inter-VLAN traffic (allow only required ports).
  • Install fail2ban on the manager VM.
  • Restrict Portainer UI to VPN-only access and enable 2FA/OAuth.

MONITORING & AUTOMATION

  • Deploy node-exporter on Proxmox host.
  • Create Grafana alerts for CPU >80%, RAM >85%, disk >80%.
  • Add Home Assistant backup automation to NAS.
  • Integrate Tailscale metrics via tailscale_exporter.
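The alert thresholds above can be written down as query expressions. The sketch below assumes the metrics come from node-exporter with its default metric names and that the alerts are defined as PromQL conditions; the 5m rate window and the helper names are illustrative choices, not part of the recorded plan.

```python
# Thresholds in percent, as stated in the list above.
THRESHOLDS = {"cpu": 80, "ram": 85, "disk": 80}

# Usage-percent expressions built on default node-exporter metrics (assumed).
USAGE_EXPR = {
    "cpu": '100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100',
    "ram": "(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100",
    "disk": "(1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100",
}


def alert_expressions(thresholds=THRESHOLDS):
    """Return one PromQL alert condition per metric."""
    return {name: f"({USAGE_EXPR[name]}) > {limit}" for name, limit in thresholds.items()}


def breached(usage, thresholds=THRESHOLDS):
    """Given usage percentages, return the metrics strictly above their limit."""
    return sorted(n for n, limit in thresholds.items() if usage.get(n, 0) > limit)
```

The comparison is strict, so RAM at exactly 85% does not fire; that matches the ">" wording of the thresholds.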

OFFSITE BACKUP STRATEGY

  • Install restic on manager VM and initialise Backblaze B2 repo.
  • Daily backup script (/usr/local/bin/backup_daily.sh) for HA config, Portainer DB, important volumes.
  • Systemd timer to run the script daily at 02:00.
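The daily backup step might look like the following. The plan names a shell script; this sketch uses Python instead, purely for illustration. The repository string, the backed-up paths, and the retention counts (7 daily, 4 weekly) are assumptions that do not come from this document. restic reads B2 credentials and the repository password from the environment (B2_ACCOUNT_ID, B2_ACCOUNT_KEY, RESTIC_PASSWORD_FILE).

```python
import subprocess


def backup_command(repo, paths, tag="daily"):
    """Build the restic invocation that snapshots the given paths."""
    return ["restic", "-r", repo, "backup", "--tag", tag, *paths]


def forget_command(repo, keep_daily=7, keep_weekly=4):
    """Build the restic invocation that applies retention and prunes old data."""
    return [
        "restic", "-r", repo, "forget",
        "--keep-daily", str(keep_daily),
        "--keep-weekly", str(keep_weekly),
        "--prune",
    ]


def run_daily(repo, paths):
    """Run backup, then retention; raises CalledProcessError on failure."""
    subprocess.run(backup_command(repo, paths), check=True)
    subprocess.run(forget_command(repo), check=True)
```

A call such as run_daily("b2:<bucket>:<path>", [...]) would be made with the real bucket and the HA config, Portainer DB, and volume paths filled in.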

192.168.1.57 — Acer Aspire R14 (Proxmox Host), continued
• RAM: 8 GB
• Network: 2.5 GbE via USB adapter
• Role: Proxmox Host
• Function: Virtualization host for Apps VM (.196) and OMV (.70)
• Storage: Local SSDs + OMV shared volumes
• Notes: Lightweight node for VMs and containerized storage services

192.168.1.196 — Apps Manager VM (on Acer Proxmox)
• CPU: 4
• RAM: 4 GB min / 6 GB max
• Role: Docker Swarm Manager (label=manager)
• Function: Pi-hole + Unbound + Portainer UI + Traefik reverse proxy
• Architecture: x86 (virtualized)
• Notes: Central orchestration, DNS control, and reverse proxy; Portainer agent installed for remote swarm management

192.168.1.70 — OMV Instance (on Acer)
• CPU: 2
• RAM: 2 GB min / 4 GB max
• Role: Network Attached Storage
• Function: Shared Docker volumes, media, VM backups
• Stack: OpenMediaVault 7.x
• Architecture: x86
• Planned: Receive SMB3 reshares from Time Capsule (.153)
• Storage: Docker volumes for AI models, backup directories, and media
• Notes: Central NAS for swarm and LLM storage

192.168.1.245 — Raspberry Pi 4 (8 GB)
• CPU: ARM Quad-Core
• RAM: 8 GB
• Network: 1 GbE
• Role: Docker Swarm Leader (label=leader)
• Function: Home Assistant OS + Portainer Agent + HAOS-based Unbound (via Ubuntu container)
• Standalone Services: Traefik (currently standalone), HAOS Unbound
• Notes: Central smart home automation hub; swarm leader for container orchestration; plan for Swarm Traefik to take over existing Traefik instance

192.168.1.62 — Raspberry Pi Zero 2 W
• CPU: ARM Quad-Core
• RAM: 512 MB
• Network: 100 Mb Ethernet
• Role: Docker Swarm Worker (label=light)
• Function: Lightweight DNS + Pi-hole + Unbound + auxiliary containers
• Notes: Low-power node for background jobs, DNS redundancy, and monitoring tasks

192.168.1.153 — Apple AirPort Time Capsule
• Network: 1 GbE via WRT3200ACM
• Role: Backup storage and SMB bridge
• Function: Time Machine backups (SMB1)
• Planned: Reshare SMB1 → SMB3 via OMV (.70) for modern clients
• Notes: Source for macOS backups; will integrate into OMV NAS for consolidation


DOCKER SWARM CLUSTER

Leader 192.168.1.245 (Pi 4, label=leader)
Manager 192.168.1.196 (Apps VM, label=manager)
Worker (Fedora) 192.168.1.81 (Ryzen, label=heavy)
Worker (Light) 192.168.1.62 (Pi Zero 2 W, label=light)

Cluster Functions:
• Distributed container orchestration across x86 + ARM
• High-availability DNS via Pi-hole + Unbound replicas
• Unified management and reverse proxy on the manager node
• Specific workload placement using node labels (heavy, light, leader, manager)
• AI/ML workloads pinned to the 'heavy' node for performance
• General application services pinned to the 'leader' node
• Core services like Traefik and Portainer pinned to the 'manager' node


STACKS

Networking Stack

• Traefik: Reverse Proxy
• whoami: Service for testing Traefik

Monitoring Stack

• Prometheus: Metrics collection
• Grafana: Metrics visualization
• Alertmanager: Alerting
• Node-exporter: Node metrics exporter
• cAdvisor: Container metrics exporter

Tools Stack

• Portainer: Swarm Management
• Dozzle: Log viewing
• Lazydocker: Terminal UI for Docker
• TSDProxy: Tailscale Docker Proxy
• Watchtower: Container Updates

Application Stack

• OpenWebUI: AI Frontend
• Paperless-ngx: Document Management
• Stirling-PDF: PDF utility
• SearXNG: Metasearch engine

Productivity Stack

Nextcloud: Cloud storage and collaboration


SERVICES MAP

Manager Node (.196):
• Networking Stack: Traefik
• Monitoring Stack: Prometheus, Grafana
• Tools Stack: Portainer, Dozzle, Lazydocker, TSDProxy, Watchtower

Leader Node (.245):
• Application Stack: Paperless-ngx, Stirling-PDF, SearXNG
• Productivity Stack: Nextcloud

Heavy Worker Node (.81):
• Application Stack: OpenWebUI

Light Worker Node (.62):
• Networking Stack: whoami

Other Services:
• VPN: Tailscale (Pi 4 exit node)
• Virtualization: Proxmox VE (.57)
• Storage: OMV NAS (.70) + Time Capsule (.153)
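The services map translates fairly directly into Swarm placement constraints. The sketch below covers a subset of the services and makes one loud assumption: this document records only the label values (heavy, light, leader, manager), so the label key "role" used here is hypothetical and should be replaced with whatever key the nodes actually carry.

```python
# Service -> node label value, taken from the services map.
PLACEMENT = {
    "traefik": "manager",
    "prometheus": "manager",
    "grafana": "manager",
    "portainer": "manager",
    "paperless-ngx": "leader",
    "stirling-pdf": "leader",
    "searxng": "leader",
    "nextcloud": "leader",
    "openwebui": "heavy",
    "whoami": "light",
}


def constraint_for(service, label_key="role"):
    """Return a Swarm placement constraint string (label_key is an assumption)."""
    return f"node.labels.{label_key}=={PLACEMENT[service]}"


def services_on(label):
    """List the services pinned to nodes carrying the given label value."""
    return sorted(s for s, value in PLACEMENT.items() if value == label)
```

The resulting string is what would be passed to docker service create --constraint, or listed under deploy.placement.constraints in a stack file.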


STORAGE & BACKUPS

OMV (.70) — shared Docker volumes, LLM models, media, backup directories
Time Capsule (.153) — legacy SMB1 source; planned SMB3 reshare via OMV
External SSDs/HDDs — portable compute, LLM scratch storage, media archives
Time Machine clients — macOS systems
Planned Workflow:
• Mount Time Capsule SMB1 share in OMV via CIFS
• Reshare through OMV Samba as SMB3
• Sync critical backups to OMV and external drives
• AI models stored on NVMe + OMV volumes for high-speed access
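The first step of the planned workflow, mounting the SMB1 share over CIFS, could be assembled as follows. This is only a sketch: the share name, mount point, and credentials file are placeholders supplied by the caller, since none of them are recorded here. The vers=1.0 option is what forces the legacy SMB1 dialect.

```python
def cifs_mount_command(host, share, mount_point, credentials_file):
    """Build a mount invocation for an SMB1 share using the cifs filesystem."""
    source = f"//{host}/{share}"
    options = f"vers=1.0,credentials={credentials_file}"
    return ["mount", "-t", "cifs", source, mount_point, "-o", options]
```

The returned list can be passed to subprocess.run on the OMV host (as root); the Samba reshare as SMB3 is then configured on top of the mount point.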


PERFORMANCE STRATEGY

• 2.5 Gb backbone: Ryzen (.81) + Acer (.57) nodes
• 1 Gb nodes: Pi 4 (.245) + Time Capsule (.153)
• 100 Mb node: Pi Zero 2 W (.62)
• ARM nodes for low-power/auxiliary tasks
• x86 nodes for AI, storage, and compute-intensive containers
• Swarm resource labeling for workload isolation
• DNS redundancy and rate-limit protection
• Unified monitoring via Portainer + Home Assistant
• GPU-intensive AI containers pinned to Ryzen node for efficiency
• Traefik migration plan: standalone .245 → Swarm-managed cluster routing
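To make the link-speed tiers concrete, the ideal transfer time between two nodes can be estimated from the slower of the two links. The sketch below uses the speeds listed above and ignores protocol overhead and disk limits, so real transfers will be slower; the function names are illustrative.

```python
# Link speed per node in megabits per second, from the list above.
LINK_MBPS = {".81": 2500, ".57": 2500, ".245": 1000, ".153": 1000, ".62": 100}


def path_mbps(src, dst):
    """A transfer between two nodes is capped by the slower link."""
    return min(LINK_MBPS[src], LINK_MBPS[dst])


def transfer_seconds(size_gb, link_mbps):
    """Ideal seconds to move size_gb gigabytes (10^9 bytes) at link_mbps."""
    return size_gb * 8000 / link_mbps
```

For a hypothetical 10 GB model file this gives 32 s between the 2.5 Gb nodes, 80 s to a 1 Gb node, and 800 s to the Pi Zero 2 W, which is why large models are kept on the Ryzen and Acer side of the backbone.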


NOTES

• Acer Proxmox hosts OMV (.70) and Apps Manager VM (.196)
• Ryzen (.81) dedicated to AI and heavy Docker tasks
• HAOS Pi 4 (.245) leader, automation hub, and temporary standalone Traefik
• DNS load-balanced among .62, .196, and .245
• Time Capsule (.153) planned SMB1→SMB3 reshare via OMV
• Network speed distribution: Ryzen/Acer = 2.5 Gb, Pi 4/Time Capsule = 1 Gb, Pi Zero 2 W = 100 Mb
• LLM models stored on high-speed NVMe on Ryzen, backed up to OMV and external drives
• No personal identifiers included in this record

END CONFIG


SMART HOME INTEGRATION

LIGHTING & CONTROLS

• Philips Hue

  • Devices: Hue remote only (no bulbs)
  • Connectivity: Zigbee
  • Automation: Integrated into Home Assistant OS (.245)
  • Notes: Remote used to trigger HAOS scenes and routines for other smart devices

• Govee Smart Lights & Sensors

  • Devices: RGB LED strips, motion sensors, temperature/humidity sensors
  • Connectivity: Wi-Fi
  • Automation: Home Assistant via MQTT / cloud integration
  • Notes: Motion-triggered lighting and environmental monitoring

• TP-Link / Tapo Smart Devices

  • Devices: Tapo lightbulbs, Kasa smart power strip
  • Connectivity: Wi-Fi
  • Automation: Home Assistant + Kasa/Tapo integration
  • Notes: Power scheduling and energy monitoring

AUDIO & VIDEO

• TVs: Multiple 4K Smart TVs

  • Platforms: Fire Stick, Apple devices, console inputs
  • Connectivity: Ethernet (1 Gb) or Wi-Fi
  • Automation: HAOS scenes, volume control, source switching

• Streaming & Consoles:

  • Devices: Fire Stick, PS5, Nintendo Switch
  • Connectivity: Ethernet or Wi-Fi
  • Notes: Automated on/off with Home Assistant, media triggers

SECURITY & SENSORS

• Vivint Security System

  • Devices: Motion detectors, door/window sensors, cameras
  • Connectivity: Proprietary protocol + cloud
  • Automation: Home Assistant integrations for alerts and scene triggers

• Environmental Sensors

  • Devices: Govee temperature/humidity, Tapo sensors
  • Connectivity: Wi-Fi
  • Automation: Trigger HVAC, lights, or notifications