# HOMELAB CONFIGURATION SUMMARY — UPDATED 2025-10-31

## NETWORK INFRASTRUCTURE
Main Router: TP-Link BE9300 (2.5 Gb WAN + 4× 2.5 Gb LAN)
Secondary Router: Linksys WRT3200ACM (OpenWrt)
Managed Switch: TP-Link TL-SG608E (1 Gb)
Additional: Apple AirPort Time Capsule (192.168.1.153)
Backbone Speed: 2.5 Gb core / 1 Gb secondary
DNS Architecture: 3× Pi-hole + 3× Unbound (192.168.1.196, .245, .62); each Pi-hole forwards to a local Unbound recursive resolver
VPN: Tailscale (Pi 4 as exit node)
Reverse Proxy: Traefik (on .196; planned Swarm takeover)
LAN Subnet: 192.168.1.0/24
Notes: Rate-limit protection on the Pi-hole instances; Unbound caches locally to speed up DNS queries
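
The three resolvers listed under DNS Architecture can be spot-checked by sending each one a plain DNS query. The following is a minimal sketch using only the Python standard library. The probe name `example.com`, the 2-second timeout, and the helper names `build_query` and `resolver_answers` are assumptions, not part of the existing setup. Calling `resolver_answers(ip)` for each entry in `RESOLVERS` returns `True` when that resolver replies without an error code.

```python
import socket
import struct

# Resolver addresses taken from the DNS Architecture line above.
RESOLVERS = ["192.168.1.196", "192.168.1.245", "192.168.1.62"]

QUERY_ID = 0x1234  # arbitrary ID, used to match the reply to the query


def build_query(name: str, query_id: int = QUERY_ID) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def resolver_answers(server: str, name: str = "example.com",
                     timeout: float = 2.0) -> bool:
    """Return True if the server replies with a matching ID and RCODE 0."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    if len(reply) < 12:
        return False
    reply_id, flags = struct.unpack("!HH", reply[:4])
    is_response = bool(flags & 0x8000)  # QR bit set
    rcode = flags & 0x000F              # 0 means "no error"
    return reply_id == QUERY_ID and is_response and rcode == 0
```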

---

## NODE OVERVIEW

192.168.1.81 — Ryzen 3700X Node
• CPU: AMD 8C/16T
• RAM: 64–80 GB (currently 2× 32 GB 3200 MT/s in 2 of 4 slots; 4× 8 GB 3600 MT/s available)
• GPU: RTX 4060 Ti
• Network: 2.5 GbE onboard
• Role: Docker Swarm Worker (label=heavy)
• Function: AI compute (LM Studio, llama.cpp, OpenWebUI, Ollama planned)
• OS: Windows 11 + WSL2 / Fedora (dual boot)
• Notes: Primary compute node for high-performance AI workloads. Both OS installations act as interchangeable swarm nodes with the same label.

192.168.1.57 — Acer Aspire R14 (Proxmox Host)
• CPU: Intel i5-6200U (2C/4T)

---

## NETWORK UPGRADE & VLAN
* **Switch**: Install a 2.5 Gb PoE managed switch (candidate: Netgear GS110EMX; confirm PoE support and 2.5 Gb port count before purchase).
* **VLANs**: Create VLAN 10 for management and VLAN 20 for services. Add router ACLs to isolate traffic between them.
* **LACP**: Bond two NICs on the Ryzen node for up to 5 Gb of aggregate bandwidth (a single flow still tops out at one 2.5 Gb link).
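
The VLAN split and the "allow only required ports" rule can be reasoned about before touching the router. The sketch below models the plan with Python's `ipaddress` module. The subnets assigned to VLAN 10 and VLAN 20 and the entries in `ALLOWED` are assumptions for illustration, since the summary names only the VLAN IDs.

```python
import ipaddress
from typing import Optional

# Assumed addressing: the plan above gives VLAN IDs but no subnets.
VLANS = {
    10: ipaddress.ip_network("192.168.10.0/24"),  # management
    20: ipaddress.ip_network("192.168.20.0/24"),  # services
}

# Hypothetical allow-list of (source VLAN, destination VLAN, TCP port).
ALLOWED = {
    (10, 20, 443),  # management clients may reach HTTPS services
    (10, 20, 22),   # and SSH
}


def vlan_of(address: str) -> Optional[int]:
    """Return the VLAN ID whose subnet contains the address, if any."""
    ip = ipaddress.ip_address(address)
    for vlan_id, network in VLANS.items():
        if ip in network:
            return vlan_id
    return None


def is_allowed(src: str, dst: str, port: int) -> bool:
    """Decide whether traffic from src to dst on a TCP port passes the ACL."""
    src_vlan, dst_vlan = vlan_of(src), vlan_of(dst)
    if src_vlan is None or dst_vlan is None:
        return False  # unknown subnet: deny by default
    if src_vlan == dst_vlan:
        return True   # same VLAN: switched locally, never hits the router ACL
    return (src_vlan, dst_vlan, port) in ALLOWED
```

With these assumed rules, `is_allowed("192.168.10.5", "192.168.20.9", 443)` is `True`, while the reverse direction is denied.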

## STORAGE ENHANCEMENTS
* Deploy a dedicated NAS (e.g., Synology DS920+) with RAID-6 and SSD cache.
* On the Proxmox host, create a ZFS pool `tank` on the local SSDs. Note that `zpool create tank /dev/sda /dev/sdb` stripes the two disks with no redundancy; `zpool create tank mirror /dev/sda /dev/sdb` creates a mirror instead.
* Mount NAS shares on all nodes (`/mnt/nas`).
* Add a cron job to prune unused AI model caches.
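
The cache-pruning job in the last item could look roughly like the sketch below. It treats "unused" as "not modified within a given number of days", which is an assumption; the function name `prune_cache`, the cache location, and the 30-day cutoff are likewise illustrative. It defaults to a dry run so nothing is deleted until the result has been reviewed.

```python
import time
from pathlib import Path


def prune_cache(cache_dir: Path, max_age_days: float,
                dry_run: bool = True) -> list[Path]:
    """Return (and optionally delete) files older than max_age_days.

    Age is measured by modification time, used here as a proxy for "unused".
    """
    cutoff = time.time() - max_age_days * 86400
    stale = [
        path for path in cache_dir.rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    ]
    if not dry_run:
        for path in stale:
            path.unlink()
    return sorted(stale)
```

A crontab entry that calls a small wrapper around `prune_cache(Path("/mnt/nas/models/cache"), 30, dry_run=False)` once a week would cover the plan; that path is hypothetical.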

## SERVICE CONSOLIDATION & RESILIENCE
* Convert the standalone Traefik on the Pi 4 to a Docker Swarm service with 2 replicas.
* Deploy a fallback Caddy on the Pi Zero with a static maintenance page.
* Add health-check sidecars to critical containers (Portainer, OpenWebUI).
* Separate persistent volumes per stack (AI models on SSD, Nextcloud on NAS).

## SECURITY HARDENING
* Enable router firewall ACLs for inter-VLAN traffic (allow only required ports).
* Install `fail2ban` on the manager VM.
* Restrict the Portainer UI to VPN-only access and enable 2FA/OAuth.

## MONITORING & AUTOMATION
* Deploy `node-exporter` on the Proxmox host.
* Create Grafana alerts for CPU > 80 %, RAM > 85 %, disk > 80 %.
* Automate Home Assistant backups to the NAS.
* Integrate Tailscale metrics via `tailscale_exporter`.
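
The three alert thresholds above can be written as PromQL conditions over standard `node-exporter` metrics. The sketch below keeps the expressions and thresholds in one place and renders each alert condition as a string. The 5-minute rate window and the `/` mountpoint are assumptions, and this is not Grafana's own rule format, only the query each rule would evaluate.

```python
# PromQL expressions over node-exporter metrics; each yields a percentage.
USAGE_EXPR = {
    "cpu": '100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100',
    "ram": "(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100",
    "disk": '(1 - node_filesystem_avail_bytes{mountpoint="/"}'
            ' / node_filesystem_size_bytes{mountpoint="/"}) * 100',
}

# Thresholds in percent, taken from the alert list above.
THRESHOLDS = {"cpu": 80, "ram": 85, "disk": 80}


def alert_expr(metric: str) -> str:
    """Render the PromQL condition that fires when usage exceeds its threshold."""
    return f"({USAGE_EXPR[metric]}) > {THRESHOLDS[metric]}"
```

For example, `alert_expr("ram")` yields the memory expression wrapped in parentheses and followed by `> 85`, ready to paste into an alert rule query.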

## OFF-SITE BACKUP STRATEGY
* Install `restic` on the manager VM and initialise a Backblaze B2 repository.
* Create a daily backup script (`/usr/local/bin/backup_daily.sh`) covering the HA config, Portainer DB, and important volumes.
* Add a systemd timer to run the script daily at 02:00.
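
The steps a daily backup script would run can be sketched as follows, written in Python for illustration rather than as the contents of `backup_daily.sh`. The bucket name, source paths, and retention policy are hypothetical. restic reads the B2 credentials from `B2_ACCOUNT_ID` and `B2_ACCOUNT_KEY` and the repository password from `RESTIC_PASSWORD` or `RESTIC_PASSWORD_FILE`, so those must be set in the environment before `run_backup()` is called. `TIMER_UNIT` shows what a matching systemd timer for the 02:00 schedule might contain.

```python
import subprocess

# Hypothetical repository and source paths; adjust to the real layout.
REPO = "b2:homelab-backups:manager"
PATHS = ("/srv/homeassistant/config", "/srv/portainer/data")


def restic_commands(repo: str = REPO,
                    paths: tuple[str, ...] = PATHS) -> list[list[str]]:
    """Return the restic invocations for one backup run, in order."""
    return [
        # Take a snapshot of the listed paths.
        ["restic", "-r", repo, "backup", *paths],
        # Apply an assumed retention policy and reclaim space.
        ["restic", "-r", repo, "forget",
         "--keep-daily", "7", "--keep-weekly", "4", "--prune"],
    ]


def run_backup() -> None:
    """Run each command, stopping at the first failure."""
    for argv in restic_commands():
        subprocess.run(argv, check=True)


# Sketch of a timer unit; it activates the service unit with the same name.
TIMER_UNIT = """\
[Unit]
Description=Daily restic backup (hypothetical unit)

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
"""
```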

---

192.168.1.57 — Acer Aspire R14 (Proxmox Host), continued
• RAM: 8 GB
• Network: 2.5 GbE via USB adapter
• Role: Proxmox Host
• Function: Virtualization host for Apps VM (.196) and OMV (.70)
• Storage: Local SSDs + OMV shared volumes
• Notes: Lightweight node for VMs and containerized storage services

192.168.1.196 — Apps Manager VM (on Acer Proxmox)
• CPU: 4 vCPUs
• RAM: 4 GB min / 6 GB max
• Role: Docker Swarm Manager (label=manager)
• Function: Pi-hole + Unbound + Portainer UI + Traefik reverse proxy
• Architecture: x86 (virtualized)
• Notes: Central orchestration, DNS control, and reverse proxy; Portainer agent installed for remote swarm management

192.168.1.70 — OMV Instance (on Acer)
• CPU: 2 vCPUs
• RAM: 2 GB min / 4 GB max
• Role: Network Attached Storage
• Function: Shared Docker volumes, media, VM backups
• Stack: OpenMediaVault 7.x
• Architecture: x86
• Planned: Receive SMB3 reshares from Time Capsule (.153)
• Storage: Docker volumes for AI models, backup directories, and media
• Notes: Central NAS for swarm and LLM storage

192.168.1.245 — Raspberry Pi 4 (8 GB)
• CPU: ARM Quad-Core
• RAM: 8 GB
• Network: 1 GbE
• Role: Docker Swarm Leader (label=leader)
• Function: Home Assistant OS + Portainer Agent + HAOS-based Unbound (via Ubuntu container)
• Standalone Services: Traefik (currently standalone), HAOS Unbound
• Notes: Central smart home automation hub; swarm leader for container orchestration; plan for Swarm Traefik to take over existing Traefik instance

192.168.1.62 — Raspberry Pi Zero 2 W
• CPU: ARM Quad-Core
• RAM: 512 MB
• Network: 100 Mb Ethernet
• Role: Docker Swarm Worker (label=light)
• Function: Lightweight DNS + Pi-hole + Unbound + auxiliary containers
• Notes: Low-power node for background jobs, DNS redundancy, and monitoring tasks

192.168.1.153 — Apple AirPort Time Capsule
• Network: 1 GbE via WRT3200ACM
• Role: Backup storage and SMB bridge
• Function: Time Machine backups (SMB1)
• Planned: Reshare SMB1 → SMB3 via OMV (.70) for modern clients
• Notes: Source for macOS backups; will integrate into OMV NAS for consolidation

---

## DOCKER SWARM CLUSTER
Leader: 192.168.1.245 (Pi 4, label=leader)
Manager: 192.168.1.196 (Apps VM, label=manager)
Worker (Fedora): 192.168.1.81 (Ryzen, label=heavy)
Worker (Light): 192.168.1.62 (Pi Zero 2 W, label=light)

Cluster Functions:
• Distributed container orchestration across x86 + ARM
• High-availability DNS via Pi-hole + Unbound replicas
• Unified management and reverse proxy on the manager node
• Specific workload placement using node labels (heavy, light, leader, manager)
• AI/ML workloads pinned to the 'heavy' node for performance
• General application services pinned to the 'leader' node
• Core services like Traefik and Portainer pinned to the 'manager' node
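
Label-based pinning as described above maps onto Docker's `--constraint` flag for `docker service create`. The sketch below builds the argument list for a pinned service. It assumes the node labels were added with the literal key `label` (for example `docker node update --label-add label=heavy <node>`), which is one reading of the `label=heavy` notation in this summary. The service-to-label mapping follows the Cluster Functions list, and image names are left to the caller.

```python
# Service-to-label mapping following the Cluster Functions list above.
PLACEMENT = {
    "openwebui": "heavy",       # AI/ML workloads
    "nextcloud": "leader",      # general application services
    "paperless-ngx": "leader",
    "traefik": "manager",       # core services
    "portainer": "manager",
}


def service_create_args(service: str, image: str) -> list[str]:
    """Build a `docker service create` command pinned by node label.

    Assumes node labels use the key `label`, e.g. label=heavy.
    """
    label = PLACEMENT[service]
    return [
        "docker", "service", "create",
        "--name", service,
        "--constraint", f"node.labels.label=={label}",
        image,
    ]
```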

---

## STACKS

### Networking Stack
• **Traefik:** Reverse Proxy
• **whoami:** Service for testing Traefik

### Monitoring Stack
• **Prometheus:** Metrics collection
• **Grafana:** Metrics visualization
• **Alertmanager:** Alerting
• **Node-exporter:** Node metrics exporter
• **cAdvisor:** Container metrics exporter

### Tools Stack
• **Portainer:** Swarm Management
• **Dozzle:** Log viewing
• **Lazydocker:** Terminal UI for Docker
• **TSDProxy:** Tailscale Docker Proxy
• **Watchtower:** Container Updates

### Application Stack
• **OpenWebUI:** AI Frontend
• **Paperless-ngx:** Document Management
• **Stirling-PDF:** PDF utility
• **SearXNG:** Metasearch engine

### Productivity Stack
• **Nextcloud:** Cloud storage and collaboration

---

## SERVICES MAP
• **Manager Node (.196):**
- **Networking Stack:** Traefik
- **Monitoring Stack:** Prometheus, Grafana
- **Tools Stack:** Portainer, Dozzle, Lazydocker, TSDProxy, Watchtower

• **Leader Node (.245):**
- **Application Stack:** Paperless-ngx, Stirling-PDF, SearXNG
- **Productivity Stack:** Nextcloud

• **Heavy Worker Node (.81):**
- **Application Stack:** OpenWebUI

• **Light Worker Node (.62):**
- **Networking Stack:** whoami

• **Other Services:**
- **VPN:** Tailscale (Pi 4 exit node)
- **Virtualization:** Proxmox VE (.57)
- **Storage:** OMV NAS (.70) + Time Capsule (.153)

---

## STORAGE & BACKUPS
OMV (.70) — shared Docker volumes, LLM models, media, backup directories
Time Capsule (.153) — legacy SMB1 source; planned SMB3 reshare via OMV
External SSDs/HDDs — portable compute, LLM scratch storage, media archives
Time Machine clients — macOS systems

Planned Workflow:
• Mount the Time Capsule SMB1 share in OMV via CIFS
• Reshare it through OMV Samba as SMB3
• Sync critical backups to OMV and external drives
• Store AI models on NVMe + OMV volumes for high-speed access
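
The first workflow step, mounting the legacy SMB1 share on OMV, comes down to one `/etc/fstab` entry with the CIFS protocol version forced to 1.0. The sketch below renders such a line. The share name, mount point, and credentials file path in the example call are hypothetical. On the reshare side, Samba's `server min protocol = SMB3` setting is one way to keep modern clients off older dialects.

```python
def cifs_fstab_line(host: str, share: str, mountpoint: str,
                    credentials: str, smb_version: str = "1.0") -> str:
    """Render an /etc/fstab entry for a CIFS mount pinned to one SMB version."""
    options = ",".join([
        f"vers={smb_version}",         # the Time Capsule only speaks SMB1
        f"credentials={credentials}",  # file holding username= and password=
        "_netdev",                     # wait for the network before mounting
        "nofail",                      # do not block boot if the share is down
    ])
    return f"//{host}/{share} {mountpoint} cifs {options} 0 0"


# Example with hypothetical share name, mount point, and credentials path.
EXAMPLE_LINE = cifs_fstab_line(
    "192.168.1.153", "backups", "/mnt/timecapsule", "/etc/samba/tc.cred"
)
```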

---

## PERFORMANCE STRATEGY
• 2.5 Gb backbone: Ryzen (.81) + Acer (.57) nodes
• 1 Gb nodes: Pi 4 (.245) + Time Capsule (.153)
• 100 Mb node: Pi Zero 2 W (.62)
• ARM nodes for low-power/auxiliary tasks
• x86 nodes for AI, storage, and compute-intensive containers
• Swarm resource labeling for workload isolation
• DNS redundancy and rate-limit protection
• Unified monitoring via Portainer + Home Assistant
• GPU-intensive AI containers pinned to Ryzen node for efficiency
• Traefik migration plan: standalone .245 → Swarm-managed cluster routing
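
The practical effect of the three link tiers above is easiest to see as transfer time for a large file such as an LLM model. The sketch below computes the best-case time from the nominal link speeds listed in this section. It uses decimal gigabytes and ignores protocol overhead unless an `efficiency` factor below 1.0 is passed; both the function name and that factor are illustrative assumptions. For a 10 GB file, the nominal figures work out to 32 s at 2.5 Gb, 80 s at 1 Gb, and 800 s at 100 Mb.

```python
# Nominal link speeds in megabits per second, from the list above.
LINK_MBPS = {
    "Ryzen (.81)": 2500,
    "Acer (.57)": 2500,
    "Pi 4 (.245)": 1000,
    "Time Capsule (.153)": 1000,
    "Pi Zero 2 W (.62)": 100,
}


def transfer_seconds(size_gb: float, link_mbps: float,
                     efficiency: float = 1.0) -> float:
    """Best-case seconds to move size_gb (decimal GB) over a link.

    efficiency scales the nominal rate to account for overhead (1.0 = none).
    """
    bits = size_gb * 8 * 1000**3
    return bits / (link_mbps * 1_000_000 * efficiency)
```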

---

## NOTES
• Acer Proxmox hosts OMV (.70) and Apps Manager VM (.196)
• Ryzen (.81) dedicated to AI and heavy Docker tasks
• HAOS Pi 4 (.245) is the swarm leader, automation hub, and temporary standalone Traefik host
• DNS load-balanced across .62, .196, and .245
• Time Capsule (.153) planned SMB1 → SMB3 reshare via OMV
• Network speed distribution: Ryzen/Acer = 2.5 Gb, Pi 4/Time Capsule = 1 Gb, Pi Zero 2 W = 100 Mb
• LLM models stored on high-speed NVMe on Ryzen, backed up to OMV and external drives
• No personal identifiers included in this record

# END CONFIG

---

## SMART HOME INTEGRATION

### LIGHTING & CONTROLS
• Philips Hue
- Devices: Hue remote only (no bulbs)
- Connectivity: Zigbee
- Automation: Integrated into Home Assistant OS (.245)
- Notes: Remote used to trigger HAOS scenes and routines for other smart devices

• Govee Smart Lights & Sensors
- Devices: RGB LED strips, motion sensors, temperature/humidity sensors
- Connectivity: Wi-Fi
- Automation: Home Assistant via MQTT / cloud integration
- Notes: Motion-triggered lighting and environmental monitoring

• TP-Link / Tapo Smart Devices
- Devices: Tapo lightbulbs, Kasa smart power strip
- Connectivity: Wi-Fi
- Automation: Home Assistant + Kasa/Tapo integration
- Notes: Power scheduling and energy monitoring

### AUDIO & VIDEO
• TVs: Multiple 4K Smart TVs
- Platforms: Fire Stick, Apple devices, console inputs
- Connectivity: Ethernet (1 Gb) or Wi-Fi
- Automation: HAOS scenes, volume control, source switching

• Streaming & Consoles:
- Devices: Fire Stick, PS5, Nintendo Switch
- Connectivity: Ethernet or Wi-Fi
- Notes: Automated on/off with Home Assistant, media triggers

### SECURITY & SENSORS
• Vivint Security System
- Devices: Motion detectors, door/window sensors, cameras
- Connectivity: Proprietary protocol + cloud
- Automation: Home Assistant integrations for alerts and scene triggers

• Environmental Sensors
- Devices: Govee temperature/humidity, Tapo sensors
- Connectivity: Wi-Fi
- Automation: Trigger HVAC, lights, or notifications