From cf360234c154a9a03e2fe07e55eb7eaf53b41598 Mon Sep 17 00:00:00 2001
From: Sterlen
Date: Sat, 27 Dec 2025 19:15:05 -0600
Subject: [PATCH] Docs: Add homelab improvement guide and update README

---
 README.md                        |  4 ++
 docs/guides/IMPROVEMENT_GUIDE.md | 79 ++++++++++++++++++++++++++++++++
 2 files changed, 83 insertions(+)
 create mode 100644 docs/guides/IMPROVEMENT_GUIDE.md

diff --git a/README.md b/README.md
index a036838..4af5325 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,10 @@ A complete implementation plan for upgrading a home lab infrastructure with focu
 - Comprehensive monitoring
 - Automated backups
 
+## 💡 Homelab Improvement Guide
+
+For recommendations on how to improve the efficiency, reliability, and security of your homelab, please see the [Homelab Improvement Guide](./docs/guides/IMPROVEMENT_GUIDE.md).
+
 ## 🗂️ Repository Structure
 
 ```
diff --git a/docs/guides/IMPROVEMENT_GUIDE.md b/docs/guides/IMPROVEMENT_GUIDE.md
new file mode 100644
index 0000000..cecb004
--- /dev/null
+++ b/docs/guides/IMPROVEMENT_GUIDE.md
@@ -0,0 +1,79 @@
+# Homelab Improvement Guide
+
+This guide provides recommendations for improving the efficiency, reliability, and security of your homelab.
+
+## 1. High Availability
+
+Your current setup has a single point of failure for several services due to placement constraints tying them to a single node. To improve high availability, we recommend the following:
+
+* **Remove Single-Node Constraints:** In your Docker Swarm service definitions (`applications-stack.yml`, `monitoring-stack.yml`), remove the following placement constraints:
+    * `node.labels.leader == true`
+    * `node.role == manager`
+* **Replicate Services:** Increase the replica count for critical services to at least `2`. This will ensure that the services remain available if a node goes down. For example, in your `applications-stack.yml`:
+
+    ```yaml
+    services:
+      paperless:
+        # ...
+        deploy:
+          replicas: 2
+        # ...
+    ```
+
+* **Stateful Services:** For stateful services like databases, consider the following options:
+    * **Distributed Database:** Use a database designed for high availability, such as Galera Cluster for MySQL or Patroni for PostgreSQL.
+    * **Shared Storage:** Use a shared storage solution like NFS or GlusterFS that is accessible from all nodes in the swarm.
+
+## 2. Hardware Efficiency
+
+* **Resource Limit Tuning:** Your current resource limits are a good starting point, but they can be tuned. Use your monitoring stack (Prometheus and Grafana) to analyze the actual resource usage of your services over time, then adjust the `limits` and `reservations` (under `deploy.resources`) in your `docker-compose.yml` files to match it. This reduces over-provisioning and improves hardware utilization.
+
+* **Node Affinity:** If you have nodes with specific hardware (e.g., GPUs), use node labels and placement constraints to schedule services on the appropriate nodes. For example:
+
+    ```yaml
+    services:
+      jellyfin:
+        # ...
+        deploy:
+          placement:
+            constraints:
+              - node.labels.gpu == true
+    ```
+
+## 3. Security
+
+* **Secret Management:**
+    * **Paperless Secret Key:** The `PAPERLESS_SECRET_KEY` in `applications-stack.yml` should be stored as a Docker secret.
+        1. Create the secret:
+            ```bash
+            openssl rand -hex 32 | docker secret create paperless_secret_key -
+            ```
+        2. Update your `applications-stack.yml` (the secret must also be declared under the top-level `secrets:` key with `external: true`):
+            ```yaml
+            services:
+              paperless:
+                # ...
+                secrets:
+                  - paperless_secret_key
+                environment:
+                  # ...
+                  - PAPERLESS_SECRET_KEY_FILE=/run/secrets/paperless_secret_key
+            ```
+    * **Backup Credentials:** The Backblaze B2 credentials in `backup_daily.sh` should be stored as Docker secrets. You can then mount these secrets into the container that runs the backup script.
+
+* **Network Segmentation:** Docker Swarm has no equivalent of Kubernetes network policies. To restrict traffic between services, attach each group of related services to its own overlay network so that only services sharing a network can reach each other. This adds an extra layer of security to your homelab.
+
+## 4. Quality of Life
+
+* **Automated Backup Verification:** Extend your `backup_daily.sh` script with a step that automatically verifies the integrity of your backups, for example by running `restic check` after each backup and failing the script on a non-zero exit code.
+
+* **Centralized Logging:** For easier log analysis, consider setting up a centralized logging solution like the ELK stack (Elasticsearch, Logstash, Kibana) or Grafana Loki.
+
+* **Documentation:**
+    * **Architecture Diagram:** Create a diagram of your network architecture and service dependencies. This will make it easier to understand and troubleshoot your homelab.
+    * **Update `README.md`:** Add a link to this guide in your main `README.md` file.
+
+## 5. `tsdproxy`
+
+* **Review Configuration:** `tsdproxy` can reportedly be complex to set up in a multi-host Docker Swarm. Review your `tsdproxy` configuration to ensure it is working correctly, and check the `tsdproxy` logs for any errors.
+* **Consult Documentation:** If you encounter issues, consult the official `tsdproxy` documentation and GitHub issues for troubleshooting tips.
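
Note on the "Resource Limit Tuning" item in section 2 of the guide: the step from observed usage to concrete `reservations` and `limits` values can be sketched as below. This is only an illustration under stated assumptions. The function name `suggest_resources`, the use of the 95th percentile for the reservation, and the 20% headroom over peak for the limit are all hypothetical choices, not something the guide or Docker prescribes. The samples are assumed to be memory readings in MiB exported from Prometheus by hand.

```python
import math


def suggest_resources(samples_mib, headroom_pct=20):
    """Suggest a memory reservation and limit (whole MiB) from usage samples.

    Hypothetical heuristic: reservation = nearest-rank 95th percentile,
    limit = observed peak plus `headroom_pct` percent (an integer).
    """
    if not samples_mib:
        raise ValueError("need at least one usage sample")
    # Round each sample up to whole MiB so the arithmetic below stays in integers.
    ordered = sorted(math.ceil(s) for s in samples_mib)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed with integer math.
    rank = (95 * len(ordered) + 99) // 100
    reservation = ordered[rank - 1]
    # Peak usage plus headroom, rounded up to the next whole MiB.
    limit = (ordered[-1] * (100 + headroom_pct) + 99) // 100
    return {"reservation_mib": reservation, "limit_mib": limit}


if __name__ == "__main__":
    # Made-up readings for illustration only.
    week_of_samples = [310, 295, 330, 420, 305, 298, 512]
    print(suggest_resources(week_of_samples))
```

The two returned values would map to `deploy.resources.reservations.memory` and `deploy.resources.limits.memory` for the service in question. A limit set too close to the peak risks the container being killed on a spike, so the headroom is worth revisiting once more data is available.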