Docs: Add homelab improvement guide and update README

2025-12-27 19:15:05 -06:00
parent f0c525d0df
commit cf360234c1
2 changed files with 83 additions and 0 deletions

@@ -0,0 +1,79 @@
# Homelab Improvement Guide
This guide provides recommendations for improving the efficiency, reliability, and security of your homelab.
## 1. High Availability
Your current setup has a single point of failure for several services due to placement constraints tying them to a single node. To improve high availability, we recommend the following:
* **Remove Single-Node Constraints:** In your Docker Swarm service definitions (`applications-stack.yml`, `monitoring-stack.yml`), remove the following placement constraints:
    * `node.labels.leader == true`
    * `node.role == manager`
* **Replicate Services:** Increase the replica count for critical services to at least `2` so that they remain available if a node goes down. Do this only for services that tolerate multiple instances running against the same data. For example, in your `applications-stack.yml`:
    ```yaml
    services:
      paperless:
        # ...
        deploy:
          replicas: 2
        # ...
    ```
* **Stateful Services:** For stateful services like databases, consider the following options:
    * **Distributed Database:** Use a database designed for high availability, such as Galera Cluster for MySQL or Patroni for PostgreSQL.
    * **Shared Storage:** Use a shared storage solution like NFS or GlusterFS that is accessible from all nodes in the swarm.
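As a rough sketch of the shared-storage option, a named volume can be backed by an NFS export so that a task sees the same files on whichever node it is scheduled. The server address `nas.example.lan`, the export path `/export/paperless-media`, and the volume name `paperless_media` are placeholders assumed for illustration, not values from your stack files.

```yaml
services:
  paperless:
    # ...
    volumes:
      # Assumed mount point for uploaded documents inside the container.
      - paperless_media:/usr/src/paperless/media

volumes:
  paperless_media:
    driver: local
    driver_opts:
      type: nfs
      # Hypothetical NFS server; replace with your own host and options.
      o: "addr=nas.example.lan,rw,nfsvers=4"
      device: ":/export/paperless-media"
```

Database files are usually a poor fit for NFS, so this pattern is better suited to media and document directories than to the database volume itself.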
## 2. Hardware Efficiency
* **Resource Limit Tuning:** Your current resource limits are a good starting point, but they can be tuned. Use your monitoring stack (Prometheus and Grafana) to measure each service's actual CPU and memory usage over time, then adjust the `limits` and `reservations` under `deploy.resources` in your `docker-compose.yml` files to match. This prevents over-provisioning and improves hardware utilization.
* **Node Affinity:** If you have nodes with specific hardware (e.g., GPUs), use node labels and placement constraints to schedule services on the appropriate nodes. Label the node first (for example, `docker node update --label-add gpu=true <node-name>`), then reference the label in the service definition:
    ```yaml
    services:
      jellyfin:
        # ...
        deploy:
          placement:
            constraints:
              - node.labels.gpu == true
    ```
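For the resource limit tuning item above, limits and reservations might look like the following once you have usage data. The numbers are placeholders rather than measurements from your services: one approach is to set `reservations` near the typical usage seen in Grafana and `limits` somewhat above the observed peak.

```yaml
services:
  jellyfin:
    # ...
    deploy:
      resources:
        limits:
          # Hard ceiling; placeholder values, not measured ones.
          cpus: "2.0"
          memory: 2G
        reservations:
          # Capacity the scheduler sets aside on the node for each task.
          cpus: "0.5"
          memory: 512M
```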
## 3. Security
* **Secret Management:**
    * **Paperless Secret Key:** The `PAPERLESS_SECRET_KEY` in `applications-stack.yml` should be stored as a Docker secret.
        1. Create the secret:
            ```bash
            openssl rand -hex 32 | docker secret create paperless_secret_key -
            ```
        2. Update your `applications-stack.yml`:
            ```yaml
            services:
              paperless:
                # ...
                secrets:
                  - paperless_secret_key
                environment:
                  # ...
                  - PAPERLESS_SECRET_KEY_FILE=/run/secrets/paperless_secret_key

            # The secret was created outside the stack, so declare it as external.
            secrets:
              paperless_secret_key:
                external: true
            ```
    * **Backup Credentials:** The Backblaze B2 credentials in `backup_daily.sh` should be stored as Docker secrets. You can then mount these secrets into the container that runs the backup script and have the script read them from `/run/secrets/`.
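* **Network Policies:** Docker Swarm has no built-in network policy mechanism comparable to Kubernetes `NetworkPolicy`. To restrict traffic between services, attach each group of services to its own overlay network so that only services sharing a network can reach each other. This adds an extra layer of security to your homelab.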
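One way to approximate network policies in Swarm is to split services across overlay networks. In the sketch below, the service name `paperless-db` and the network names `frontend` and `backend` are hypothetical. The database joins only `backend`, so services that are attached only to `frontend` cannot reach it.

```yaml
services:
  paperless:
    # ...
    networks:
      - frontend
      - backend
  paperless-db:
    # ...
    networks:
      - backend

networks:
  frontend:
    driver: overlay
  backend:
    driver: overlay
    # No external connectivity for containers on this network.
    internal: true
    driver_opts:
      # Encrypts overlay traffic between nodes.
      encrypted: "true"
```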
## 4. Quality of Life
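* **Automated Backup Verification:** Extend your `backup_daily.sh` script with a step that automatically verifies the integrity of your backups. `restic check` verifies the repository structure; adding `--read-data` (or `--read-data-subset`) also verifies the stored data itself, at the cost of downloading it.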
* **Centralized Logging:** For easier log analysis, consider setting up a centralized logging solution like the ELK stack (Elasticsearch, Logstash, Kibana) or Grafana Loki.
* **Documentation:**
    * **Architecture Diagram:** Create a diagram of your network architecture and service dependencies. This will make it easier to understand and troubleshoot your homelab.
    * **Update `README.md`:** Add a link to this guide in your main `README.md` file.
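If you choose Grafana Loki for centralized logging, one possible setup sends container logs through the Loki Docker logging driver. This sketch assumes the driver plugin is already installed on every swarm node under the alias `loki`. The endpoint `loki.example.lan` is a placeholder.

```yaml
services:
  paperless:
    # ...
    logging:
      driver: loki
      options:
        # Hypothetical endpoint; replace with your Loki instance.
        loki-url: "http://loki.example.lan:3100/loki/api/v1/push"
```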
## 5. `tsdproxy`
* **Review Configuration:** `tsdproxy` can reportedly be complex to set up in a multi-host Docker Swarm. Review your `tsdproxy` configuration to confirm it is working as intended, and check the `tsdproxy` logs for errors.
* **Consult Documentation:** If you encounter issues, consult the official `tsdproxy` documentation and GitHub issues for troubleshooting tips.