# Home Lab Cluster Audit Report

**Date:** February 9, 2026  
**Auditor:** opencode  
**Cluster:** Docker Swarm with Dokploy

---

## 1. Cluster Overview

- **Cluster Type:** Docker Swarm (3 nodes)
- **Orchestration:** Dokploy v3.x
- **Reverse Proxy:** Traefik v3.6.1
- **DNS:** Technitium DNS Server
- **Monitoring:** Swarmpit
- **Git Server:** Gitea v1.24.4
- **Object Storage:** MinIO

---

## 2. Node Inventory

### Node 1: tpi-n1 (Controller/Manager)
- **IP:** 192.168.2.130
- **Role:** Manager (Leader)
- **Architecture:** aarch64 (ARM64)
- **OS:** Linux
- **CPU:** 8 cores
- **RAM:** ~8 GB
- **Docker:** v27.5.1
- **Labels:**
  - `infra=true`
  - `role=storage`
  - `storage=high`
- **Status:** Ready, Active

### Node 2: tpi-n2 (Worker)
- **IP:** 192.168.2.19
- **Role:** Worker
- **Architecture:** aarch64 (ARM64)
- **OS:** Linux
- **CPU:** 8 cores
- **RAM:** ~8 GB
- **Docker:** v27.5.1
- **Labels:**
  - `role=compute`
- **Status:** Ready, Active

### Node 3: node-nas (Storage Worker)
- **IP:** 192.168.2.18
- **Role:** Worker (NAS/Storage)
- **Architecture:** x86_64
- **OS:** Linux
- **CPU:** 2 cores
- **RAM:** ~8 GB
- **Docker:** v29.1.2
- **Labels:**
  - `type=nas`
- **Status:** Ready, Active

---

## 3. Docker Stacks (Swarm Mode)

### Active Stacks:

#### 1. minio
- **Services:** 1 (minio_minio)
- **Status:** Running
- **Node:** node-nas (constrained to NAS)
- **Ports:** 9000 (API), 9001 (Console)
- **Storage:** /mnt/synology-data/minio (bind mount)
- **Credentials:** [REDACTED - see service config]

#### 2. swarmpit
- **Services:** 4
  - swarmpit_app (UI) - Running on tpi-n1, Port 888
  - swarmpit_agent (global) - Running on all 3 nodes
  - swarmpit_db (CouchDB) - Running on tpi-n2
  - swarmpit_influxdb - Running on node-nas
- **Status:** Active with historical failures
- **Issues:** Multiple container failures in history (mostly resolved)

---

## 4. Dokploy-Managed Services

### Running Services (via Dokploy Compose):

1. **ai-lobechat-yqvecg** - AI chat interface
2. **bewcloud-memos-ssogxn** - Note-taking app (⚠️ Restart loop)
3. **bewcloud-silverbullet-42sjev** - SilverBullet markdown editor + Watchtower
4. **cloud-bewcloud-u2pls5** - BewCloud instance with Radicale (CalDAV/CardDAV)
5. **cloud-fizzy-ezuhfq** - Fizzy web app
6. **cloud-ironcalc-0id5k8** - IronCalc spreadsheet
7. **cloud-radicale-wqldcv** - Standalone Radicale server
8. **cloud-uptimekuma-jdeivt** - Uptime monitoring
9. **dns-technitum-6ojgo2** - Technitium DNS server
10. **gitea-giteasqlite-bhymqw** - Git server (Port 3000, SSH on 2222)
11. **gitea-registry-vdftrt** - Docker registry (Port 5000)

### Dokploy Infrastructure Services:
- **dokploy** - Main Dokploy UI (Port 3000, host mode)
- **dokploy-postgres** - Dokploy database
- **dokploy-redis** - Dokploy cache
- **dokploy-traefik** - Reverse proxy (Ports 80, 443, 8080)

---

## 5. Standalone Services (docker-compose)

### Running:
- **technitium-dns** - DNS server (Port 53, 5380)
- **immich3-compose** - Photo management (Immich v2.3.0)
  - immich-server
  - immich-machine-learning
  - immich-database (pgvecto-rs)
  - immich-redis

### Stack Services:
- **bendtstudio-pancake-bzgfpc** - MariaDB database (Port 3306)
- **bendtstudio-webstatic-iq9evl** - Static web files (⚠️ Rollback paused state)

---

## 6. Issues Identified

### 🔴 Critical Issues:

1. **bewcloud-memos in Restart Loop**
   - Container keeps restarting (last restart seen 24 seconds before the snapshot)
   - Status: `Restarting (0) 24 seconds ago`
   - **Action Required:** Check logs and fix configuration

2. **bendtstudio-webstatic in Rollback Paused State**
   - Service is not updating properly
   - State: `rollback_paused`
   - **Action Required:** Investigate update failure

3. **bendtstudio-app Not Running**
   - Service has 0/0 replicas
   - **Action Required:** Determine whether it is needed; remove it if not

4. **syncthing Stopped**
   - Service has 0 replicas
   - Should be on node-nas
   - **Action Required:** Restart, or remove if not needed

### 🟡 Warning Issues:

5. **Swarmpit Agent Failures (Historical)**
   - Multiple past failures on all nodes
   - Currently running, but the failure history is a concern
   - **Action Required:** Monitor for stability

6. **No Monitoring of MinIO**
   - MinIO is running, but no backup or monitoring strategy is documented
   - **Action Required:** Set up monitoring and backup

7. **Credential Management**
   - Passwords are visible in service configs (bendtstudio-webstatic, MinIO, DNS)
   - **Action Required:** Migrate to Docker secrets or env files

### 🟢 Informational:

8. **13 Unused/Orphaned Volumes**
   - 33 total volumes, only 20 active
   - **Action Required:** Clean up unused volumes to reclaim ~595 MB

9. **Gitea Repository Status Unknown**
   - Cannot verify whether all compose files are version controlled
   - **Action Required:** Audit Gitea repositories

---

## 7. Storage Configuration

### Local Volumes (33 total):
Key volumes include:
- `dokploy-postgres-database`
- `bewcloud-postgres-in40hh-data`
- `gitea-data`, `gitea-registry-data`
- `immich-postgres`, `immich-redis-data`, `immich-model-cache`
- `bendtstudio-pancake-data`
- `shared-data` (NFS/shared)
- Various app-specific volumes

### Bind Mounts:
- **MinIO:** `/mnt/synology-data/minio` → `/data`
- **Syncthing:** `/mnt/synology-data` → `/var/syncthing` (currently stopped)
- **Dokploy:** `/etc/dokploy` → `/etc/dokploy`

### NFS Mounts:
- Synology NAS mounted at `/mnt/synology-data/`
- Contains: immich/, minio/

---

## 8. Networking

### Overlay Networks:
- `dokploy-network` - Main Dokploy network
- `minio_default` - MinIO stack network
- `swarmpit_net` - Swarmpit monitoring network
- `ingress` - Docker Swarm ingress

### Bridge Networks:
- Multiple app-specific networks created by compose:
  - `ai-lobechat-yqvecg`
  - `bewcloud-memos-ssogxn`
  - `bewcloud-silverbullet-42sjev`
  - `cloud-fizzy-ezuhfq_default`
  - `cloud-uptimekuma-jdeivt`
  - `gitea-giteasqlite-bhymqw`
  - `gitea-registry-vdftrt`
  - `immich3-compose-ubyhe9_default`

---

## 9. SSL/TLS Configuration

- **Certificate Resolver:** Let's Encrypt (ACME)
- **Email:** sirtimbly@gmail.com
- **Challenge Type:** HTTP-01
- **Storage:** `/etc/dokploy/traefik/dynamic/acme.json`
- **Entry Points:** web (80) → websecure (443) with auto-redirect
- **HTTP/3:** Enabled on websecure

---

## 10. Traefik Routing

### Configured Routes (via labels):
- gitea.bendtstudio.com → Gitea
- Multiple apps via traefik.me subdomains
- HTTP → HTTPS redirect enabled
- Middlewares configured in `/etc/dokploy/traefik/dynamic/`

---

## 11. DNS Configuration

### Technitium DNS:
- **Port:** 53 (TCP/UDP), 5380 (Web UI)
- **Domain:** dns.bendtstudio.com
- **Admin Password:** [REDACTED]
- **Placement:** Locked to tpi-n1
- **TZ:** America/New_York

### Services using DNS:
- All services accessible via bendtstudio.com subdomains
- Internal DNS resolution for Docker services

---

## 12. Configuration Files Location

### In `/etc/dokploy/`:
- `traefik/traefik.yml` - Main Traefik config
- `traefik/dynamic/*.yml` - Dynamic routes and middlewares
- `compose/*/code/docker-compose.yml` - Dokploy-managed compose files

### In `/home/ubuntu/`:
- `minio-stack.yml` - MinIO stack definition

### In local workspace:
- Various compose files (not all deployed via Dokploy)
- May be out of sync with running services

---

## 13. Missing Configuration in Version Control

Based on the analysis, the following may NOT be properly tracked in Gitea:

1. ✅ **Gitea** itself - compose file present
2. ✅ **MinIO** - stack file in ~/minio-stack.yml
3. ⚠️ **Dokploy dynamic configs** - traefik routes
4. ⚠️ **All Dokploy-managed compose files** - 11 services
5. ❌ **Technitium DNS** - compose file in /etc/dokploy/
6. ❌ **Immich** - compose configuration
7. ❌ **Swarmpit** - stack configuration
8. ❌ **Dokploy infrastructure** - internal services

---

## 14. Resource Usage

### Docker System:
- **Images:** 23 (10.91 GB)
- **Containers:** 26 (135 MB)
- **Volumes:** 33 (2.02 GB, 595 MB reclaimable)
- **Build Cache:** 0

### Node Resources:
- **tpi-n1 & tpi-n2:** 8 cores ARM64, 8 GB RAM each
- **node-nas:** 2 cores x86_64, 8 GB RAM

---

## 15. Recommendations

### Immediate Actions (High Priority):

1. **Fix bewcloud-memos**
   ```bash
   # Show the most recent log lines to find the cause of the restart loop
   docker service logs bewcloud-memos-ssogxn-memos --tail 50
   ```

2. **Fix bendtstudio-webstatic**
   ```bash
   # Inspect the task history for the failed update, then force a redeploy
   docker service ps bendtstudio-webstatic-iq9evl --no-trunc
   docker service update --force bendtstudio-webstatic-iq9evl
   ```

3. **Restart or Remove syncthing**
   ```bash
   # Option 1: Scale up
   docker service scale syncthing=1

   # Option 2: Remove
   docker service rm syncthing
   ```

4. **Clean up unused volumes**
   ```bash
   # Review candidates first; volumes of stopped services count as unused
   docker volume ls -f dangling=true
   # Docker 23.0+ prunes only anonymous volumes unless --all is given
   docker volume prune --all
   ```

### Short-term Actions (Medium Priority):

5. **Audit Gitea repositories**
   - Access Gitea at http://gitea.bendtstudio.com
   - Verify which compose files are tracked
   - Commit missing configurations

6. **Secure credentials**
   - Use Docker secrets for passwords
   - Move credentials to environment files
   - Never commit .env files with real passwords

7. **Set up automated backups**
   - Back up Dokploy database
   - Back up Gitea repositories
   - Back up MinIO data

8. **Document all services**
   - Create README for each service
   - Document dependencies and data locations
   - Create runbook for common operations

### Long-term Actions (Low Priority):

9. **Implement proper monitoring**
   - Prometheus/Grafana for metrics (mentioned in PLAN.md but not found)
   - Alerting for service failures
   - Disk usage monitoring

10. **Implement GitOps workflow**
    - All changes through Git
    - Automated deployments via Dokploy webhooks
    - Configuration drift detection

11. **Consolidate storage strategy**
    - Define clear policy for volumes vs bind mounts
    - Document backup procedures for each storage type

12. **Security audit**
    - Review all exposed ports
    - Check for default/weak passwords
    - Implement network segmentation if needed

---

## 16. Next Steps Checklist

- [ ] Fix critical service issues (memos, webstatic)
- [ ] Document all running services with purpose
- [ ] Commit all compose files to Gitea
- [ ] Create backup strategy
- [ ] Set up monitoring and alerting
- [ ] Clean up unused resources
- [ ] Create disaster recovery plan
- [ ] Document SSH access for all nodes

---

## Appendix A: Quick Commands Reference

```bash
# View cluster status
docker node ls
docker service ls
docker stack ls

# View service logs
docker service logs --tail 100 -f <service_name>

# View container logs
docker logs --tail 100 -f <container_name>

# Scale a service
docker service scale <service_name>=<replicas>

# Update a service
docker service update --force <service_name>

# SSH to nodes
ssh -i ~/.ssh/id_ed25519 ubuntu@192.168.2.130  # tpi-n1 (manager)
ssh -i ~/.ssh/id_ed25519 ubuntu@192.168.2.19   # tpi-n2 (worker)
# NAS node requires different credentials

# Access Dokploy UI:        http://192.168.2.130:3000
# Access Swarmpit UI:       http://192.168.2.130:888
# Access Traefik Dashboard: http://192.168.2.130:8080
```

---

*End of Audit Report*
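
As a supplement to the Quick Commands Reference in Appendix A, the zero-replica and under-replicated findings from Section 6 could be rechecked with a small filter over `docker service ls` output. This is only a sketch, not part of the audited setup: the function name `flag_unhealthy_services` is hypothetical, and it assumes each input line has the form `NAME REPLICAS` with replicas written as `running/desired`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the audited cluster): reads lines of the
# form "NAME RUNNING/DESIRED" on stdin and prints services that are scaled
# to zero or have fewer running tasks than desired.
flag_unhealthy_services() {
  awk '{
    split($2, r, "/")              # REPLICAS field looks like "1/1" or "0/0"
    running = r[1] + 0
    desired = r[2] + 0
    if (desired == 0)
      print $1 ": scaled to zero"
    else if (running < desired)
      print $1 ": " running "/" desired " running"
  }'
}
```

On a manager node it could be fed with `docker service ls --format '{{.Name}} {{.Replicas}}' | flag_unhealthy_services`. Reading from stdin keeps the filter separate from the Docker call, so it can be tried against saved output before being pointed at the live cluster.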