# Homelab Infrastructure & Deployment Notes

## Overview

This document collects notes on the homelab Docker Swarm cluster: setup, troubleshooting procedures, and deployment configuration.

## Infrastructure Architecture

### Host Configuration

- **Main Node**: `tpi-n1` (192.168.2.130) - Docker Swarm Manager
- **Worker Node**: `tpi-n2` - Docker Swarm Worker
- **Storage**:
  - Main drive: `/dev/mmcblk0p2` (29GB) - System and applications
  - NVMe drive: `/mnt/nvme` (916GB) - Docker data storage

### Docker Configuration

- **Data Directory**: `/mnt/nvme/docker` (moved from `/var/lib/docker`)
- **DNS Configuration**: Local dnsmasq forwarder at 127.0.0.1
- **External DNS**: 8.8.8.8, 8.8.4.4, 1.1.1.1
- **Docker Daemon Config**: `/etc/docker/daemon.json`

### Services Running

- **Traefik**: Load balancer and SSL termination
- **Dokploy**: Deployment management (port 3000)
- **Gitea**: Git server (port 2222 for SSH)
- **Swarmpit**: Docker Swarm management UI (port 888)
- **Bendtstudio**: Main web application (5 replicas)
- **MariaDB**: Database for Pancake application

## Major Maintenance Tasks

### 1. Docker Data Migration (Completed ✅)

**Problem**: Main drive 100% full (28G/29G used)

**Solution**: Moved 19GB Docker data to NVMe drive

**Commands Used**:

```bash
# Stop Docker services
sudo systemctl stop docker docker.socket

# Copy data to NVMe, preserving ownership and permissions
sudo cp -a /var/lib/docker /mnt/nvme/

# Update Docker config
echo '{"data-root": "/mnt/nvme/docker"}' | sudo tee /etc/docker/daemon.json

# Restart Docker
sudo systemctl start docker

# Note: cp leaves the original in place; space on the main drive is only
# freed once the old /var/lib/docker is removed after verifying the copy
```

**Result**:

- Freed 15GB on main drive (100% → 46% usage)
- Docker data on fast NVMe storage
- All services came back up after the Docker restart

### 2. Nginx Configuration Fix (Completed ✅)

**Problem**: Pancake static assets returning 404, pretty URLs not working

**Root Cause**: Apache .htaccess rules were not translated to nginx correctly

**Files Modified**:

- `nginx.template.conf` - Main configuration template
- All running containers - Updated nginx configuration

**Key Changes**:

```nginx
# Fixed static asset paths
location /pancake/third_party {
    alias ${NIXPACKS_PHP_ROOT_DIR}/pancake/third_party;
    # ... caching headers
}

# Added pretty URL support
location /pancake {
    try_files $uri $uri/ @pancake_fallback;
}

location @pancake_fallback {
    rewrite ^.*$ /pancake/index.php last;
}

# Fixed PHP handling
location ~ \.php$ {
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    # ... other fastcgi params
}
```

**Result**:

- ✅ Static assets serving correctly
- ✅ Pretty URLs working (bendtstudio.com/pancake/admin)
- ✅ Apache .htaccess functionality replicated in nginx

### 3. DNS Resolution Fix (Completed ✅)

**Problem**: Docker containers couldn't resolve `ghcr.io`, causing build failures

**Root Cause**: No proper DNS forwarding for containers

**Solution Implemented**:

```bash
# Install dnsmasq
sudo apt install -y dnsmasq

# Configure DNS forwarding (written via sudo tee, since /etc is root-owned)
sudo tee /etc/dnsmasq.conf > /dev/null << EOF
server=8.8.8.8
server=8.8.4.4
server=1.1.1.1
listen-address=127.0.0.1
bind-interfaces
EOF

# Start dnsmasq
sudo systemctl enable dnsmasq && sudo systemctl start dnsmasq

# Update system DNS
echo 'nameserver 127.0.0.1' | sudo tee /etc/resolv.conf

# Update Docker to use local DNS
echo '{"data-root": "/mnt/nvme/docker", "dns": ["127.0.0.1"]}' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
```

**Result**:

- ✅ ghcr.io resolution working
- ✅ Docker builds successful
- ✅ Deployments working via Dokploy UI

## Service-Specific Notes

### Gitea (Git Server)

- **SSH Access**: `ssh://git@gitea.bendtstudio.com:2222/username/repo.git` (port 2222), or the scp-style `git@gitea.bendtstudio.com:username/repo.git` when the SSH client is configured to use that port
- **Web UI**: https://gitea.bendtstudio.com
- **Status**: Working correctly, SSH
authentication successful
- **Note**: "No shell access" message is normal for Gitea

### Dokploy (Deployment Management)

- **Web UI**: https://dokploy.bendtstudio.com (port 3000)
- **Usage**:
  1. Push code to Gitea repository
  2. Dokploy automatically detects new commits
  3. Trigger manual redeployment via web UI
  4. Monitor build logs in real-time
- **Build Process**: Uses Nixpacks for containerization
- **Current Status**: ✅ Working with DNS fix

### Bendtstudio Web Application

- **Domain**: https://bendtstudio.com
- **Pancake App**: https://bendtstudio.com/pancake
- **Replicas**: 5 containers for load balancing
- **Static Assets**: All serving correctly from `/pancake/third_party/`
- **Database**: MariaDB container for Pancake data

## Troubleshooting Procedures

### Docker Issues

```bash
# Check Docker status
sudo systemctl status docker

# Check container logs
docker logs <container>

# Check service status
docker service ls

# Restart Docker daemon
sudo systemctl restart docker
```

### DNS Issues

```bash
# Check DNS resolution
nslookup ghcr.io

# Test from container
docker exec <container> curl -I https://ghcr.io

# Restart dnsmasq
sudo systemctl restart dnsmasq

# Check Docker DNS config
cat /etc/docker/daemon.json
```

### Network Issues

```bash
# Check port mapping
docker port <container>

# Test external access
nc -v <host> <port>

# Check Traefik routes
curl -s http://localhost:8080/api/http/routers

# Check container networks
docker inspect --format '{{json .NetworkSettings.Networks}}' <container>
```

### Application Deployment Issues

```bash
# Check deployment logs
docker service logs --tail 50 <service>

# Force redeployment
docker service update --force <service>

# Check service configuration
docker service inspect <service>

# Scale services
docker service scale <service>=<replicas>
```

## Monitoring Commands

### System Resources

```bash
# Disk usage
df -h

# Memory usage
free -h

# Docker space usage
docker system df

# Container resource usage
docker stats
```

### Docker Swarm Health

```bash
# Check swarm status
docker node ls

# Check service health
docker service ls

# Check individual services
docker service ps <service>
```

## Configuration Files

### Docker Daemon Configuration

```json
{
  "data-root": "/mnt/nvme/docker",
  "dns": ["127.0.0.1"]
}
```

### Nginx Template Key Sections

```nginx
# Static assets for pancake/third_party
location /pancake/third_party {
    alias ${NIXPACKS_PHP_ROOT_DIR}/pancake/third_party;
    expires 1y;
    add_header Cache-Control "public, immutable";
}

# Pretty URLs for Pancake
location /pancake {
    try_files $uri $uri/ @pancake_fallback;
}

location @pancake_fallback {
    rewrite ^.*$ /pancake/index.php last;
}
```

## Future Improvements

### DNS Enhancement

- Configure dnsmasq to forward internal domains to local DNS server
- Set up conditional forwarding for homelab services
- Add DNS caching for better performance

### Backup Strategy

- Regular backups of Docker volumes to NVMe
- Automated snapshots of configuration files
- Git repository tracking of all changes

### Monitoring

- Set up Prometheus/Grafana for system monitoring
- Log aggregation for better troubleshooting
- Alert configuration for critical services

## Emergency Procedures

### Full System Recovery

```bash
# 1. Check all services
docker service ls

# 2. Restart critical services
docker service update --force dokploy
docker service update --force traefik

# 3. Check DNS resolution
curl -I https://ghcr.io

# 4. Verify storage
df -h
docker system df
```

### Service Restoration

```bash
# Restore from backup if needed
docker volume ls
# Note: the Docker CLI has no `docker volume restore` subcommand.
# Assuming tar-archive backups (placeholder paths), one way to restore:
docker run --rm -v <volume>:/data -v <backup_dir>:/backup alpine \
  tar xzf /backup/<archive>.tar.gz -C /data

# Re-deploy from last known good state
git log --oneline -10
git checkout <commit>
```

---

**Last Updated**: 2025-11-29

**Maintainer**: sirtimbly

**Environment**: Production Docker Swarm Cluster
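
## Appendix: Disk Usage Check Sketch

The full-disk incident described under Docker Data Migration could be caught earlier by scripting the `df` check from Monitoring Commands. The following is a minimal sketch, not part of the existing setup: the function names `usage_pct` and `check_usage` and the 90% threshold are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper functions; names and threshold are illustrative only.

# usage_pct PATH: print the integer Use% of the filesystem containing PATH.
# Prints nothing if df cannot stat the path.
usage_pct() {
  df -P "$1" 2>/dev/null | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# check_usage PATH THRESHOLD:
#   returns 0 and prints OK when usage is below THRESHOLD,
#   returns 1 and prints WARN when usage is at or above THRESHOLD,
#   returns 2 when the usage could not be determined.
check_usage() {
  local pct
  pct=$(usage_pct "$1")
  if [ -z "$pct" ]; then
    echo "ERROR: cannot read usage for $1" >&2
    return 2
  fi
  if [ "$pct" -ge "$2" ]; then
    echo "WARN: $1 at ${pct}% (threshold $2%)"
    return 1
  fi
  echo "OK: $1 at ${pct}%"
}

# Example run against the root filesystem with an assumed 90% threshold.
check_usage / 90 || true
```

On `tpi-n1` the same function could be pointed at `/mnt/nvme` as well, and run from cron or a systemd timer until the planned Prometheus/Grafana monitoring is in place.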