Compare commits
2 Commits
040f0d2d15...73e80f6533

| Author | SHA1 | Date |
|---|---|---|
| | 73e80f6533 | |
| | 88cf1003a8 | |

agents.md (Normal file, 322 lines added)
@@ -0,0 +1,322 @@

# Homelab Infrastructure & Deployment Notes

## Overview
This document contains comprehensive notes about the homelab Docker Swarm cluster setup, troubleshooting procedures, and deployment configurations.

## Infrastructure Architecture

### Host Configuration
- **Main Node**: `tpi-n1` (192.168.2.130) - Docker Swarm Manager
- **Worker Node**: `tpi-n2` - Docker Swarm Worker
- **Storage**:
  - Main drive: `/dev/mmcblk0p2` (29GB) - System and applications
  - NVMe drive: `/mnt/nvme` (916GB) - Docker data storage

### Docker Configuration
- **Data Directory**: `/mnt/nvme/docker` (moved from `/var/lib/docker`)
- **DNS Configuration**: Local dnsmasq forwarder at 127.0.0.1
- **External DNS**: 8.8.8.8, 8.8.4.4, 1.1.1.1
- **Docker Daemon Config**: `/etc/docker/daemon.json`

### Services Running
- **Traefik**: Load balancer and SSL termination
- **Dokploy**: Deployment management (port 3000)
- **Gitea**: Git server (port 2222 for SSH)
- **Swarmpit**: Docker Swarm management UI (port 888)
- **Bendtstudio**: Main web application (5 replicas)
- **MariaDB**: Database for Pancake application

## Major Maintenance Tasks

### 1. Docker Data Migration (Completed ✅)
**Problem**: Main drive 100% full (28G/29G used)
**Solution**: Moved 19GB Docker data to NVMe drive

**Commands Used**:
```bash
# Stop Docker services
sudo systemctl stop docker docker.socket

# Copy data to NVMe, preserving ownership and permissions
# (the original /var/lib/docker stays in place, and keeps using space, until it is removed)
sudo cp -a /var/lib/docker /mnt/nvme/

# Update Docker config (this overwrites any existing daemon.json)
echo '{"data-root": "/mnt/nvme/docker"}' | sudo tee /etc/docker/daemon.json

# Restart Docker
sudo systemctl start docker
```
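Before copying, it can help to confirm that the Docker data will fit on the target drive. The following is a minimal sketch, not the procedure that was actually run: `has_room` is a hypothetical helper that compares the size of a source tree (from `du`) with the free space on the target filesystem (from `df`).

```bash
# has_room SRC DST: succeed if the tree at SRC is smaller than the free
# space on the filesystem that holds DST. Hypothetical helper, not part of Docker.
has_room() {
    src=$1
    dst=$2
    # Size of the source tree in KiB.
    need_kb=$(du -sk "$src" | awk '{print $1}')
    # Free space on the target filesystem in KiB (POSIX output format).
    free_kb=$(df -Pk "$dst" | awk 'NR==2 {print $4}')
    [ "$need_kb" -lt "$free_kb" ]
}
```

Reading `/var/lib/docker` requires root, so in practice the check would run from a root shell, for example `has_room /var/lib/docker /mnt/nvme && echo "enough space"`.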

**Result**:
- Freed 15GB on main drive (100% → 46% usage)
- Docker data on fast NVMe storage
- All services back up after the restart (containers on `tpi-n1` stop while the Docker daemon is down)

### 2. Nginx Configuration Fix (Completed ✅)
**Problem**: Pancake static assets returning 404, pretty URLs not working
**Root Cause**: Apache .htaccess rules not translated to nginx properly

**Files Modified**:
- `nginx.template.conf` - Main configuration template
- All running containers - Updated nginx configuration

**Key Changes**:
```nginx
# Fixed static asset paths
location /pancake/third_party {
    alias ${NIXPACKS_PHP_ROOT_DIR}/pancake/third_party;
    # ... caching headers
}

# Added pretty URL support
location /pancake {
    try_files $uri $uri/ @pancake_fallback;
}

location @pancake_fallback {
    rewrite ^.*$ /pancake/index.php last;
}

# Fixed PHP handling
location ~ \.php$ {
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    # ... other fastcgi params
}
```
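One way to confirm changes like these is a small smoke check that requests a few URLs and compares the HTTP status codes. This is a sketch under assumptions: `http_status` and `check_url` are hypothetical helper names, and the expected codes have to be chosen per URL.

```bash
# http_status URL: print the HTTP status code for URL (needs curl and network access).
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# check_url URL EXPECTED: print OK or FAIL and return non-zero on a mismatch.
# STATUS_CMD may name another command to use instead of http_status,
# which allows a dry run without network access.
check_url() {
    actual=$("${STATUS_CMD:-http_status}" "$1")
    if [ "$actual" = "$2" ]; then
        echo "OK   $1 ($actual)"
    else
        echo "FAIL $1 (got $actual, expected $2)"
        return 1
    fi
}
```

A call such as `check_url https://bendtstudio.com/pancake/admin 200` would exercise the pretty-URL fallback. The expected `200` is an assumption, since an admin page may answer with a redirect to a login form instead.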

**Result**:
- ✅ Static assets serving correctly
- ✅ Pretty URLs working (bendtstudio.com/pancake/admin)
- ✅ Apache .htaccess functionality replicated in nginx

### 3. DNS Resolution Fix (Completed ✅)
**Problem**: Docker containers couldn't resolve `ghcr.io`, causing build failures
**Root Cause**: No proper DNS forwarding for containers

**Solution Implemented**:
```bash
# Install dnsmasq
sudo apt install -y dnsmasq

# Configure DNS forwarding (via sudo tee, because a plain `>` redirect does not run as root)
sudo tee /etc/dnsmasq.conf > /dev/null << 'EOF'
server=8.8.8.8
server=8.8.4.4
server=1.1.1.1
listen-address=127.0.0.1
bind-interfaces
EOF

# Start dnsmasq
sudo systemctl enable dnsmasq && sudo systemctl start dnsmasq

# Update system DNS
echo 'nameserver 127.0.0.1' | sudo tee /etc/resolv.conf

# Update Docker to use local DNS
echo '{"data-root": "/mnt/nvme/docker", "dns": ["127.0.0.1"]}' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker
```
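To see whether each upstream resolver actually answers, the `server=` entries can be read back from the dnsmasq config and queried one by one. A minimal sketch, assuming `nslookup` is installed; `upstream_servers` and `check_upstreams` are hypothetical helper names.

```bash
# upstream_servers FILE: print the address from every plain "server=ADDRESS" line.
# Domain-specific rules of the form server=/domain/address are skipped.
upstream_servers() {
    awk -F= '$1 == "server" && $2 !~ /^\// {print $2}' "$1"
}

# check_upstreams FILE: ask each upstream resolver for ghcr.io and report the outcome.
# Needs network access.
check_upstreams() {
    for ip in $(upstream_servers "$1"); do
        if nslookup ghcr.io "$ip" >/dev/null 2>&1; then
            echo "OK   $ip"
        else
            echo "FAIL $ip"
        fi
    done
}
```

Running `check_upstreams /etc/dnsmasq.conf` on the manager node would print one line per configured upstream.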

**Result**:
- ✅ ghcr.io resolution working
- ✅ Docker builds successful
- ✅ Deployments working via Dokploy UI

## Service-Specific Notes

### Gitea (Git Server)
- **SSH Access**: `ssh://git@gitea.bendtstudio.com:2222/username/repo.git` (the scp-style `git@gitea.bendtstudio.com:username/repo.git` form cannot carry a port, so it needs a `Port 2222` entry for the host in `~/.ssh/config`)
- **Web UI**: https://gitea.bendtstudio.com
- **Status**: Working correctly, SSH authentication successful
- **Note**: "No shell access" message is normal for Gitea (it only serves Git operations over SSH)
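
Because Gitea's SSH listens on port 2222 rather than 22, a remote URL needs the explicit `ssh://` form to carry the port. The helper below is a small sketch; `gitea_remote`, `GITEA_HOST`, and `GITEA_SSH_PORT` are hypothetical names introduced for illustration.

```bash
# gitea_remote OWNER REPO: print an ssh:// remote URL that includes the SSH port.
# Host and port default to the values in these notes and can be overridden
# through the (hypothetical) GITEA_HOST and GITEA_SSH_PORT variables.
gitea_remote() {
    printf 'ssh://git@%s:%s/%s/%s.git\n' \
        "${GITEA_HOST:-gitea.bendtstudio.com}" "${GITEA_SSH_PORT:-2222}" "$1" "$2"
}
```

It could be used as `git remote add origin "$(gitea_remote username repo)"`.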

### Dokploy (Deployment Management)
- **Web UI**: https://dokploy.bendtstudio.com (port 3000)
- **Usage**:
  1. Push code to Gitea repository
  2. Dokploy automatically detects new commits
  3. Trigger manual redeployment via web UI
  4. Monitor build logs in real-time
- **Build Process**: Uses Nixpacks for containerization
- **Current Status**: ✅ Working with DNS fix

### Bendtstudio Web Application
- **Domain**: https://bendtstudio.com
- **Pancake App**: https://bendtstudio.com/pancake
- **Replicas**: 5 containers for load balancing
- **Static Assets**: All serving correctly from `/pancake/third_party/`
- **Database**: MariaDB container for Pancake data

## Troubleshooting Procedures

### Docker Issues
```bash
# Check Docker status
sudo systemctl status docker

# Check container logs
docker logs <container_name>

# Check service status
docker service ls

# Restart Docker daemon
sudo systemctl restart docker
```

### DNS Issues
```bash
# Check DNS resolution
nslookup ghcr.io

# Test from container
docker exec <container> curl -I https://ghcr.io

# Restart dnsmasq
sudo systemctl restart dnsmasq

# Check Docker DNS config
cat /etc/docker/daemon.json
```

### Network Issues
```bash
# Check port mapping
docker port <container_name>

# Test external access
nc -v <host_ip> <port>

# Check Traefik routes
curl -s http://localhost:8080/api/http/routers

# Check container networks
docker inspect <container> --format '{{json .NetworkSettings.Networks}}'
```

### Application Deployment Issues
```bash
# Check deployment logs
docker service logs <service_name> --tail 50

# Force redeployment
docker service update --force <service_name>

# Check service configuration
docker service inspect <service_name>

# Scale services
docker service scale <service_name>=<replicas>
```

## Monitoring Commands

### System Resources
```bash
# Disk usage
df -h

# Memory usage
free -h

# Docker space usage
docker system df

# Container resource usage
docker stats
```
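Since a full main drive was the trigger for the data migration, a threshold check on `df` output can flag the problem earlier. A minimal sketch; `over_threshold` is a hypothetical helper, and it assumes mount points without spaces.

```bash
# over_threshold PCT: read `df -P` output on stdin and print the mount point
# and use% of every filesystem at or above PCT percent.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)          # strip the percent sign before comparing
        if (use + 0 >= limit + 0) print $6, $5
    }'
}
```

For example, `df -P | over_threshold 90` lists every filesystem that is at least 90% full and prints nothing when all are below the limit.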

### Docker Swarm Health
```bash
# Check swarm status
docker node ls

# Check service health
docker service ls

# Check individual services
docker service ps <service_name>
```
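The `docker service ls` output can also be filtered for services that run fewer tasks than requested. A sketch, assuming the `--format` template shown in the usage note; `degraded_services` is a hypothetical helper name.

```bash
# degraded_services: read "NAME RUNNING/DESIRED" lines on stdin and print the
# services whose running task count is below the desired count.
degraded_services() {
    awk '{
        split($2, r, "/")          # r[1] = running, r[2] = desired
        if (r[1] + 0 < r[2] + 0) print $1, $2
    }'
}
```

Usage on the manager node: `docker service ls --format '{{.Name}} {{.Replicas}}' | degraded_services`. No output means every service has its full replica count.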

## Configuration Files

### Docker Daemon Configuration
```json
{
  "data-root": "/mnt/nvme/docker",
  "dns": ["127.0.0.1"]
}
```

### Nginx Template Key Sections
```nginx
# Static assets for pancake/third_party
location /pancake/third_party {
    alias ${NIXPACKS_PHP_ROOT_DIR}/pancake/third_party;
    expires 1y;
    add_header Cache-Control "public, immutable";
}

# Pretty URLs for Pancake
location /pancake {
    try_files $uri $uri/ @pancake_fallback;
}

location @pancake_fallback {
    rewrite ^.*$ /pancake/index.php last;
}
```

## Future Improvements

### DNS Enhancement
- Configure dnsmasq to forward internal domains to local DNS server
- Set up conditional forwarding for homelab services
- Add DNS caching for better performance
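
In dnsmasq, conditional forwarding is written as `server=/DOMAIN/RESOLVER`, and the cache size is set with `cache-size=N`. The sketch below only prints such lines so they can be reviewed before being appended to the config. The helper names are hypothetical, and the domain `lab.example` and resolver `192.0.2.53` in the usage note are placeholders, not values from this setup.

```bash
# forward_rule DOMAIN RESOLVER: print a dnsmasq rule that sends queries
# for DOMAIN (and its subdomains) to RESOLVER.
forward_rule() {
    printf 'server=/%s/%s\n' "$1" "$2"
}

# cache_rule ENTRIES: print a dnsmasq rule that sets the cache size.
cache_rule() {
    printf 'cache-size=%s\n' "$1"
}
```

For instance, `forward_rule lab.example 192.0.2.53` prints `server=/lab.example/192.0.2.53`, and `cache_rule 1000` prints `cache-size=1000`. dnsmasq already keeps a small cache by default, so the second rule only raises the limit.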

### Backup Strategy
- Regular backups of Docker volumes to NVMe
- Automated snapshots of configuration files
- Git repository tracking of all changes
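
A volume backup can be as simple as a tar archive of the volume's contents. The sketch below shows the archive and restore steps on plain directories; `backup_dir` and `restore_dir` are hypothetical helpers, and for a named Docker volume the same `tar` commands would run inside a throwaway container that mounts the volume.

```bash
# backup_dir SRC ARCHIVE: pack the contents of SRC into a gzip-compressed tarball.
backup_dir() {
    tar -czf "$2" -C "$1" .
}

# restore_dir ARCHIVE DST: unpack a tarball created by backup_dir into DST.
restore_dir() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# For a named volume, the equivalent backup could look like this
# (the backup directory /mnt/nvme/backups is an assumed, hypothetical path):
#   docker run --rm -v <volume_name>:/data:ro -v /mnt/nvme/backups:/backup alpine \
#     tar -czf /backup/<volume_name>.tar.gz -C /data .
```

Archiving relative to the source directory (`-C SRC .`) keeps absolute paths out of the tarball, so it can be unpacked into any target.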

### Monitoring
- Set up Prometheus/Grafana for system monitoring
- Log aggregation for better troubleshooting
- Alert configuration for critical services

## Emergency Procedures

### Full System Recovery
```bash
# 1. Check all services
docker service ls

# 2. Restart critical services
docker service update --force dokploy
docker service update --force traefik

# 3. Check DNS resolution
curl -I https://ghcr.io

# 4. Verify storage
df -h
docker system df
```
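Services need a moment to converge after a forced update, so a check run immediately afterwards may fail even though recovery is on track. A small retry wrapper, sketched below with the hypothetical name `retry`, repeats a command until it succeeds or the attempts run out.

```bash
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds between tries. Returns non-zero if every try fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "gave up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}
```

Step 3 above could then be written as `retry 10 5 curl -fsI https://ghcr.io`, which allows up to ten tries with five seconds between them.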

### Service Restoration
```bash
# Restore from backup if needed
docker volume ls
# Docker has no `docker volume restore` subcommand; unpack the backup into the
# volume from a throwaway container (assumes a tar.gz archive in the current directory)
docker run --rm -v <volume_name>:/data -v "$(pwd)":/backup alpine \
  tar -xzf /backup/<backup_file> -C /data

# Re-deploy from last known good state
git log --oneline -10
git checkout <commit_hash>
```

---

**Last Updated**: 2025-11-29
**Maintainer**: sirtimbly
**Environment**: Production Docker Swarm Cluster

@@ -82,7 +82,7 @@ http {

 # Static assets for pancake/third_party
 location /pancake/third_party {
-    alias ${NIXPACKS_PHP_ROOT_DIR}/pancake/third_party;
+    alias /app/pancake/third_party;
     expires 1y;
     add_header Cache-Control "public, immutable";
