# Infrastructure Management

This directory contains backups, scripts, and documentation for managing the homelab infrastructure.

## Directory Structure

```
infrastructure/
├── compose/             # Docker Compose files backed up from cluster
├── stacks/              # Docker Stack definitions
├── traefik/             # Traefik configuration backups
├── scripts/             # Management and backup scripts
├── backups/             # Critical data backups (created by scripts)
└── BACKUP_MANIFEST.md   # Auto-generated backup manifest
```

## Quick Start

### 1. Backup Compose Files

Run this to back up all compose configurations from the cluster:

```bash
./scripts/backup-compose-files.sh
```

This will:
- Copy all Dokploy compose files from `/etc/dokploy/compose/`
- Copy the Traefik configuration
- Copy the stack files
- Generate the backup manifest (`BACKUP_MANIFEST.md`)

### 2. Backup Critical Data

Run this to back up databases and application data:

```bash
./scripts/backup-critical-data.sh
```

This will:
- Back up PostgreSQL databases (Dokploy, Immich, BewCloud)
- Back up MariaDB databases (Pancake)
- Back up application volumes (Memos, Gitea)
- Clean up old backups (30+ days old)
- Generate a backup report

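The cleanup step in the list above can be sketched as follows. This is a minimal illustration, not the contents of `backup-critical-data.sh`: the function name `prune_old_backups` and its arguments are hypothetical, and it assumes backups are plain files sitting directly in one directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from backup-critical-data.sh):
# delete regular files in a directory that are older than a retention window.
prune_old_backups() {
  local dir="$1"          # directory holding the backup files
  local days="${2:-30}"   # retention window in days (assumed default: 30)

  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }

  # -mtime +N matches files whose age, rounded down to whole days, is greater than N.
  # -maxdepth 1 keeps the search out of subdirectories.
  find "$dir" -maxdepth 1 -type f -mtime +"$days" -exec rm -f {} +
}
```

For example, `prune_old_backups infrastructure/backups 30` would remove files in that directory last modified more than 30 days ago and leave newer ones alone.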
## Automated Backups

### Set up cron jobs on your local machine:

```bash
# Edit crontab
crontab -e

# Add these lines:
# Note: the crontab owner must be able to write to /var/log/homelab-backup.log
# Back up compose files daily at 2 AM
0 2 * * * cd /Users/timothy.bendt/developer/cloud-compose/infrastructure/scripts && ./backup-compose-files.sh >> /var/log/homelab-backup.log 2>&1

# Back up critical data daily at 3 AM
0 3 * * * cd /Users/timothy.bendt/developer/cloud-compose/infrastructure/scripts && ./backup-critical-data.sh >> /var/log/homelab-backup.log 2>&1
```

### Or run manually whenever you make changes:

```bash
# After modifying any service
cd /Users/timothy.bendt/developer/cloud-compose
./infrastructure/scripts/backup-compose-files.sh
git add infrastructure/
git commit -m "Backup infrastructure configs"
git push
```

## Restore Procedures

### Restore a Compose File

1. Copy the compose file from `infrastructure/compose/<project>/docker-compose.yml`
2. Upload it through the Dokploy UI
3. Deploy the project

### Restore a Database

```bash
# Example: Restore Dokploy database
# Copy the dump to the controller, then feed it to psql inside the database container.
# Replace <container-id> with the suffix of the running container's name.
scp infrastructure/backups/dokploy-postgres-dokploy-2026-02-09.sql ubuntu@192.168.2.130:/tmp/
ssh ubuntu@192.168.2.130 "docker exec -i dokploy-postgres.1.<container-id> psql -U postgres dokploy < /tmp/dokploy-postgres-dokploy-2026-02-09.sql"
```

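Because the container name in the database restore example ends in a suffix that changes between deployments, it can help to look the name up rather than type it. The sketch below assumes it runs on the controller where `docker` is available; the function name `find_container` is hypothetical, while `docker ps --filter name=...` and `--format '{{.Names}}'` are standard Docker CLI options.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the name of the first running container whose
# name contains the given string (for example "dokploy-postgres").
find_container() {
  local prefix="$1"
  local name

  # The name filter matches substrings, so this also finds
  # Swarm task containers named like "<service>.1.<suffix>".
  name="$(docker ps --filter "name=${prefix}" --format '{{.Names}}' | head -n 1)"

  [ -n "$name" ] || { echo "no running container matches: $prefix" >&2; return 1; }
  printf '%s\n' "$name"
}
```

On the controller, the restore command could then be written as `docker exec -i "$(find_container dokploy-postgres)" psql -U postgres dokploy < /tmp/dokploy-postgres-dokploy-2026-02-09.sql`.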
### Restore Volume Data

```bash
# Example: Restore Memos data
scp infrastructure/backups/bewcloud-memos-ssogxn-memos-data-2026-02-09.tar.gz ubuntu@192.168.2.130:/tmp/
ssh ubuntu@192.168.2.130 "docker run --rm -v bewcloud-memos-ssogxn_memos_data:/data -v /tmp:/backup alpine sh -c 'cd /data && tar xzf /backup/bewcloud-memos-ssogxn-memos-data-2026-02-09.tar.gz'"
```

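Before extracting an archive into a live volume, it is worth confirming that the archive is readable. The following is a minimal sketch of such a check, assuming the backups are gzip-compressed tar files as in the volume restore example; the function name `verify_archive` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file is a readable gzip-compressed tar archive.
verify_archive() {
  local archive="$1"

  [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }

  # Listing the contents reads the whole archive, so a truncated or
  # corrupted file makes tar exit with a non-zero status.
  tar -tzf "$archive" > /dev/null
}
```

For example, `verify_archive infrastructure/backups/bewcloud-memos-ssogxn-memos-data-2026-02-09.tar.gz && echo "archive looks intact"` could be run locally before the `scp` step.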
## SSH Access

### Controller (tpi-n1)
```bash
ssh -i ~/.ssh/id_ed25519 ubuntu@192.168.2.130
```

### Worker (tpi-n2)
```bash
ssh -i ~/.ssh/id_ed25519 ubuntu@192.168.2.19
```

### NAS (node-nas)
```bash
ssh tim@192.168.2.18
```

## Useful Commands

### Check Service Status
```bash
ssh ubuntu@192.168.2.130 "docker service ls"
```

### View Service Logs
```bash
ssh ubuntu@192.168.2.130 "docker service logs <service-name> --tail 100 -f"
```

### Scale a Service
```bash
ssh ubuntu@192.168.2.130 "docker service scale <service-name>=<replicas>"
```

### Check Node Status
```bash
ssh ubuntu@192.168.2.130 "docker node ls"
```

## Web Interfaces

| Service | URL | Purpose |
|---------|-----|---------|
| Dokploy | http://192.168.2.130:3000 | Container management |
| Swarmpit | http://192.168.2.130:888 | Swarm monitoring |
| Traefik | http://192.168.2.130:8080 | Reverse proxy dashboard |
| MinIO | http://192.168.2.18:9001 | Object storage console |

## Backup Storage

### Local
Backups are stored in `infrastructure/backups/` with date-stamped filenames.

### Offsite (Recommended)
Consider copying backups to at least one other location:
- MinIO bucket (`backups/`)
- External hard drive
- Cloud storage (AWS S3, etc.)
- Another server

Example:
```bash
# Copy to MinIO
mc cp infrastructure/backups/* minio/backups/
```

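If the backup directory grows subdirectories, `mc mirror` may be a better fit than `mc cp` with a glob, since it copies a directory tree recursively. The sketch below assumes the `minio` alias from the example above has already been configured with `mc alias set`; the function name `sync_offsite` and its default arguments are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: mirror the local backup directory to a MinIO bucket.
# Assumes the "minio" alias already exists in the mc configuration.
sync_offsite() {
  local src="${1:-infrastructure/backups}"   # local directory to copy from
  local dest="${2:-minio/backups}"           # alias/bucket to copy to

  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }

  # mc mirror copies the source directory tree to the destination.
  mc mirror "$src" "$dest"
}
```

Calling `sync_offsite` with no arguments from the repository root would mirror `infrastructure/backups` into `minio/backups`.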
## Maintenance Checklist

### Daily
- [ ] Check backup logs for errors
- [ ] Verify critical services are running

### Weekly
- [ ] Review Swarmpit dashboard
- [ ] Check disk usage on all nodes
- [ ] Review backup integrity

### Monthly
- [ ] Test restore procedures
- [ ] Update documentation
- [ ] Review and update services
- [ ] Clean up unused images/volumes

### Quarterly
- [ ] Full disaster recovery drill
- [ ] Security audit
- [ ] Update base images
- [ ] Review access controls

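The daily log check can be scripted. The sketch below assumes the log path used in the cron entries (`/var/log/homelab-backup.log`) and assumes that failures show up as lines containing words such as "error", "fail", or "denied"; the function name `check_backup_log` and the search pattern are hypothetical and should be adjusted to what the backup scripts actually print.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print log lines that look like failures.
# Returns 0 if none are found, 1 if any are found, 2 if the log cannot be read.
check_backup_log() {
  local log="${1:-/var/log/homelab-backup.log}"

  [ -r "$log" ] || { echo "cannot read log: $log" >&2; return 2; }

  # -i: case-insensitive, -n: prefix matches with line numbers, -E: extended regex.
  if grep -inE 'error|fail|denied' "$log"; then
    return 1
  fi
  return 0
}
```

Running `check_backup_log` each morning prints any suspicious lines with their line numbers, and the exit status makes it easy to wire into a notification script.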
---

*Infrastructure Management Guide - February 2026*