# GitHub Actions Self-Hosted Runners Deployment Plan

## Overview

**Goal:** Deploy GitHub Actions self-hosted runners on your Docker Swarm cluster to run CI/CD workflows with unlimited minutes, custom environments, and access to your homelab resources.

**Architecture:** Docker-based runners deployed as replicated Swarm services, scaled by changing replica counts (Swarm has no built-in autoscaler).

---

## Architecture Decision

### Option 1: Docker Container Runners (Recommended for your setup)
- ✅ Runs in Docker containers on your existing cluster
- ✅ Scales horizontally by adding/removing containers
- ✅ Uses your existing infrastructure (tpi-n1, tpi-n2, node-nas)
- ✅ Easy to manage through Docker Swarm
- ✅ ARM64 and x86_64 support for multi-arch builds

### Option 2: VM/Physical Runners (Alternative)
- Runners installed directly on VMs or bare metal
- More isolated but harder to manage
- Not recommended for your containerized setup

**Decision:** Use Docker Container Runners (Option 1) with multi-arch support.

---

## Deployment Architecture

```
GitHub Repository
        │
        │ Webhook/REST API
        ▼
┌─────────────────────────────┐
│  GitHub Actions Service     │
└─────────────────────────────┘
        │
        │ Job Request
        ▼
┌─────────────────────────────┐
│  Your Docker Swarm Cluster  │
│                             │
│  ┌─────────────────────┐    │
│  │  Runner Service     │    │
│  │  (Multiple Replicas)│    │
│  │                     │    │
│  │  ┌─────┐  ┌──────┐  │    │
│  │  │ARM64│  │x86_64│  │    │
│  │  └─────┘  └──────┘  │    │
│  └─────────────────────┘    │
│                             │
│  ┌─────────────────────┐    │
│  │  Docker-in-Docker   │    │
│  │  (for Docker builds)│    │
│  └─────────────────────┘    │
│                             │
└─────────────────────────────┘
```

---

## Phase 1: Planning & Preparation

### Step 1: Determine Requirements

**Use Cases:**
- [ ] Build and test applications
- [ ] Deploy to your homelab (Kubernetes/Docker Swarm)
- [ ] Run ARM64 builds (for Raspberry Pi/ARM apps)
- [ ] Run x86_64 builds (standard applications)
- [ ] Access private network resources (databases, internal APIs)
- [ ] Build Docker images and push to your Gitea registry

**Resource Requirements per Runner:**
- CPU: 2+ cores recommended
- Memory: 4GB+ RAM per runner
- Disk: 20GB+ for workspace and Docker layers
- Network: Outbound HTTPS to GitHub

**Current Cluster Capacity:**
- tpi-n1: 8 cores ARM64, 8GB RAM (Manager)
- tpi-n2: 8 cores ARM64, 8GB RAM (Worker)
- node-nas: 2 cores x86_64, 8GB RAM (Storage)

**Recommended Allocation:**
- 2 runners on tpi-n1 (ARM64)
- 2 runners on tpi-n2 (ARM64)
- 1 runner on node-nas (x86_64)
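
The allocation above can be sanity-checked with a little arithmetic. The following is a minimal sketch, not part of the original plan: `max_runners` and its optional reserve argument are hypothetical names, and the function simply takes the smaller of the CPU-bound and memory-bound runner counts, using the 2-core / 4 GB per-runner figures listed earlier.

```shell
#!/usr/bin/env bash
# Sketch only: estimate how many runners fit on a node.
# Per-runner minimums are taken from the requirements listed above.
CPU_PER_RUNNER=2
MEM_PER_RUNNER_GB=4

# max_runners CORES MEM_GB [RESERVE_GB]
# RESERVE_GB (memory kept back for the OS and other services) is an assumption.
max_runners() {
  local cores="$1" mem_gb="$2" reserve_gb="${3:-0}"
  local by_cpu=$(( cores / CPU_PER_RUNNER ))
  local by_mem=$(( (mem_gb - reserve_gb) / MEM_PER_RUNNER_GB ))
  if [ "$by_mem" -lt 0 ]; then by_mem=0; fi
  if [ "$by_cpu" -lt "$by_mem" ]; then echo "$by_cpu"; else echo "$by_mem"; fi
}

echo "tpi-n1:   $(max_runners 8 8)"
echo "tpi-n2:   $(max_runners 8 8)"
echo "node-nas: $(max_runners 2 8)"
```

With no reserve this reproduces the 2 / 2 / 1 split. Calling `max_runners 8 8 1` returns 1, which suggests the ARM64 nodes have little memory headroom at two runners each, so it may be worth starting with one runner per node and measuring.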

### Step 2: GitHub Configuration

**Choose Runner Level:**
- [ ] **Repository-level** - Dedicated to specific repo (recommended to start)
- [ ] **Organization-level** - Shared across org repos
- [ ] **Enterprise-level** - Shared across enterprise

**For your use case:** Start with **repository-level** runners, then expand to organization-level if needed.

**Required GitHub Settings:**
1. Go to: `Settings > Actions > Runners > New self-hosted runner`
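2. Note the **Registration Token** (expires after 1 hour). It is only needed for manual registration; the stack in Step 7 passes a personal access token as `ACCESS_TOKEN`, which the runner image uses to request registration tokens itself
3. Note the **Runner Group** (default: "Default")
4. Configure labels (e.g., `homelab`, `arm64`, `x86_64`); GitHub adds `self-hosted` automatically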
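
Registration tokens can also be requested from the command line through the GitHub REST API, which avoids copying them out of the UI before they expire. This is a minimal sketch under stated assumptions: the function names are hypothetical, `GITHUB_TOKEN` is a token you supply with permission to manage runners, and the owner and repository arguments are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: request a short-lived runner registration token over the REST API.
# GITHUB_TOKEN and the owner/repo arguments are placeholders you supply.

# registration_token_url SCOPE OWNER [REPO]  ->  prints the API endpoint
registration_token_url() {
  local scope="$1" owner="$2" repo="${3:-}"
  case "$scope" in
    repo) echo "https://api.github.com/repos/${owner}/${repo}/actions/runners/registration-token" ;;
    org)  echo "https://api.github.com/orgs/${owner}/actions/runners/registration-token" ;;
    *)    echo "unknown scope: ${scope}" >&2; return 1 ;;
  esac
}

# request_registration_token SCOPE OWNER [REPO]  ->  prints the raw JSON response
request_registration_token() {
  curl -fsS -X POST \
    -H "Accept: application/vnd.github+json" \
    -H "Authorization: Bearer ${GITHUB_TOKEN:?set GITHUB_TOKEN first}" \
    "$(registration_token_url "$@")"
}

# Example (not run automatically):
#   request_registration_token repo your-github-username your-repository-name
```

The JSON response carries the token in its `token` field along with an `expires_at` timestamp.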

---

## Phase 2: Infrastructure Setup

### Step 3: Create Docker Network

```bash
# On controller (tpi-n1)
ssh ubuntu@192.168.2.130

# Create overlay network for runners
docker network create --driver overlay --attachable github-runners-network

# Verify
docker network ls | grep github
```

### Step 4: Create Persistent Storage

```bash
# Create volume for runner cache (local volumes exist per node, so only runners on the same node share it)
docker volume create github-runner-cache

# Create volume for Docker build cache
docker volume create github-runner-docker-cache
```

### Step 5: Prepare Node Labels

```bash
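# Verify node labels (docker node ls cannot print labels, so inspect each node)
ssh ubuntu@192.168.2.130
docker node inspect --format '{{ .Description.Hostname }} {{ .Spec.Labels }}' $(docker node ls -q)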

# Expected output:
# tpi-n1      map[infra:true role:storage storage:high]
# tpi-n2      map[role:compute]
# node-nas    map[type:nas]

# Add architecture labels if missing:
docker node update --label-add arch=arm64 tpi-n1
docker node update --label-add arch=arm64 tpi-n2
docker node update --label-add arch=x86_64 node-nas
```
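
Instead of typing the architecture labels by hand, they can be derived from what each node reports about itself. The sketch below is one possible approach: `arch_label` and `label_all_nodes` are hypothetical helper names, and the mapping assumes the `arm64` / `x86_64` label values used in this plan.

```shell
#!/usr/bin/env bash
# Sketch only: derive the arch label from the architecture each node reports.

# arch_label MACHINE  ->  prints the label value used in this plan
arch_label() {
  case "$1" in
    aarch64|arm64) echo "arm64" ;;
    x86_64|amd64)  echo "x86_64" ;;
    *)             echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# label_all_nodes: run on a Swarm manager (defined here, not invoked).
label_all_nodes() {
  local node machine label
  for node in $(docker node ls --format '{{.Hostname}}'); do
    machine=$(docker node inspect --format '{{.Description.Platform.Architecture}}' "$node")
    # Skip nodes whose architecture this plan does not cover
    label=$(arch_label "$machine") || continue
    docker node update --label-add "arch=${label}" "$node"
  done
}
```

Running `label_all_nodes` on the manager should leave the same labels as the three manual commands above, and it keeps working if nodes are added later.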

---

## Phase 3: Runner Deployment

### Step 6: Create Environment File

Create `.env` file:
```bash
# GitHub Configuration
GITHUB_TOKEN=your_github_personal_access_token
GITHUB_OWNER=your-github-username-or-org
GITHUB_REPO=your-repository-name  # Leave empty for org-level

# Runner Configuration
RUNNER_NAME_PREFIX=homelab
RUNNER_LABELS=self-hosted,homelab,linux
RUNNER_GROUP=Default

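# Docker Configuration (only relevant to a docker:dind sidecar; the stack in Step 7 mounts the host socket and does not reference it)
DOCKER_TLS_CERTDIR=/certs

# Optional: Pre-installed tools (not referenced by the stack in Step 7; a custom image would have to read it)
PRE_INSTALL_TOOLS="docker-compose,nodejs,npm,yarn,python3,pip,git"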
```
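
Because `docker stack deploy` substitutes these values silently, an empty variable only shows up later as a runner that fails to register. A small pre-flight check can catch that earlier. This is a sketch, not part of the original plan: `missing_env_vars` is a hypothetical helper, and the list of required names is an assumption based on the variables the `.env` file defines for repository-level runners.

```shell
#!/usr/bin/env bash
# Sketch only: fail fast if the .env file is missing a value the stack needs.

# missing_env_vars FILE  ->  prints the names of required variables that are unset or empty
missing_env_vars() {
  local file="$1"
  (
    required="GITHUB_TOKEN GITHUB_OWNER RUNNER_NAME_PREFIX RUNNER_LABELS"
    # Ignore values inherited from the calling shell so only the file is checked
    unset $required
    . "$file"
    for name in $required; do
      eval "value=\${$name:-}"
      [ -n "$value" ] || echo "$name"
    done
  )
}

# Example (not run automatically):
#   missing=$(missing_env_vars ./.env)
#   [ -z "$missing" ] || { echo "Missing in .env: $missing"; exit 1; }
```

`GITHUB_REPO` is left out of the required list on purpose, since the file allows it to be empty for organization-level runners.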

### Step 7: Create Docker Compose Stack

Create `github-runners-stack.yml`:

```yaml
version: "3.8"

services:
  # ARM64 Runners
  runner-arm64:
    image: myoung34/github-runner:latest
    environment:
      - ACCESS_TOKEN=${GITHUB_TOKEN}
      - REPO_URL=https://github.com/${GITHUB_OWNER}/${GITHUB_REPO}
      - RUNNER_NAME=${RUNNER_NAME_PREFIX}-arm64-{{.Task.Slot}}
      - RUNNER_WORKDIR=/tmp/runner-work
      - RUNNER_GROUP=${RUNNER_GROUP:-Default}
      - RUNNER_SCOPE=repo
      - LABELS=${RUNNER_LABELS},arm64
      - DISABLE_AUTO_UPDATE=true
      - EPHEMERAL=true  # One job per container
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - github-runner-cache:/home/runner/cache
      - github-runner-docker-cache:/var/lib/docker
    networks:
      - github-runners-network
      - dokploy-network
    deploy:
      mode: replicated
      replicas: 2
      placement:
        constraints:
          - node.labels.arch == arm64
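      restart_policy:
        condition: any
        delay: 5s
        # No max_attempts: ephemeral runners exit after every job, and a restart cap may leave the slot empty after a few jobs
    # `privileged` is ignored by `docker stack deploy`; Docker access comes from the host socket mounted above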

  # x86_64 Runners
  runner-x86_64:
    image: myoung34/github-runner:latest
    environment:
      - ACCESS_TOKEN=${GITHUB_TOKEN}
      - REPO_URL=https://github.com/${GITHUB_OWNER}/${GITHUB_REPO}
      - RUNNER_NAME=${RUNNER_NAME_PREFIX}-x86_64-{{.Task.Slot}}
      - RUNNER_WORKDIR=/tmp/runner-work
      - RUNNER_GROUP=${RUNNER_GROUP:-Default}
      - RUNNER_SCOPE=repo
      - LABELS=${RUNNER_LABELS},x86_64
      - DISABLE_AUTO_UPDATE=true
      - EPHEMERAL=true
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - github-runner-cache:/home/runner/cache
      - github-runner-docker-cache:/var/lib/docker
    networks:
      - github-runners-network
      - dokploy-network
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.labels.arch == x86_64
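      restart_policy:
        condition: any
        delay: 5s
        # No max_attempts: ephemeral runners exit after every job, and a restart cap may leave the slot empty after a few jobs
    # `privileged` is ignored by `docker stack deploy`; Docker access comes from the host socket mounted above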

  # Optional: Runner Autoscaler
  # Actions Runner Controller (ARC) is a Kubernetes operator and does not run
  # as a Swarm service, so no autoscaler service is defined in this stack.
  # Change capacity manually instead, for example:
  #   docker service scale github-runners_runner-arm64=3

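volumes:
  # Created in Step 4; marking them external keeps the names free of the stack prefix
  github-runner-cache:
    external: true
  github-runner-docker-cache:
    external: true

networks:
  # Created in Step 3; without external: true the stack would create a second, prefixed network
  github-runners-network:
    external: true
  dokploy-network:
    external: true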
```

### Step 8: Deploy Runners

```bash
# Copy files to controller
scp github-runners-stack.yml ubuntu@192.168.2.130:~/
scp .env ubuntu@192.168.2.130:~/

# SSH to controller
ssh ubuntu@192.168.2.130

# Load environment
set -a && source .env && set +a

# Deploy stack
docker stack deploy -c github-runners-stack.yml github-runners

# Verify deployment
docker stack ps github-runners
docker service ls | grep github
```
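
Tasks can take a while to pull the image and register, so a single `docker stack ps` right after deploying may be misleading. The sketch below polls until every service reports all replicas running. The helper names are hypothetical, and it assumes the plain `running/desired` form (for example `2/2`) that `docker stack services` prints for replicated services.

```shell
#!/usr/bin/env bash
# Sketch only: decide whether a service has all of its replicas running.

# replicas_ready "RUNNING/DESIRED"  ->  exit 0 when running equals desired and desired > 0
replicas_ready() {
  local running="${1%%/*}"
  local desired="${1##*/}"
  [ "$desired" -gt 0 ] && [ "$running" -eq "$desired" ]
}

# wait_for_stack STACK [TIMEOUT_SECONDS]: poll on a manager node (defined here, not invoked).
wait_for_stack() {
  local stack="$1" timeout="${2:-120}" waited=0 pending count replicas
  while [ "$waited" -lt "$timeout" ]; do
    pending=0
    count=0
    for replicas in $(docker stack services "$stack" --format '{{.Replicas}}'); do
      count=$(( count + 1 ))
      replicas_ready "$replicas" || pending=1
    done
    # Ready only if at least one service exists and none is still pending
    if [ "$count" -gt 0 ] && [ "$pending" -eq 0 ]; then return 0; fi
    sleep 5
    waited=$(( waited + 5 ))
  done
  return 1
}

# Example (not run automatically):
#   wait_for_stack github-runners 180 || docker stack ps github-runners --no-trunc
```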

---

## Phase 4: GitHub Integration

### Step 9: Verify Runners in GitHub

1. Go to: `https://github.com/[OWNER]/[REPO]/settings/actions/runners`
2. You should see your runners listed as "Idle"
3. Labels should show: `self-hosted`, `homelab`, `linux`, `arm64` or `x86_64`

### Step 10: Test with Sample Workflow

Create `.github/workflows/test-self-hosted.yml`:

```yaml
name: Test Self-Hosted Runners

on:
  push:
    branches: [ main ]
  workflow_dispatch:

jobs:
  test-arm64:
    runs-on: [self-hosted, homelab, arm64]
    steps:
      - uses: actions/checkout@v4
      
      - name: Show runner info
        run: |
          echo "Architecture: $(uname -m)"
          echo "OS: $(uname -s)"
          echo "Node: $(hostname)"
          echo "CPU: $(nproc)"
          echo "Memory: $(free -h | grep Mem)"
      
      - name: Test Docker
        run: |
          docker --version
          docker info
          docker run --rm hello-world

  test-x86_64:
    runs-on: [self-hosted, homelab, x86_64]
    steps:
      - uses: actions/checkout@v4
      
      - name: Show runner info
        run: |
          echo "Architecture: $(uname -m)"
          echo "OS: $(uname -s)"
          echo "Node: $(hostname)"
      
      - name: Test access to homelab
        run: |
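          # Test connectivity to your services (-f fails on HTTP errors, -o /dev/null hides the response body)
          curl -fsS -o /dev/null --max-time 10 http://gitea.bendtstudio.com:3000 || echo "Gitea not accessible"
          curl -fsS -o /dev/null --max-time 10 http://192.168.2.130:3000 || echo "Dokploy not accessible"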
```

---

## Phase 5: Security Hardening

### Step 11: Implement Security Best Practices

**1. Use Short-Lived Tokens:**
```bash
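# Prefer a GitHub App (if your runner image supports it) or a fine-grained PAT limited to this repository over a classic PAT
# Note: OpenID Connect (OIDC) lets workflows authenticate to cloud providers; it does not register runners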
```

**2. Restrict Runner Permissions:**
```yaml
# Add to workflow
jobs:
  build:
    runs-on: [self-hosted, homelab]
    permissions:
      contents: read
      packages: write  # Only if pushing to registry
```

**3. Network Isolation:**
```yaml
# Modify stack to use isolated network
networks:
  github-runners-network:
    driver: overlay
    internal: true  # Blocks all egress on this network; runners still need a second network or a proxy to reach github.com
```

**4. Resource Limits:**
```yaml
# Add to service definition in stack
deploy:
  resources:
    limits:
      cpus: '2'
      memory: 4G
    reservations:
      cpus: '1'
      memory: 2G
```

### Step 12: Enable Ephemeral Mode

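Ephemeral runners (already configured with `EPHEMERAL=true`) provide better security:
- Each runner handles only one job
- The runner deregisters and its container exits after the job; Swarm then starts a replacement task
- Fresh container filesystem for every build
- Reduces credential leakage between jobs, although the mounted Docker socket and the shared cache volumes still persist across jobs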

---

## Phase 6: Monitoring & Maintenance

### Step 13: Set Up Monitoring

**Create monitoring script** (`monitor-runners.sh`):
```bash
#!/bin/bash

# Check runner status
echo "=== Docker Service Status ==="
docker service ls | grep github-runner

echo -e "\n=== Runner Containers ==="
docker ps --filter name=github-runner --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"

echo -e "\n=== Recent Logs ==="
docker service logs github-runners_runner-arm64 --tail 50
docker service logs github-runners_runner-x86_64 --tail 50

echo -e "\n=== Resource Usage ==="
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep github-runner
```

**Create cron job for monitoring:**
```bash
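# The ubuntu user cannot create files in /var/log, so create the log file once
sudo touch /var/log/github-runners.log
sudo chown ubuntu:ubuntu /var/log/github-runners.log

# Add to crontab
crontab -e

# Check runner health every 5 minutes
*/5 * * * * /home/ubuntu/github-runners/monitor-runners.sh >> /var/log/github-runners.log 2>&1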
```
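
The Docker-side checks show whether containers are running, but not whether GitHub sees the runners. The runner list returned by the REST API can fill that gap. The following is a sketch under stated assumptions: `count_runners_by_status` is a hypothetical helper that reads the JSON from `GET /repos/OWNER/REPO/actions/runners` on stdin and counts entries with a given `status` value, using `grep` so that no JSON tooling is required.

```shell
#!/usr/bin/env bash
# Sketch only: count runners that GitHub reports with a given status.
# Reads the JSON from "GET /repos/OWNER/REPO/actions/runners" on stdin.

# count_runners_by_status STATUS   (STATUS is "online" or "offline")
count_runners_by_status() {
  grep -o "\"status\"[[:space:]]*:[[:space:]]*\"$1\"" | wc -l | tr -d '[:space:]'
}

# Example (not run automatically; GITHUB_TOKEN, OWNER and REPO are placeholders):
#   online=$(curl -fsS -H "Authorization: Bearer $GITHUB_TOKEN" \
#     "https://api.github.com/repos/OWNER/REPO/actions/runners" | count_runners_by_status online)
#   [ "$online" -ge 1 ] || echo "No online runners registered"
```

Ephemeral runners deregister after each job, so the count can dip briefly while Swarm starts the replacement task; alerting on a sustained zero is likely more useful than alerting on any drop.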

### Step 14: Set Up Log Rotation

```bash
# Create logrotate config
sudo tee /etc/logrotate.d/github-runners << EOF
/var/log/github-runners.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    create 644 ubuntu ubuntu
}
EOF
```

### Step 15: Backup Strategy

```bash
#!/bin/bash
# Backup script: copies the stack configuration and archives the runner volumes
BACKUP_DIR="/backup/github-runners/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Backup configuration (.env contains the GitHub token, so restrict access to the copy)
cp ~/github-runners-stack.yml "$BACKUP_DIR/"
cp ~/.env "$BACKUP_DIR/"
chmod 600 "$BACKUP_DIR/.env"

# Backup volumes
docker run --rm -v github-runner-cache:/data -v "$BACKUP_DIR":/backup alpine tar czf /backup/runner-cache.tar.gz -C /data .
docker run --rm -v github-runner-docker-cache:/data -v "$BACKUP_DIR":/backup alpine tar czf /backup/docker-cache.tar.gz -C /data .

echo "Backup completed: $BACKUP_DIR"
```
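
The backup script creates one dated directory per run and never removes old ones, so some retention rule is probably needed. This is a minimal sketch: `old_backups` is a hypothetical helper that relies on the `YYYYMMDD` directory names produced above and lists everything beyond the newest `KEEP` entries, leaving the actual deletion to the caller.

```shell
#!/usr/bin/env bash
# Sketch only: list dated backup directories beyond the newest KEEP.
# Assumes the YYYYMMDD directory names produced by the backup script above.

# old_backups ROOT KEEP  ->  prints directory names that could be deleted
old_backups() {
  local root="$1" keep="$2" dir
  for dir in "$root"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
    # The glob stays literal when nothing matches, so test before printing
    if [ -d "$dir" ]; then basename "$dir"; fi
  done | sort -r | tail -n "+$(( keep + 1 ))"
}

# Example (not run automatically): keep the seven newest backups
#   old_backups /backup/github-runners 7 | while read -r name; do
#     rm -rf "/backup/github-runners/${name}"
#   done
```

Sorting by name works here because the date format sorts chronologically; directories that do not match the eight-digit pattern are ignored.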

---

## Phase 7: Advanced Configuration

### Step 16: Cache Optimization

**Mount host cache directories** (bind-mount sources must already exist on every node that runs a runner):
```yaml
volumes:
  - /home/ubuntu/.cache/npm:/root/.npm
  - /home/ubuntu/.cache/pip:/root/.cache/pip
  - /home/ubuntu/.cache/go-build:/root/.cache/go-build
  - /home/ubuntu/.cargo:/root/.cargo
```

**Pre-install common tools in custom image** (`Dockerfile.runner`):
```dockerfile
FROM myoung34/github-runner:latest

# Install common build tools
RUN apt-get update && apt-get install -y \
    build-essential \
    nodejs \
    npm \
    python3 \
    python3-pip \
    golang-go \
    openjdk-17-jdk \
    maven \
    gradle \
    && rm -rf /var/lib/apt/lists/*

# Install Docker Compose (Compose v2 release assets use lowercase OS names such as "linux")
RUN curl -fsSL "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m)" \
    -o /usr/local/bin/docker-compose && \
    chmod +x /usr/local/bin/docker-compose

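# Images cannot be pre-pulled here: no Docker daemon is available during `docker build`.
# Pull node:lts-alpine and python:3.11-slim on each node instead (for example with `docker pull` over SSH).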
```

Build and use custom image:
```bash
docker build -t your-registry/github-runner:custom -f Dockerfile.runner .
docker push your-registry/github-runner:custom

# Update stack to use custom image
```

### Step 17: Autoscaling Configuration

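**Actions Runner Controller (ARC) provides Kubernetes-style autoscaling, but it is a Kubernetes operator and does not run on Docker Swarm.** The configuration below is illustrative only; on Swarm, change capacity with `docker service scale`.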

```yaml
# Reference only: ARC targets Kubernetes and will not work under `docker stack deploy`
autoscaler:
  image: ghcr.io/actions-runner-controller/actions-runner-controller:latest
  environment:
    - GITHUB_TOKEN=${GITHUB_TOKEN}
    - GITHUB_APP_ID=${GITHUB_APP_ID}
    - GITHUB_APP_INSTALLATION_ID=${GITHUB_APP_INSTALLATION_ID}
    - GITHUB_APP_PRIVATE_KEY=/etc/gh-app-key/private-key.pem
  volumes:
    - /path/to/private-key.pem:/etc/gh-app-key/private-key.pem:ro
    - /var/run/docker.sock:/var/run/docker.sock
  deploy:
    mode: replicated
    replicas: 1
    placement:
      constraints:
        - node.role == manager
```

### Step 18: Multi-Repository Setup

For organization-level runners, update environment:
```bash
# For org-level
RUNNER_SCOPE=org
ORG_NAME=your-organization

# Remove REPO_URL; with RUNNER_SCOPE=org the runner image builds the organization URL from ORG_NAME:
# https://github.com/${ORG_NAME}
```

---

## Phase 8: Troubleshooting Guide

### Common Issues & Solutions

**1. Runner shows "Offline" in GitHub:**
```bash
# Check logs
docker service logs github-runners_runner-arm64

# Common causes:
# - Expired token (regenerate in GitHub settings)
# - Network connectivity issue
docker exec <container> curl -I https://github.com

# Restart service
docker service update --force github-runners_runner-arm64
```

**2. Docker-in-Docker not working:**
```bash
# Ensure privileged mode is enabled
# Check Docker socket is mounted
docker exec <container> docker ps

# If failing, check AppArmor/SELinux
sudo aa-status | grep docker
```

**3. Jobs stuck in "Queued":**
```bash
# Check if runners are picking up jobs
docker service ps github-runners_runner-arm64

# Verify labels match (compare with the runs-on list in the workflow)
docker exec <container> printenv LABELS
```
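
A job stays queued when its `runs-on` list contains a label that no online runner carries. The check can be sketched as a subset test. `labels_match` is a hypothetical helper; it lowercases both lists on the assumption that GitHub compares labels case-insensitively.

```shell
#!/usr/bin/env bash
# Sketch only: check that every label a job asks for is present on a runner.

# labels_match WANTED HAVE  (both comma-separated)  ->  exit 0 if WANTED is a subset of HAVE
labels_match() {
  local wanted have label old_ifs
  wanted=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d ' ')
  # Wrap in commas so each label can be matched as a whole word
  have=",$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' | tr -d ' '),"
  old_ifs="$IFS"
  IFS=','
  for label in $wanted; do
    case "$have" in
      *",${label},"*) ;;
      *) IFS="$old_ifs"; echo "missing label: ${label}" >&2; return 1 ;;
    esac
  done
  IFS="$old_ifs"
}

# Example (not run automatically):
#   labels_match "self-hosted,homelab,arm64" "$(docker exec <container> printenv LABELS),self-hosted"
```

The example appends `self-hosted` by hand because GitHub adds that label at registration, so it may not appear in the container's `LABELS` variable.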

**4. Out of disk space:**
```bash
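# Clean up Docker system (WARNING: this also deletes unused images and volumes that belong to other stacks on this node)
docker system prune -a --volumes

# Clean runner cache (scale the runner services to 0 first; a volume that is in use cannot be removed)
docker volume rm github-runner-docker-cache
docker volume create github-runner-docker-cache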
```
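
Disk pressure is easier to handle before jobs start failing. The following sketch reads the usage percentage from `df` so it can be compared against a limit, for example from the monitoring cron job. The function names are hypothetical, and the 85% limit in the example is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Sketch only: warn before the Docker data directory fills up.

# disk_usage_pct PATH  ->  prints the used percentage of the filesystem holding PATH
disk_usage_pct() {
  # -P forces the portable single-line format; column 5 is the capacity, e.g. "42%"
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_above_threshold PATH LIMIT  ->  exit 0 when usage is at or above LIMIT percent
disk_above_threshold() {
  [ "$(disk_usage_pct "$1")" -ge "$2" ]
}

# Example (not run automatically): /var/lib/docker is Docker's default data root
#   disk_above_threshold /var/lib/docker 85 && echo "Low disk space on $(hostname)"
```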

---

## Implementation Checklist

### Phase 1: Planning
- [ ] Determine which repositories need self-hosted runners
- [ ] Decide on runner count per architecture
- [ ] Generate GitHub Personal Access Token

### Phase 2: Infrastructure
- [ ] Create Docker network
- [ ] Create persistent volumes
- [ ] Verify node labels

### Phase 3: Deployment
- [ ] Create `.env` file with GitHub token
- [ ] Create `github-runners-stack.yml`
- [ ] Deploy stack to Docker Swarm
- [ ] Verify runners appear in GitHub UI

### Phase 4: Testing
- [ ] Create test workflow
- [ ] Run test on ARM64 runner
- [ ] Run test on x86_64 runner
- [ ] Verify Docker builds work
- [ ] Test access to homelab services

### Phase 5: Security
- [ ] Enable ephemeral mode
- [ ] Set resource limits
- [ ] Review and restrict permissions
- [ ] Set up network isolation

### Phase 6: Operations
- [ ] Create monitoring script
- [ ] Set up log rotation
- [ ] Create backup script
- [ ] Document maintenance procedures

---

## Cost & Resource Analysis

**Compared to GitHub-hosted runners:**

| Feature | GitHub Hosted | Your Self-Hosted |
|---------|---------------|------------------|
| Cost | $0.008/minute Linux | Free (electricity) |
| Minutes | 2,000/month free (Free plan, private repos) | Unlimited |
| ARM64 | Limited | Full control |
| Concurrency | 20 jobs | Limited by runner count (5 in this plan) |
| Network | Internet only | Your homelab access |

**Your Infrastructure Cost:**
- Existing hardware: $0 (already running)
- Electricity: ~$10-20/month additional load
- Time: Initial setup ~2-4 hours
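
To put the electricity estimate in perspective, it can be converted into the number of paid hosted minutes it would buy. This is a rough sketch using only the figures above ($0.008 per Linux minute and $10-20 per month); `break_even_minutes` is a hypothetical helper and uses integer arithmetic.

```shell
#!/usr/bin/env bash
# Sketch only: how many paid hosted Linux minutes the electricity cost would buy.
# Uses the $0.008/minute rate from the table above; integer arithmetic.

# break_even_minutes DOLLARS_PER_MONTH  ->  minutes of hosted runtime at $0.008/minute
break_even_minutes() {
  echo $(( $1 * 1000 / 8 ))
}

echo "At \$10/month: $(break_even_minutes 10) minutes"
echo "At \$20/month: $(break_even_minutes 20) minutes"
```

These are paid minutes, so they come on top of any free monthly allowance. Below that volume, hosted runners may be cheaper on cost alone, and the case for self-hosting rests on ARM64 builds and homelab network access.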

---

## Next Steps

1. **Review this plan** and decide on your specific use cases
2. **Generate GitHub PAT** with the `repo` scope (add `admin:org` only for organization-level runners)
3. **Start with Phase 1** - Planning
4. **Deploy a single runner first** to test before scaling
5. **Iterate** based on your workflow needs
