ProDG Mainframe — Architecture Document
Version: 1.0.0
Date: 2026-04-27
Author: Hermes Agent (prodg-mainframe deployment)
Classification: Internal — CEO/CIO Eyes Only
1. Executive Summary
The ProDG Mainframe is a self-hosted, containerized infrastructure platform serving as the primary orchestration and data layer for ProDG Studio operations. It runs on a Hetzner VPS (65.109.89.215) and provides:
Identity & Network: Self-hosted Headscale Tailnet with Caddy TLS termination
Secrets Management: Infisical (self-hosted) + Vaultwarden (password vault)
Data Layer: PostgreSQL 16, Redis 7, MinIO S3-compatible object storage
Observability: Prometheus, Grafana, Loki, Promtail, node-exporter, cadvisor
Agent Orchestration: Hermes API (FastAPI) with 3-tier trust model
Offsite Backups: Backblaze B2 via rclone, daily automated
2. Host Specifications
Attribute        Value
---------        -----
Host             mainframe.prodg.studio
Public IP        65.109.89.215
OS               Ubuntu 24.04.3 LTS
Kernel           6.8.0-100-generic
Disk             436 GB RAID (19G used, 396G free)
Docker Network   compose_prodg-internal (172.19.0.0/16)
Tailscale IP     100.64.0.1
SSH Keys         mitch-laptop (ed25519), Mainframe key
3. Service Inventory
3.1 Core Infrastructure
Service     Container  Image                       Internal Port         Host Binding    External Domain
-------     ---------  -----                       -------------         ------------    ---------------
Caddy       caddy      caddy:2                     80, 443, 2019         0.0.0.0         All TLS domains
Headscale   headscale  headscale/headscale:0.23.0  8080, 9090, 3478/udp  0.0.0.0         headscale.prodg.studio
PostgreSQL  postgres   postgres:16-alpine          5432                  127.0.0.1       -
Redis       redis      redis:7-alpine              6379                  127.0.0.1       -
3.2 Application Services
Service     Container    Image                                Internal Port  Host Binding    External Domain
-------     ---------    -----                                -------------  ------------    ---------------
Infisical   infisical    infisical/infisical:latest-postgres  8080           127.0.0.1:8082  secrets.mainframe.prodg.studio
Vaultwarden vaultwarden  vaultwarden/server:latest            80             127.0.0.1:8083  vault.mainframe.prodg.studio
MinIO       minio        minio/minio:latest                   9000, 9001     127.0.0.1       s3.mainframe.prodg.studio
Hermes API  hermes-api   prodg/hermes-api:latest              8000           127.0.0.1       api.mainframe.prodg.studio
3.3 Observability Stack
Service        Container      Image                             Internal Port  Host Binding       External Domain
-------        ---------      -----                             -------------  ------------       ---------------
Prometheus     prometheus     prom/prometheus:latest            9090           - (internal only)  -
Grafana        grafana        grafana/grafana:latest            3000           127.0.0.1          metrics.mainframe.prodg.studio
Loki           loki           grafana/loki:latest               3100           -                  -
Promtail       promtail       grafana/promtail:latest           9080           -                  -
node-exporter  node-exporter  prom/node-exporter:latest         9100           -                  -
cadvisor       cadvisor       gcr.io/cadvisor/cadvisor:latest   8080           -                  -
4. Network Architecture
4.1 Public Ingress (Caddy)
Internet → Cloudflare DNS (grey cloud) → 65.109.89.215:80/443 → Caddy → Internal Docker Network
TLS for all public domains terminates at Caddy, which obtains its certificates from Let’s Encrypt via ACME HTTP-01 challenges.
4.2 Tailnet (Headscale)
Tailscale Clients → headscale.prodg.studio:443 → Caddy → Headscale:8080
Control plane: HTTPS via Caddy reverse proxy
DERP/STUN: UDP 3478 direct from Headscale container
Metrics: HTTP 9090 (scraped by Prometheus internally)
4.3 Docker Internal Network
Network: compose_prodg-internal (bridge, 172.19.0.0/16)
Services communicate via container names as hostnames
Host-bound ports (127.0.0.1) are NOT exposed to the internet
Only Caddy (80, 443) and Headscale (8080, 9090, 3478) bind to 0.0.0.0
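
The exposure rule above can be checked mechanically. The following is a minimal sketch, assuming the per-service host bindings from the Section 3 tables are copied into a dictionary by hand; the names HOST_BINDINGS and publicly_bound are illustrative and not part of the deployment.

```python
import ipaddress

# Host bindings copied from the Section 3 service tables.
# The dictionary layout and names are assumptions made for this sketch.
HOST_BINDINGS = {
    "caddy": "0.0.0.0",
    "headscale": "0.0.0.0",
    "postgres": "127.0.0.1",
    "redis": "127.0.0.1",
    "infisical": "127.0.0.1",
    "vaultwarden": "127.0.0.1",
    "minio": "127.0.0.1",
    "hermes-api": "127.0.0.1",
    "grafana": "127.0.0.1",
}


def publicly_bound(bindings):
    """Return the services whose host binding is not a loopback address."""
    exposed = []
    for service, address in bindings.items():
        if not ipaddress.ip_address(address).is_loopback:
            exposed.append(service)
    return sorted(exposed)
```

Calling publicly_bound(HOST_BINDINGS) should list only caddy and headscale; any other name in the result would indicate a binding that contradicts the rule.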
5. DNS Records (Cloudflare)
Record                     Type  Value          Proxy
------                     ----  -----          -----
headscale.prodg.studio     A     65.109.89.215  DNS-only (grey cloud)
*.mainframe.prodg.studio   A     65.109.89.215  DNS-only (grey cloud)
Critical: Orange-cloud (proxied) mode MUST remain disabled for all infrastructure records. Caddy requires direct IP reachability for ACME HTTP-01 challenges.
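
One way to detect an accidentally proxied record is to confirm that a hostname resolves to the origin IP itself. This is a rough sketch under that assumption; resolves_directly and lookup_ipv4 are hypothetical helper names, and lookup_ipv4 performs a live DNS query when called.

```python
import socket

ORIGIN_IP = "65.109.89.215"  # public IP from Section 2


def resolves_directly(resolved_ips, origin_ip=ORIGIN_IP):
    """True when at least one address resolved and all of them equal the origin IP.

    A proxied (orange-cloud) record would resolve to the proxy's
    addresses instead of the origin.
    """
    return bool(resolved_ips) and all(ip == origin_ip for ip in resolved_ips)


def lookup_ipv4(hostname):
    """Resolve a hostname to its IPv4 addresses (live DNS query)."""
    _, _, addresses = socket.gethostbyname_ex(hostname)
    return addresses
```

A periodic check could call resolves_directly(lookup_ipv4("headscale.prodg.studio")) and alert when it returns False.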
6. Certificate Management
Provider: Let’s Encrypt (staging for test, production for live)
Automation: Caddy handles issuance and renewal automatically
Storage: /opt/prodg/data/caddy/ (persistent volume)
Domains secured: All 7 public endpoints
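
Although Caddy renews certificates on its own, an independent expiry check can catch a silent renewal failure. The sketch below only computes the days remaining from a certificate's notAfter string, in the format returned by Python's ssl.SSLSocket.getpeercert(); the function name and any alert threshold are assumptions for illustration.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Whole days between `now` and a certificate notAfter timestamp.

    `not_after` uses the form "Jul 26 00:00:00 2026 GMT", as found in
    the dict returned by ssl.SSLSocket.getpeercert().
    """
    now = now or datetime.now(timezone.utc)
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days
```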
7. Secrets Architecture
7.1 Current State (Transitional)
Secrets are stored in /opt/prodg/compose/.env (600 permissions, root-only). Planned migration to Infisical (Phase 9) will eliminate this file.
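
Until the migration happens, the permission claim above can be verified with a small audit. This is a minimal sketch, assuming the file should be mode 600 and owned by uid 0 (root); audit_env_file is a hypothetical helper, not an existing script on the host.

```python
import os
import stat


def audit_env_file(path, expected_uid=0, expected_mode=0o600):
    """Return a list of findings; an empty list means the file matches."""
    info = os.stat(path)
    findings = []
    mode = stat.S_IMODE(info.st_mode)  # permission bits only
    if mode != expected_mode:
        findings.append(f"mode is {mode:o}, expected {expected_mode:o}")
    if info.st_uid != expected_uid:
        findings.append(f"owner uid is {info.st_uid}, expected {expected_uid}")
    return findings
```

For example, audit_env_file("/opt/prodg/compose/.env") returns an empty list when the file is root-owned with 600 permissions.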
7.2 Key Secrets
Secret                    Location  Purpose
------                    --------  -------
POSTGRES_PASSWORD         .env      Database authentication
REDIS_PASSWORD            .env      Redis AUTH
INFISICAL_ENCRYPTION_KEY  .env      Infisical data encryption
INFISICAL_AUTH_SECRET     .env      Infisical session/JWT
VAULTWARDEN_ADMIN_TOKEN   .env      Vaultwarden admin panel
MINIO_ROOT_PASSWORD       .env      MinIO root credentials
GRAFANA_ADMIN_PASSWORD    .env      Grafana login
TELEGRAM_BOT_TOKEN        .env      Alert notifications
HERMES_API_TOKEN          .env      API authentication
B2_KEY_SECRET             .env      Backblaze B2 application key
8. Backup Architecture
8.1 Backup Scope
Target           Method             Frequency        Retention  Destination
------           ------             ---------        ---------  -----------
PostgreSQL       pg_dumpall + gzip  Daily 03:00 UTC  30 days    B2 MainframeBackup/postgres/
MinIO data       tar archive        Daily 03:00 UTC  30 days    B2 MainframeBackup/minio/
Compose configs  tar archive        Daily 03:00 UTC  90 days    B2 MainframeBackup/configs/
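
The retention column can be read as a pruning rule. The sketch below shows one possible interpretation, assuming a backup becomes eligible for deletion once it is strictly older than its retention window; the names RETENTION_DAYS and expired are illustrative, and the actual backup scripts may prune differently.

```python
from datetime import date, timedelta

# Retention windows taken from the backup scope table above.
RETENTION_DAYS = {"postgres": 30, "minio": 30, "configs": 90}


def expired(backups, today):
    """Return the (target, backup_date) pairs older than their retention window.

    Assumption: a backup taken exactly RETENTION_DAYS ago is still kept.
    """
    stale = []
    for target, taken in backups:
        if today - taken > timedelta(days=RETENTION_DAYS[target]):
            stale.append((target, taken))
    return stale
```

With today set to 2026-04-27, a PostgreSQL dump from 2026-03-01 would be reported as expired, while a config archive from the same day would be kept.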
8.2 Automation
Tool: rclone v1.60.1
Scheduler: Hermes cron job (job_id: 44256d53266e)
Notification: Telegram group on success/failure
Log: /var/log/prodg-backup.log
9. Agent Trust Tiers
Tier  Name       Runtime                         Capabilities
----  ----       -------                         ------------
T1    Internal   Host / Docker socket            Full infra, orchestration, dispatch
T2    Trusted    On-box containers               Research, safe inference
T3    Untrusted  Remote Tailscale nodes / Modal  Burst inference, untrusted code
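
The tier table maps naturally onto a lookup that an orchestrator could consult before dispatching work. The following is a sketch only: the enum, the capability strings, and is_allowed are assumptions made for illustration and do not describe the actual Hermes API code. It treats each tier's capability set as independent, since the table does not state that tiers inherit from one another.

```python
from enum import Enum


class Tier(Enum):
    T1_INTERNAL = "T1"
    T2_TRUSTED = "T2"
    T3_UNTRUSTED = "T3"


# Capability labels derived from the table above (hypothetical identifiers).
CAPABILITIES = {
    Tier.T1_INTERNAL: {"full_infra", "orchestration", "dispatch"},
    Tier.T2_TRUSTED: {"research", "safe_inference"},
    Tier.T3_UNTRUSTED: {"burst_inference", "untrusted_code"},
}


def is_allowed(tier, capability):
    """True when the capability is listed for exactly this tier."""
    return capability in CAPABILITIES[tier]
```

Under this mapping, is_allowed(Tier.T3_UNTRUSTED, "dispatch") is False, so a remote node could not trigger dispatch.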
10. File System Layout
/opt/prodg/
├── backups/
│   ├── postgres/                 # Local PG dumps
│   ├── scripts/
│   │   ├── backup-all.sh
│   │   ├── backup-postgres.sh
│   │   ├── backup-minio.sh
│   │   ├── backup-configs.sh
│   │   └── update-rclone-conf.sh
│   └── .phase*.env               # Phase environment files (legacy)
├── compose/
│   ├── docker-compose.yml        # Stack definition
│   ├── Caddyfile                 # Reverse proxy rules
│   ├── .env                      # Consolidated secrets
│   ├── prometheus/
│   │   └── prometheus.yml
│   ├── grafana/
│   │   └── provisioning/
│   │       ├── datasources/
│   │       │   ├── prometheus.yml
│   │       │   └── loki.yml
│   │       └── alerting/
│   │           ├── contactpoints.yml
│   │           ├── notificationpolicies.yml
│   │           └── rules.yml
│   ├── loki/
│   │   └── loki.yml
│   ├── promtail/
│   │   └── promtail.yml
│   └── headscale/
│       └── config/
│           └── config.yaml
├── data/
│   ├── caddy/                    # TLS certificates
│   ├── caddy-config/
│   ├── grafana/                  # Dashboards + SQLite
│   ├── loki/                     # Log chunks + index
│   ├── minio/                    # Object storage data
│   ├── postgres/                 # PostgreSQL data
│   ├── redis/                    # Redis AOF + RDB
│   └── vaultwarden/              # Password vault data
├── hermes-api/
│   ├── Dockerfile
│   ├── .dockerignore
│   └── app/
│       └── main.py               # FastAPI orchestrator
└── scripts/
    └── postgres-init/            # DB initialization scripts
11. Prometheus Scrape Targets (All UP)
Job                Target              Endpoint
---                ------              --------
caddy              caddy:2019          /metrics
cadvisor           cadvisor:8080       /metrics
grafana            grafana:3000        /metrics
headscale-metrics  headscale:9090      /metrics
hermes-api         hermes-api:8000     /metrics
loki               loki:3100           /metrics
node-exporter      node-exporter:9100  /metrics
prometheus         localhost:9090      /metrics
promtail           promtail:9080       /metrics
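
The "All UP" status can be re-verified at any time through the Prometheus HTTP API, whose /api/v1/targets endpoint reports a health field per active target. The sketch below assumes Prometheus is reachable at http://prometheus:9090 from inside the Docker network; that base URL and the helper names are assumptions.

```python
import json
import urllib.request


def down_targets(payload):
    """Return (job, scrape_url) for every active target whose health is not 'up'."""
    targets = payload["data"]["activeTargets"]
    return [
        (t["labels"]["job"], t["scrapeUrl"])
        for t in targets
        if t["health"] != "up"
    ]


def fetch_targets(base_url="http://prometheus:9090"):
    """Fetch the targets document from Prometheus (live HTTP request)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets") as resp:
        return json.load(resp)
```

An empty result from down_targets(fetch_targets()) corresponds to the state recorded in the table.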
12. Known Issues & Technical Debt
Caddyfile formatting warning (logged by Caddy): non-blocking. Message: Caddyfile input is not formatted; run 'caddy fmt --overwrite'
Grafana plugin installer errors: non-blocking. Permission denied on the bundled plugin directory
Caddy admin API on 0.0.0.0:2019: required for Prometheus scraping; mitigated by Docker network isolation
Infisical/Vaultwarden running as root: should migrate to the prodg service user (Phase 9)
Modal dispatch is a stub: requires Modal SDK integration for production use
rclone.conf stored on host: auto-generated from .env; will be migrated to Infisical secret injection
13. Firewall (UFW)
Status: active
To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere
80/tcp                     ALLOW       Anywhere
443/tcp                    ALLOW       Anywhere
8080/tcp                   ALLOW       Anywhere
Headscale ports 8080/tcp (control plane) and 3478/udp (DERP/STUN) are directly exposed. The Prometheus port 9090 is not published on the host.
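
To compare the firewall rules against the expected exposure, the status listing above can be parsed into (port, protocol) pairs. This is a minimal sketch that handles only simple "port/proto ALLOW ..." rows like the ones shown; allowed_ports is a hypothetical helper and does not cover every form of ufw output.

```python
UFW_STATUS = """\
Status: active
To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere
80/tcp                     ALLOW       Anywhere
443/tcp                    ALLOW       Anywhere
8080/tcp                   ALLOW       Anywhere
"""


def allowed_ports(status_text):
    """Extract (port, protocol) pairs from simple ALLOW rows."""
    rules = []
    for line in status_text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[1] == "ALLOW" and "/" in parts[0]:
            port, proto = parts[0].split("/", 1)
            if port.isdigit():
                rules.append((int(port), proto))
    return rules
```

Checking (3478, "udp") in allowed_ports(UFW_STATUS) is one way to see whether a given port has an explicit UFW rule in the listing.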
Document Version: 1.0.0 — Generated by Hermes Agent