Runbook
This content is for the 1.0 version. Switch to the latest version for up-to-date documentation.
Commands below assume you’re on the host. Substitute your own domain for
streamhub.example.com and your own token for $STREAMHUB_API_TOKEN.
Start / stop / restart
Docker Compose

    docker compose ps                      # status of all services
    docker compose up -d                   # start / reconcile
    docker compose restart core            # restart just the brain
    docker compose logs -f core            # tail core logs
    docker compose logs -f livekit egress  # tail the media stack
    docker compose down                    # stop all (keeps volumes/data)
    docker compose up -d --build           # rebuild + restart after a code change

.env changes are only fully re-read on down + up -d (a recreate), not on restart.

systemd (plain-server)

    systemctl status streamhub-core livekit
    systemctl restart streamhub-core   # after a rebuild (npm run build)
    journalctl -u streamhub-core -f
    docker restart ingress egress      # media workers are containers

Health

    # liveness (public, no auth)
    curl -s https://streamhub.example.com/api/v1/health

    # server stats (auth): CPU/mem/disk, uptime, version, LiveKit reachability,
    # active streams/rooms, app count, egress/ingress status
    curl -s -H "Authorization: Bearer $STREAMHUB_API_TOKEN" \
      https://streamhub.example.com/api/v1/stats

    # is core actually up locally?
    curl -s http://127.0.0.1:3020/api/v1/health

If /health is fine but the site 502s, the problem is the reverse proxy or TLS, not core. If
LiveKit shows unreachable in /stats, check the livekit service and redis.
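
The local-versus-public comparison described above can be scripted. The following is a minimal sketch, not part of the product: the function names `diagnose_health` and `http_code` are made up for illustration, and it assumes the health endpoint answers with HTTP 200 when core is healthy.

```shell
# Classify a pair of HTTP status codes: local core vs. the public URL.
# Pure function, so it can be exercised without touching the network.
# Assumption: a healthy /api/v1/health answers with HTTP 200.
diagnose_health() {
  local_code=$1
  public_code=$2
  if [ "$local_code" != "200" ]; then
    echo "core is down or unhealthy locally (HTTP $local_code)"
  elif [ "$public_code" != "200" ]; then
    echo "core is fine; check the reverse proxy / TLS (public HTTP $public_code)"
  else
    echo "healthy"
  fi
}

# Print only the HTTP status code of a URL (curl prints 000 if the connection fails).
http_code() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1" || true
}

# Usage:
#   diagnose_health \
#     "$(http_code http://127.0.0.1:3020/api/v1/health)" \
#     "$(http_code https://streamhub.example.com/api/v1/health)"
```

Keeping the classification separate from the curl call makes it easy to reuse the same logic in a cron job or monitoring hook.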
Metrics

    curl -s http://127.0.0.1:3020/metrics | grep streamhub_

    # with a token set (METRICS_TOKEN):
    curl -s -H "Authorization: Bearer $METRICS_TOKEN" http://127.0.0.1:3020/metrics

/metrics lives at the root path (not under /api/v1) and is public unless
METRICS_TOKEN is set. Full catalog and Prometheus/Grafana setup: see
Observability.
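
For quick checks without Prometheus, a small wrapper can handle both the public and the token-protected case and pull out a single metric. This is a sketch under assumptions: `fetch_metrics` and `metric_lines` are hypothetical helper names, and the endpoint is assumed to emit the standard Prometheus text format.

```shell
# Fetch /metrics, adding the bearer header only when METRICS_TOKEN is set.
fetch_metrics() {
  if [ -n "${METRICS_TOKEN:-}" ]; then
    curl -s -H "Authorization: Bearer $METRICS_TOKEN" "$1"
  else
    curl -s "$1"
  fi
}

# Read Prometheus text format on stdin and print the sample lines of exactly
# one metric, skipping "# HELP" / "# TYPE" comment lines.
metric_lines() {
  awk -v name="$1" '
    /^#/ { next }
    index($0, name "{") == 1 || index($0, name " ") == 1 { print }
  '
}

# Usage:
#   fetch_metrics http://127.0.0.1:3020/metrics | metric_lines streamhub_upload_queue_depth
```

Matching on the name followed by `{` or a space avoids picking up other metrics that merely share a prefix, which a plain `grep` would include.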
Database health & optimize
Per-app and global maintenance via the db-admin endpoints (auth required):

    # health snapshot (page count, size, WAL, integrity) for an app DB
    curl -s -H "Authorization: Bearer $STREAMHUB_API_TOKEN" \
      https://streamhub.example.com/api/v1/apps/<app>/db/health

    # optimize an app DB: PRAGMA optimize → ANALYZE → REINDEX → VACUUM →
    # wal_checkpoint(TRUNCATE). Returns before/after sizes (reclaimed bytes).
    curl -s -X POST -H "Authorization: Bearer $STREAMHUB_API_TOKEN" \
      https://streamhub.example.com/api/v1/apps/<app>/db/optimize

    # global DB health / optimize
    curl -s -H "Authorization: Bearer $STREAMHUB_API_TOKEN" \
      https://streamhub.example.com/api/v1/system/db/health

Run optimize after large deletes/purges to reclaim space and shrink the -wal file. It’s
online (no close/reopen needed) but VACUUM briefly locks; prefer a low-traffic window.
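
After a bulk purge it can be convenient to optimize several app databases in sequence. The sketch below is illustrative only: `optimize_apps` and the `STREAMHUB_URL` variable are hypothetical names, and the app names are passed in by the caller. It runs the calls one at a time so that the brief VACUUM locks do not overlap.

```shell
# POST db/optimize for each app name given as an argument, one at a time.
# Stops at the first failure so a bad token or a typo is noticed early.
# STREAMHUB_URL is a hypothetical override; it defaults to the example domain.
optimize_apps() {
  base=${STREAMHUB_URL:-https://streamhub.example.com}
  for app in "$@"; do
    echo "optimizing $app" >&2
    curl -sf -X POST \
      -H "Authorization: Bearer $STREAMHUB_API_TOKEN" \
      "$base/api/v1/apps/$app/db/optimize" || return 1
    echo
  done
}

# Usage (app names are placeholders):
#   optimize_apps my-first-app my-second-app
```

The `-f` flag makes curl return a non-zero exit status on HTTP errors, which is what lets the loop stop early.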
What’s on disk
Everything durable lives under DATA_DIR:

    $DATA_DIR/
      data/streamhub.db     # global registry (+ .bak-<ts> from the split migration)
      data/secrets.json     # S3 credentials (chmod 600)
      apps/<app>/app.db     # per-app state
      apps/<app>/{recordings,hls,snapshots,samples}/
      logs/
      sdk/

VODs already live in each app’s S3 bucket; local recordings/ are transient. For the actual
backup/restore procedure (which handles the SQLite files consistently), see
Backups.
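
Since secrets.json is expected to be mode 600, a quick permission check can be part of routine inspection. This is a sketch with a hypothetical helper name (`check_mode_600`); it assumes GNU `stat`, as found on most Linux hosts.

```shell
# Succeed only if the file's permission bits are exactly 600.
# Assumes GNU stat (-c); on BSD/macOS the equivalent is `stat -f %Lp`.
check_mode_600() {
  mode=$(stat -c '%a' "$1") || return 2
  if [ "$mode" = "600" ]; then
    echo "ok: $1 is 600"
  else
    echo "warning: $1 is $mode, expected 600" >&2
    return 1
  fi
}

# Usage:
#   check_mode_600 "$DATA_DIR/data/secrets.json"
#   du -sh "$DATA_DIR"/apps/*/     # per-app disk usage
```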
Rollback (bad deploy)
Docker Compose (redeploy the previous image/commit):

    git checkout <previous-good-ref>
    docker compose up -d --build
    docker compose logs -f core

systemd: check out the prior build, run npm ci && npm run build, then
systemctl restart streamhub-core.
Migrations are forward-only but additive/idempotent (CREATE … IF NOT EXISTS, column
adds, copy-if-absent). A newer schema generally stays compatible with the prior code; if a new
build fails to boot, restore DATA_DIR from backup before starting the older build to be safe.
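
To confirm that a rolled-back build actually booted, one option is to poll the local health endpoint after the restart. The helper below is a sketch, not something shipped with the project: `wait_for_health` and its default attempt and delay values are assumptions.

```shell
# Poll a health URL until it answers successfully or the attempts run out.
# Arguments: URL, max attempts (default 30), delay in seconds (default 2).
wait_for_health() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempts" >&2
  return 1
}

# Usage, after `docker compose up -d --build` or `systemctl restart streamhub-core`:
#   wait_for_health http://127.0.0.1:3020/api/v1/health \
#     || echo "core did not come up; consider restoring DATA_DIR from backup"
```

Polling the loopback address rather than the public domain keeps the reverse proxy and TLS out of the picture, so a failure here points at core itself.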
Common issues
| Symptom | Look at |
|---|---|
| Site 502 but 127.0.0.1:3020/health ok | reverse proxy / TLS (Caddy or nginx+certbot) |
| WebRTC connects then no media | 7882/udp firewall; STUN external-IP (host networking); Cloudflare proxying the domain (must be DNS-only) |
| RTMP push refused | 1935 firewall; ingress container up; stream key/password (ingress_auth) |
| Recording never becomes ready | egress container (headless Chrome, needs --shm-size); S3 creds in secrets.json; check streamhub_upload_queue_depth, streamhub_recording_failures_total |
| Callbacks not arriving | app callbacks.url/secret; check streamhub_callbacks_total{result="failed"} |
| DB file growing | run db/optimize (WAL checkpoint + VACUUM) |
| LiveKit not starting | check systemctl status livekit / docker compose logs livekit; confirm livekit.yaml keys match LIVEKIT_API_KEY/LIVEKIT_API_SECRET in .env; port 7880 (signaling) and 7882/udp (media) not already bound |
| Cert renewal failing (systemd path) | certbot.timer active (systemctl list-timers); certbot renew --dry-run; nginx config valid (nginx -t) before reload |
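
For the "LiveKit not starting" row, the port check can be done by filtering the output of `ss`. The filter below is a sketch: `port_bound` is a hypothetical helper, and it assumes the column layout printed by `ss -Hlntu` from iproute2 (protocol in column 1, local address:port in column 5).

```shell
# Read `ss -Hlntu` output on stdin; succeed if a socket of the given protocol
# (tcp or udp) is bound to the given local port.
port_bound() {
  awk -v proto="$1" -v port="$2" '
    $1 == proto {
      n = split($5, parts, ":")        # last ":"-separated piece is the port
      if (parts[n] == port) found = 1
    }
    END { exit found ? 0 : 1 }
  '
}

# Usage (run while LiveKit is stopped, to see whether something else holds the port):
#   ss -Hlntu | port_bound tcp 7880 && echo "7880/tcp is already bound"
#   ss -Hlntu | port_bound udp 7882 && echo "7882/udp is already bound"
```

Adding `-p` to `ss` (as root) shows which process owns the socket, which is usually the next thing to look at.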