Data model
StreamHub persists with better-sqlite3 (synchronous, simple, predictable — no separate database server to run). The model is deliberately decentralized, AntMedia-style: a minimal global database holds only cross-cutting identity and cluster-routing data, and each app owns its own SQLite file holding all of its state. This is both a clean tenancy boundary and the foundation for clustering — an app’s state travels with the app.
Two tiers

    data/streamhub.db                          apps/<app>/app.db
    (global: cross-cutting only)               (per app: everything app-scoped)
    ├─ tenants                                 ├─ streams
    ├─ users                                   ├─ vods (+ metatags, snapshot_key)
    ├─ memberships                             └─ ingress_auth (RTMP key/password)
    ├─ quotas
    ├─ api_tokens                              apps.name → points at the app's directory
    ├─ nodes (cluster registry)                         │
    ├─ apps (pointer: name, tenant, node) ──────────────┘
    └─ server_logs                             apps/<app>/{recordings,hls,snapshots,samples}/
                                               apps/<app>/config.yaml

- Global `data/streamhub.db`: identity and routing only, kept small and hot.
- Per-app `apps/<app>/app.db`: the app’s config-adjacent state. Media artifacts (`recordings/`, `hls/`, `snapshots/`, `samples/`) and the app’s `config.yaml` live on disk under `apps/<app>/`; the database stores rows and pointers, not blobs.

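The two-tier layout can be expressed as a small path helper. This is only a sketch: `appPaths`, the `AppPaths` shape, and the slug rule for app names are illustrative assumptions, not StreamHub's actual API.

```typescript
// Hypothetical helper that mirrors the on-disk layout described above.
interface AppPaths {
  globalDb: string;
  appDir: string;
  appDb: string;
  config: string;
  media: Record<"recordings" | "hls" | "snapshots" | "samples", string>;
}

function appPaths(dataDir: string, app: string): AppPaths {
  // Assumed rule: app names are simple slugs, so a name can never escape apps/.
  if (!/^[A-Za-z0-9_-]+$/.test(app)) {
    throw new Error(`invalid app name: ${app}`);
  }
  const root = dataDir.replace(/\/+$/, ""); // drop trailing slashes
  const appDir = `${root}/apps/${app}`;
  return {
    globalDb: `${root}/streamhub.db`, // global tier: identity and routing
    appDir,
    appDb: `${appDir}/app.db`, // per-app tier: everything app-scoped
    config: `${appDir}/config.yaml`,
    media: {
      recordings: `${appDir}/recordings`,
      hls: `${appDir}/hls`,
      snapshots: `${appDir}/snapshots`,
      samples: `${appDir}/samples`,
    },
  };
}
```

Deriving every path from the app name keeps the "an app is a directory" property visible in code: nothing app-scoped lives outside `apps/<app>/`.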
Global schema (data/streamhub.db)
| Table | Purpose |
|---|---|
| `apps` | App registry / pointer: `name` (unique), `display_name`, `livekit_room_prefix`, `tenant`, `node`, `settings_json`, timestamps. The row is a pointer; the app’s real state lives in its own `app.db`. |
| `api_tokens` | Bearer tokens (`sk_…`): `token_hash`, `scope` (`global` \| `app`), optional `app_id`, optional `allowed_ips_json` IP allowlist, `last_used_at`, `revoked`. |
| `tenants` | Top-level tenant / customer. |
| `users` | Dashboard identities. |
| `memberships` | user × tenant role bindings (RBAC). |
| `quotas` | Per-tenant limits: `maxApps`, `maxConcurrentStreams`, `maxRecordingMinutesMonth`, `maxEgressGbMonth`, `maxStorageGb` (`-1` = unlimited). |
| `nodes` | Cluster registry: nodes that have joined (endpoint/IP, role, health). Unused on a single node; present so that adding an edge is a data change, not a schema change. |
| `server_logs` | Structured server log sink (also written to rotating files). |
| `_streamhub_meta` | Migration bookkeeping: schema version, split-migration markers, backup path. |

Hot-path indices back the global registry so lookups stay fast under load. Column-add and tenancy-backfill migrations run idempotently on top of the base tables.
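The `-1` = unlimited convention in `quotas` is easy to get wrong in comparisons, so here is a minimal sketch of a limit check. The field names come from the table above; `hasHeadroom`, `canStartStream`, and the idea of passing current usage as a number are assumptions made for illustration.

```typescript
// Field names follow the quotas table; everything else is hypothetical.
interface Quotas {
  maxApps: number;
  maxConcurrentStreams: number;
  maxRecordingMinutesMonth: number;
  maxEgressGbMonth: number;
  maxStorageGb: number;
}

const UNLIMITED = -1;

// True when `amount` more units still fit under the limit.
function hasHeadroom(limit: number, used: number, amount = 1): boolean {
  if (limit === UNLIMITED) return true; // -1 disables the check entirely
  return used + amount <= limit;
}

// Example check: may this tenant start one more concurrent stream?
function canStartStream(q: Quotas, activeStreams: number): boolean {
  return hasHeadroom(q.maxConcurrentStreams, activeStreams);
}
```

Handling the sentinel in one helper avoids the classic bug where `used < limit` is evaluated against `-1` and silently blocks everything.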
Per-app schema (apps/<app>/app.db)
| Table | Purpose |
|---|---|
| `streams` | Live/finished streams: `stream_id` (unique), `type` (`webrtc` \| `rtmp` \| `rtsp` \| `whip`), `room`, `participant`, `status` (`active` \| `ended`), timings, `last_stats_json` (viewer count, etc.). |
| `vods` | Recordings: `stream_id`, `room`, `file_key`, `s3_url`, `public_url`, size/duration/dimensions/format, `status` (`recording` \| `uploading` \| `ready` \| `failed`), `local_path`, `metatags_json`, `snapshot_key`, timings. |
| `ingress_auth` | Per-app RTMP ingress credentials: stream key plus an optional password (feature-flagged). |
| `_streamhub_meta` | Per-app migration bookkeeping. |

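The enumerated columns above map naturally onto TypeScript union types. The sketch below assumes nothing about StreamHub's real type definitions; the guard and parser names are hypothetical, and the `viewers` key mentioned in the comment is an invented example of what `last_stats_json` might contain.

```typescript
// Allowed values copied from the per-app schema table.
const STREAM_TYPES = ["webrtc", "rtmp", "rtsp", "whip"] as const;
const STREAM_STATUSES = ["active", "ended"] as const;
const VOD_STATUSES = ["recording", "uploading", "ready", "failed"] as const;

type StreamType = (typeof STREAM_TYPES)[number];
type StreamStatus = (typeof STREAM_STATUSES)[number];
type VodStatus = (typeof VOD_STATUSES)[number];

// Narrow an untyped column value read from SQLite.
function isStreamType(v: string): v is StreamType {
  return (STREAM_TYPES as readonly string[]).indexOf(v) !== -1;
}

function isVodStatus(v: string): v is VodStatus {
  return (VOD_STATUSES as readonly string[]).indexOf(v) !== -1;
}

// JSON columns are stored as text; parse defensively and fall back to {}.
// A key such as "viewers" is a made-up example, not a documented field.
function parseLastStats(json: string | null): Record<string, unknown> {
  if (!json) return {};
  try {
    const v: unknown = JSON.parse(json);
    return typeof v === "object" && v !== null && !Array.isArray(v)
      ? (v as Record<string, unknown>)
      : {};
  } catch {
    return {};
  }
}
```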
DATA_DIR layout
`DATA_DIR` is bind-mounted into both core and egress at the same path (see Services). It holds the database files and everything that isn’t a database row:

    DATA_DIR/
      streamhub.db
      apps/
        <app>/
          app.db
          recordings/
          hls/
          snapshots/
          samples/
          config.yaml
      secrets.json   # per-app S3 credentials, referenced from each app's config.yaml

Per-app S3 credentials live in `data/secrets.json`, not in the database; `config.yaml` references them by key. This keeps credentials out of SQLite backups and query results.
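To illustrate the "reference by key" indirection, here is a minimal lookup sketch. The shape of `secrets.json` is not specified here, so the `Secrets` type (a map from key to an access key pair), the field names, and `resolveS3Credentials` are all assumptions.

```typescript
// Assumed shape for secrets.json: a map from key to S3 credentials.
interface S3Credentials {
  accessKeyId: string;
  secretAccessKey: string;
}
type Secrets = Record<string, S3Credentials>;

// Resolve the key found in an app's config.yaml against the parsed secrets file.
function resolveS3Credentials(secrets: Secrets, key: string): S3Credentials {
  const creds = Object.prototype.hasOwnProperty.call(secrets, key)
    ? secrets[key]
    : undefined;
  if (!creds) {
    // The error names only the key, never the secret values.
    throw new Error(`no S3 credentials stored under key "${key}"`);
  }
  return creds;
}
```

Because the database and `config.yaml` only ever carry the key, a copied `app.db` or a shared config file does not leak credentials.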
The split migration (idempotent, with backup)
`streams`, `vods` and `ingress_auth` historically lived in the global database (or, further back, in a legacy `apps/<app>/vods.db`). The move to `app.db` is handled by an automatic, idempotent migration that runs at core boot:

    core boot (DbService.init)
            │
            ▼
    per-app split already done? ──yes──▶ open handles, run pending column-adds, done
            │ no
            ▼
    VACUUM INTO backup of global DB → streamhub.db.bak-<timestamp>
            │
            ▼
    for each app: create app.db, copy its rows from global
    streams/vods + legacy vods.db (if present)
            │
            ▼
    record marker in _streamhub_meta: per_app_split_backup = <bak path>
            │
            ▼
    open handles, run pending column-adds, done

Properties:
- Backup first. A single-file, consistent backup of the global database (`VACUUM INTO`) is taken before anything is touched and saved next to it as `streamhub.db.bak-<timestamp>`; the path is recorded in `_streamhub_meta`.
- Idempotent. Re-running is safe: a marker in `_streamhub_meta` short-circuits an already-migrated database, and the migration uses `CREATE TABLE IF NOT EXISTS` plus copy-if-absent throughout.
- Non-destructive. Any legacy `apps/<app>/vods.db` is imported on first open of that app’s `app.db` and left in place as a fallback backup; the global copies remain until a later cleanup.
- Lazy per-app open. `app.db` handles are opened on demand and cached (one handle per app); opening an app runs its `APP_MIGRATIONS` and any legacy import.

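The control flow of the split migration can be sketched without a real database by injecting the database work as callbacks. Only the marker name `per_app_split_backup` comes from the description above; `SplitSteps`, `runSplitMigration`, and the use of a `Map` in place of `_streamhub_meta` are hypothetical stand-ins.

```typescript
// Injected steps stand in for the real database work (all names hypothetical).
interface SplitSteps {
  backupGlobal(): string; // e.g. VACUUM INTO; returns the backup path
  listApps(): string[];
  copyAppRows(app: string): void; // copy-if-absent into that app's app.db
}

const SPLIT_MARKER = "per_app_split_backup";

// `meta` models the _streamhub_meta key/value rows.
function runSplitMigration(
  meta: Map<string, string>,
  steps: SplitSteps,
): "skipped" | "migrated" {
  // Idempotent: an existing marker short-circuits the whole migration.
  if (meta.has(SPLIT_MARKER)) return "skipped";

  // Backup first: nothing is copied before the backup exists.
  const backupPath = steps.backupGlobal();

  for (const app of steps.listApps()) {
    steps.copyAppRows(app);
  }

  // The marker is written last, so an interrupted run is simply retried.
  meta.set(SPLIT_MARKER, backupPath);
  return "migrated";
}
```

Writing the marker only after all copies finish means a crash mid-run leaves no marker behind. The next boot then repeats the work, which is safe as long as each copy step is copy-if-absent, as described above.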
Maintenance (PRAGMA optimize → ANALYZE → REINDEX → VACUUM → wal_checkpoint(TRUNCATE))
is exposed per database via the db-admin endpoints.
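The maintenance sequence is plain SQLite, so it can be captured as an ordered list of statements. The sketch below assumes a caller-supplied `exec` callback that runs one SQL statement against an open database; `runMaintenance` is a hypothetical name and says nothing about how the db-admin endpoints are actually implemented.

```typescript
// Statements in the order given above; all are standard SQLite.
const MAINTENANCE_STEPS = [
  "PRAGMA optimize",
  "ANALYZE",
  "REINDEX",
  "VACUUM",
  "PRAGMA wal_checkpoint(TRUNCATE)",
] as const;

// Run each step in order through the supplied executor and report what ran.
// If a step throws, later steps are not attempted.
function runMaintenance(exec: (sql: string) => void): string[] {
  const done: string[] = [];
  for (const sql of MAINTENANCE_STEPS) {
    exec(sql);
    done.push(sql);
  }
  return done;
}
```

Ending with `wal_checkpoint(TRUNCATE)` matters because `VACUUM` in WAL mode writes its changes through the write-ahead log; the final checkpoint folds them into the main file and truncates the log.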
Why per-app (cluster rationale)
Because each app owns its full state in one file:
- Tenancy isolation is physical: an app is a directory you can copy, back up, or move.
- Cluster placement becomes trivial: to run an app on another node, move or replicate `apps/<app>/` (database plus media refs); the global registry just updates the app→node pointer.
- The global database stays minimal (identity plus routing), so it can remain the single small shared control-plane store even as the number of apps and nodes grows.

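As a rough illustration of the app→node pointer update, the sketch below models the global `apps` registry as an in-memory map. The `AppPointer` fields follow the "pointer: name, tenant, node" description; `reassignApp` is a hypothetical function, and moving the files under `apps/<app>/` is assumed to happen separately.

```typescript
// Minimal model of a row in the global apps table (pointer fields only).
interface AppPointer {
  name: string;
  tenant: string;
  node: string;
}

// Repoint an app at a different node; the app's own state is untouched.
function reassignApp(
  registry: Map<string, AppPointer>,
  name: string,
  node: string,
): AppPointer {
  const current = registry.get(name);
  if (!current) throw new Error(`unknown app: ${name}`);
  const next: AppPointer = { ...current, node };
  registry.set(name, next);
  return next;
}
```

The only global write is a single field on a single row, which is what lets the shared control-plane store stay small as apps move between nodes.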
See Cluster for how this feeds the origin + edge design.