Data model

StreamHub persists with better-sqlite3 (synchronous, simple, predictable — no separate database server to run). The model is deliberately decentralized, AntMedia-style: a minimal global database holds only cross-cutting identity and cluster-routing data, and each app owns its own SQLite file holding all of its state. This is both a clean tenancy boundary and the foundation for clustering — an app’s state travels with the app.

Two tiers

data/streamhub.db                          apps/<app>/app.db
  (global — cross-cutting only)              (per app — everything app-scoped)
  ├─ tenants                                 ├─ streams
  ├─ users                                   ├─ vods (+ metatags, snapshot_key)
  ├─ memberships                             └─ ingress_auth (RTMP key/password)
  ├─ quotas
  ├─ api_tokens                        apps.name → points at the app's directory
  ├─ nodes  (cluster registry)                        │
  ├─ apps   (pointer: name, tenant, node) ─────────────┘
  └─ server_logs
                                        apps/<app>/{recordings,hls,snapshots,samples}/
                                        apps/<app>/config.yaml

Global data/streamhub.db — identity and routing only, kept small and hot.
Per-app apps/<app>/app.db — the app’s config-adjacent state. Media artifacts (recordings/, hls/, snapshots/, samples/) and the app’s config.yaml live on disk under apps/<app>/; the database stores rows and pointers, not blobs.

Global schema (`data/streamhub.db`)

Table	Purpose
`apps`	App registry / pointer: `name` (unique), `display_name`, `livekit_room_prefix`, tenant, node, `settings_json`, timestamps. The row is a pointer — the app’s real state lives in its own `app.db`.
`api_tokens`	Bearer tokens (`sk_…`): `token_hash`, `scope` (`global` \| `app`), optional `app_id`, optional `allowed_ips_json` IP allowlist, `last_used_at`, `revoked`.
`tenants`	Top-level tenant / customer.
`users`	Dashboard identities.
`memberships`	user × tenant role bindings (RBAC).
`quotas`	Per-tenant limits: `maxApps`, `maxConcurrentStreams`, `maxRecordingMinutesMonth`, `maxEgressGbMonth`, `maxStorageGb` (`-1` = unlimited).
`nodes`	Cluster registry: nodes that have joined (endpoint/IP, role, health). Unused on a single node — present so adding an edge is a data change, not a schema change.
`server_logs`	Structured server log sink (also written to rotating files).
`_streamhub_meta`	Migration bookkeeping — schema version, split-migration markers, backup path.

Hot-path indices back the global registry so lookups stay fast under load. Column-add and tenancy-backfill migrations run idempotently on top of the base tables.

Per-app schema (`apps/<app>/app.db`)

Table	Purpose
`streams`	Live/finished streams: `stream_id` (unique), `type` (`webrtc` \| `rtmp` \| `rtsp` \| `whip`), `room`, `participant`, `status` (`active` \| `ended`), timings, `last_stats_json` (viewer count, etc.).
`vods`	Recordings: `stream_id`, `room`, `file_key`, `s3_url`, `public_url`, size/duration/dimensions/format, `status` (`recording` \| `uploading` \| `ready` \| `failed`), `local_path`, `metatags_json`, `snapshot_key`, timings.
`ingress_auth`	Per-app RTMP ingress credentials: stream key plus an optional password (feature-flagged).
`_streamhub_meta`	Per-app migration bookkeeping.

`DATA_DIR` layout

DATA_DIR is bind-mounted into both core and egress at the same path (see Services), and holds everything that isn’t a database row:

DATA_DIR/
  streamhub.db
  apps/
    <app>/
      app.db
      recordings/
      hls/
      snapshots/
      samples/
      config.yaml
  secrets.json          # per-app S3 credentials, referenced from each app's config.yaml

Per-app S3 credentials live in data/secrets.json, not in the database — config.yaml references them by key. This keeps credentials out of SQLite backups and query results.

The split migration (idempotent, with backup)

streams, vods and ingress_auth historically lived in the global database (or, further back, a legacy apps/<app>/vods.db). The move to app.db is handled by an automatic, idempotent migration that runs at core boot:

core boot (DbService.init)
  │
  ▼
per-app split already done? ──yes──▶ open handles, run pending column-adds, done
  │ no
  ▼
VACUUM INTO backup of global DB → streamhub.db.bak-<timestamp>
  │
  ▼
for each app: create app.db, copy its rows
  from global streams/vods + legacy vods.db (if present)
  │
  ▼
record marker in _streamhub_meta: per_app_split_backup = <bak path>
  │
  ▼
open handles, run pending column-adds, done

Properties:

Backup first. A single-file, consistent backup of the global database (VACUUM INTO) is taken before anything is touched, saved next to it as streamhub.db.bak-<timestamp>; the path is recorded in _streamhub_meta.
Idempotent. Re-running is safe — a marker in _streamhub_meta short-circuits an already-migrated database; CREATE TABLE IF NOT EXISTS plus copy-if-absent throughout.
Non-destructive. Any legacy apps/<app>/vods.db is imported on first open of that app’s app.db and left in place as a fallback backup; the global copies remain until a later cleanup.
Lazy per-app open. app.db handles are opened on demand and cached (one handle per app); opening an app runs its APP_MIGRATIONS and any legacy import.

Maintenance (PRAGMA optimize → ANALYZE → REINDEX → VACUUM → wal_checkpoint(TRUNCATE)) is exposed per database via the db-admin endpoints.

Why per-app (cluster rationale)

Because each app owns its full state in one file:

Tenancy isolation is physical — an app is a directory you can copy, back up, or move.
Cluster placement becomes trivial — to run an app on another node, move or replicate apps/<app>/ (database plus media refs); the global registry just updates the app→node pointer.
The global database stays minimal (identity plus routing), so it can remain the single small shared control-plane store even as the number of apps and nodes grows.

See Cluster for how this feeds the origin + edge design.