Distribution — scaling playback

This content is for the 1.0 version. Switch to the latest version for up-to-date documentation.

Cluster covers scaling the media plane across nodes. This page covers a different problem: once a stream needs to reach thousands or hundreds of thousands of viewers, adding more WebRTC nodes stops being the answer. This is the reality check that shapes StreamHub’s distribution design.

The physics of the problem

Path	Cost per viewer	Cacheable by a CDN	Latency	Natural scale
WebRTC (SFU)	High — each viewer is a subscription the SFU serves individually (crypto/pacing CPU + NIC)	Never	~0.2 s	hundreds–thousands per node
HLS served by the node	Low CPU, but the NIC is the ceiling	Yes (plain HTTP)	6–15 s	thousands per node
HLS + CDN	Near zero for the origin — the CDN pays for fan-out	Yes	6–15 s (LL-HLS ~3 s, not available today)	effectively unlimited, cost scales linearly with GB
HLS + P2P	Viewers serve each other; origin/CDN covers the rest	Yes, plus a viewer mesh	HLS latency + 1–2 segments	unlimited, cost scales sub-linearly

The number that drives the whole design: 100,000 viewers at 2.5 Mbps (720p) is 250 Gbps sustained — roughly 112 TB/hour. No single node, and no cluster reasonably sized for a small team, serves that directly. The only architecture that works at that scale is: SFU for the interactive core + cacheable HLS for the mass tail + CDN and/or P2P paying for the fan-out. That’s the same reality check as Cluster, applied here to configurable per-app modes.
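
The arithmetic behind that figure can be reproduced in a few lines. This is a minimal sketch: the function names are illustrative, and it assumes decimal units (1 Gbps = 1,000 Mbps, 1 TB = 1,000 GB) and a constant bitrate per viewer.

```python
def aggregate_gbps(viewers: int, bitrate_mbps: float) -> float:
    """Sustained egress in Gbps if every viewer pulls one stream."""
    return viewers * bitrate_mbps / 1000


def terabytes_per_hour(gbps: float) -> float:
    """Convert a sustained rate in Gbps to TB transferred per hour."""
    gigabytes_per_second = gbps / 8
    return gigabytes_per_second * 3600 / 1000


gbps = aggregate_gbps(100_000, 2.5)
print(gbps, terabytes_per_hour(gbps))
```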

Publisher ──▶ Origin SFU (LiveKit)
                 │
                 ├──▶ WebRTC (~0.2s) ──▶ Interactive viewers (edges, moderate scale)
                 │
                 └──▶ egress HLS ──▶ /hls/<app>/<room>/index.m3u8
                                        │
                                        ▼
                                     CDN (pull) ──▶ Mass viewers (10k–100k+)
                                        ▲
                                        │  P2P data channels
                                        └──── 60–80% of the traffic between viewers

The three modes below (cdn, edge, p2p) are not mutually exclusive: hybrid (CDN plus P2P on top) is the mode for large events.

WebRTC viewer ceiling — the NIC, not the CPU

On the current single-node profile (8c/8GB):

Path	Dominant limit	1 Gbps node	10 Gbps node (8c)
WebRTC, subscribers at 2.5 Mbps	NIC first; CPU (crypto/pacing/SFU state) caps around 1–1.5k subs on 8 cores	~300–400	~1,000–1,500 (CPU-bound)
WebRTC, subscribers at 1 Mbps (simulcast lower layer)	same	~800	~1,500
Static HLS at 2.5 Mbps	NIC (CPU is trivial — it’s `sendfile`)	~320	~3,500–4,000

Fan-out from origin to edges is cheap (one copy of the ladder per edge); the cost is edge → viewers. Scaling without a CDN means adding nodes linearly and balancing viewers across them. For HLS specifically, edges don’t need to run their own egress (each egress is a Chrome process, 1–2 cores) — the egress runs once, on the room’s origin node, and edges just proxy_cache the HTTP output. 10,000 viewers at 2.5 Mbps is 25 Gbps — roughly 3–4 well-placed 10G nodes, and only if the audience is regional. Multi-region without a CDN means a fleet plus geo-DNS plus on-call — not a small-team operation.
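
A rough way to size a CDN-less fleet from the figures above. This is a sketch under stated assumptions: the 0.8 NIC utilization factor is a guess that reproduces the ~320 figure for a 1 Gbps node (the 10 Gbps figures in the table imply a factor closer to 0.9–1.0), the optional CPU cap is taken from the 1–1.5k range quoted for 8 cores, and both function names are hypothetical.

```python
import math
from typing import Optional


def viewers_per_node(nic_gbps: float, bitrate_mbps: float,
                     utilization: float = 0.8,
                     cpu_cap: Optional[int] = None) -> int:
    """Viewers one node can serve: NIC bound, optionally capped by CPU."""
    nic_bound = int(nic_gbps * 1000 * utilization / bitrate_mbps)
    return nic_bound if cpu_cap is None else min(nic_bound, cpu_cap)


def nodes_needed(viewers: int, nic_gbps: float, bitrate_mbps: float,
                 utilization: float = 0.8,
                 cpu_cap: Optional[int] = None) -> int:
    """Nodes required if viewers are balanced evenly across the fleet."""
    per_node = viewers_per_node(nic_gbps, bitrate_mbps, utilization, cpu_cap)
    return math.ceil(viewers / per_node)


# Static HLS, 10,000 viewers at 2.5 Mbps on 10G nodes.
print(nodes_needed(10_000, 10, 2.5))                   # 80% NIC utilization
print(nodes_needed(10_000, 10, 2.5, utilization=1.0))  # best case, no headroom
```

For WebRTC on the same hardware, passing `cpu_cap=1500` models the CPU ceiling, which is why the interactive path needs noticeably more nodes for the same audience.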

When plain edges are enough: CCTV/monitoring (tens of viewers, sub-second latency — WebRTC direct, works today), interactive classrooms/auctions/live-shopping up to a few thousand viewers, closed-network/sovereignty requirements where a CDN isn’t an option, or HLS up to roughly 2–5k regional viewers with 1–2 edges.

HLS + CDN (the default path past a few thousand viewers)

StreamHub already produces live HLS on disk with CDN-friendly headers: .m3u8 served no-cache, .ts segments served immutable with long cache, open CORS, at the public path /hls/<app>/<room>/index.m3u8. Because the playlist references segments with relative URIs, a CDN in pull mode in front of the host works with zero code changes — the player requests everything from the CDN’s domain and the relative URIs resolve against that same domain.
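
The relative-URI behavior is standard URL resolution, which can be checked with the Python standard library. In this sketch the CDN hostname, the segment file name and the exact `Cache-Control` values are assumptions for illustration; the source only states "no-cache" for playlists and "immutable with long cache" for segments.

```python
from urllib.parse import urljoin


def resolve_segment(playlist_url: str, segment_uri: str) -> str:
    """Resolve a segment URI the way a player does: against the playlist URL."""
    return urljoin(playlist_url, segment_uri)


def cache_control(path: str) -> str:
    """Illustrative cache policy by file type (header values are assumed)."""
    if path.endswith(".m3u8"):
        return "no-cache"  # live playlists change every segment
    if path.endswith(".ts"):
        return "public, max-age=31536000, immutable"  # segments never change
    return "no-store"  # assumption: nothing else is meant to be cached


playlist = "https://cdn.example.com/hls/demo/room1/index.m3u8"
print(resolve_segment(playlist, "seg_00042.ts"))
```

Because the segment inherits the host of the playlist, pointing the player at the CDN domain is enough to move all segment traffic there.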

egress (Chrome → segments) → DATA_DIR/apps/<app>/hls
                                     │
                                     ▼
                     core :3020 (static mount, m3u8 no-cache, ts immutable)
                                     │
                     origin pull ────┼──── origin pull ────┬──── origin pull
                                     ▼                      ▼                ▼
                                CDN POP (US)          CDN POP (EU)     CDN POP (LATAM)
                                                        │
                                                        ▼
                                                     Viewers

Pull vs. push:

	Pull (recommended default)	Push (upload HLS to S3, CDN in front of the bucket)
Setup	Create a pull-zone/distribution with origin = `PUBLIC_BASE_URL`. Zero code.	Upload every segment to the bucket as it’s written. Needs new code.
Extra latency	~0 (CDN fetches on cache miss)	+0.5–1 segment duration
Coupling to the node	The node must stay up and reachable — it’s the origin	Viewers decoupled from the node; it can go down and the CDN keeps serving the buffered window
When	Default for live	Critical 24/7 streams, a small-NIC origin, or multi-CDN

With Origin Shield and request collapsing (both offered by CloudFront/Fastly/Bunny), a single 8c/8GB node can act as origin for a 100k-viewer event: the origin serves roughly one copy of each segment to the shield (about 0.3 MB/s for a 2.5 Mbps rendition) while the CDN carries the 250 Gbps.
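
Why the origin load stays flat can be shown with a toy model. This sketch is a plain sequential cache, not real request collapsing (which merges concurrent in-flight requests), and it only models immutable segments; the class name and paths are hypothetical. The effect it illustrates is the same: one origin fetch per distinct segment, regardless of audience size.

```python
from typing import Callable, Dict


class ShieldCache:
    """Toy stand-in for a CDN shield: one origin fetch per distinct object."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0

    def get(self, path: str) -> bytes:
        if path not in self._store:
            # Cache miss: this is the only time the origin is contacted.
            self.origin_requests += 1
            self._store[path] = self._fetch(path)
        return self._store[path]


shield = ShieldCache(lambda path: b"segment-bytes")
for _ in range(10_000):  # 10,000 viewers asking for the same segment
    shield.get("/hls/demo/room1/seg_00042.ts")
print(shield.origin_requests)
```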

CDN pricing (list price, 2026):

CDN	$/GB (first tier)	Request fees	LL-HLS	Notes
Bunny	$0.005–0.01	none	pass-through	Best $/GB of the mainstream options; pull-zone in 5 minutes; Origin Shield. Default recommendation.
Cloudflare	bundled flat plan	none	yes	Cheap at low volume; video on basic plans has ToS caveats — use R2 + CDN for VOD (egress is $0).
CloudFront	$0.085, tiers down to ~$0.02 at volume	$0.01/10k HTTPS requests	yes	Natural fit with S3 (OAC) for VOD; HLS’s high request volume adds up — a 100k event can add $300–500/h in requests alone.
Fastly	$0.12	$0.0075/10k	yes, with strong request collapsing	Most expensive on paper; its strength is programmable VCL config — overkill to start.

Cost of a 1-hour event (720p, ~2.5 Mbps ≈ 1.1 GB/viewer/hour, list prices):

Audience	Data per hour	Bunny (at $0.005/GB)	CloudFront (blended)	CDN + P2P @70% (Bunny)
1,000	~1.1 TB	~$6	~$95	~$2
10,000	~11 TB	~$56	~$900	~$17
100,000	~112 TB	~$560	~$7,500–8,500 (+requests)	~$170
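
The Bunny and P2P columns follow from a single formula. A minimal sketch, assuming a flat $/GB rate with no request fees or volume tiers (so it does not model the blended CloudFront column) and treating the P2P share as a fixed fraction of bytes; the function name is illustrative.

```python
# 2.5 Mbps sustained for one hour, expressed in GB (about 1.1 GB).
GB_PER_VIEWER_HOUR = 2.5 * 3600 / 8 / 1000


def event_cost_usd(viewers: int, usd_per_gb: float,
                   p2p_offload: float = 0.0, hours: float = 1.0) -> float:
    """CDN bill for an event: only bytes not served by peers are billed."""
    cdn_gb = viewers * GB_PER_VIEWER_HOUR * hours * (1 - p2p_offload)
    return cdn_gb * usd_per_gb


for audience in (1_000, 10_000, 100_000):
    plain = event_cost_usd(audience, 0.005)
    with_p2p = event_cost_usd(audience, 0.005, p2p_offload=0.7)
    print(audience, round(plain, 2), round(with_p2p, 2))
```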

With a cluster in front: today a single node is its own CDN origin. Once multiple nodes exist (see Cluster), the master becomes the CDN origin and proxies /hls/<app>/<room>/* internally to whichever node owns that room (the router already knows the room→node affinity from the registry); a short proxy_cache on the master acts as an extra shield. The CDN never needs to know about the internal topology.
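
The routing step on the master can be sketched as a lookup from path to owning node. Everything here is hypothetical: the registry shape (a dict keyed by app and room), the node URLs and the function name are illustrative and do not describe StreamHub's actual router.

```python
from typing import Dict, Optional, Tuple

Registry = Dict[Tuple[str, str], str]  # (app, room) -> base URL of owning node


def upstream_for(path: str, registry: Registry) -> Optional[str]:
    """Map a public /hls/<app>/<room>/<file> path to the owning node's URL."""
    parts = path.strip("/").split("/")
    if len(parts) < 4 or parts[0] != "hls":
        return None  # not an HLS path
    app, room = parts[1], parts[2]
    node = registry.get((app, room))
    if node is None:
        return None  # room unknown: the caller would answer 404
    return f"{node}{path}"


registry = {("demo", "room1"): "http://node-b:3020"}
print(upstream_for("/hls/demo/room1/index.m3u8", registry))
```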

P2P — the lever for six-figure audiences

This is the piece that makes “hundreds of thousands of viewers without a proportional CDN bill” realistic, and StreamHub’s clearest differentiator against AntMedia (which has no P2P story). The approach follows Peer5, Streamroot and CDNBye/SwarmCloud, built on the mature open-source P2P Media Loader (Novage, Apache-2.0, TypeScript).

Each viewer downloads HLS segments from other viewers over WebRTC data channels, falling back to the CDN/origin only when the swarm doesn’t have a segment (the playlist itself is always fetched over HTTP). Peer discovery uses the WebTorrent tracker protocol over WebSocket.

Server side (light)                         Swarm per stream+rendition
┌─────────────────────────┐                 peer ◀──▶ peer
│ Origin/CDN HLS           │◀── ~20–40% ──── │       ╲    ╱
│  (fallback + playlists)  │    of bytes     │        ╲  ╱
│ WebTorrent tracker (wss) │◀── announce ────│   peer ◀──▶ peer
│ Public STUN              │    / SDP offers │
└─────────────────────────┘
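
The fetch order described above (peers first, HTTP as fallback) reduces to a short control flow. This is a model only: in the browser the peer requests run over WebRTC data channels inside the P2P library, and the callables below are hypothetical stand-ins for that transport.

```python
from typing import Callable, Iterable, Optional, Tuple

PeerRequest = Callable[[str], Optional[bytes]]  # returns None if peer lacks it


def fetch_segment(segment_uri: str,
                  peers: Iterable[PeerRequest],
                  http_fetch: Callable[[str], bytes]) -> Tuple[bytes, str]:
    """Try each peer in turn; fall back to CDN/origin if the swarm misses."""
    for ask_peer in peers:
        data = ask_peer(segment_uri)
        if data is not None:
            return data, "p2p"
    # Nobody in the swarm has the segment yet: pay for an HTTP download.
    return http_fetch(segment_uri), "http"


def has_it(uri: str) -> Optional[bytes]:
    return b"from-peer"


def lacks_it(uri: str) -> Optional[bytes]:
    return None


print(fetch_segment("seg_00042.ts", [lacks_it, has_it], lambda uri: b"from-cdn"))
print(fetch_segment("seg_00043.ts", [lacks_it], lambda uri: b"from-cdn"))
```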

Honest numbers:

Bandwidth savings: the vendor claims up to 80%; the theoretical ceiling with 10-peer swarms is around 90%. In production, 50–75% is a defensible range for live with well-configured buffering; the top end needs large, homogeneous audiences. Small swarms (fewer than ~10 viewers) save close to nothing, so the HTTP fallback is mandatory and the mode is inherently hybrid.
Latency cost: P2P needs a buffer window for segments to circulate, roughly 1–2 segment durations on top of the base HLS latency. StreamHub’s current HLS latency (about 15 s, tunable toward 6–8 s) fits well; P2P is incompatible with sub-3 s LL-HLS, so the two shouldn’t be pursued on the same stream.
Player: integrates with hls.js (and Shaka). StreamHub’s current public HLS player uses video.js/VHS, not hls.js — adopting P2P means moving that path to hls.js first.
Server: needs a self-hosted WebTorrent tracker (public trackers explicitly say “don’t use in production”) plus STUN. Public STUN is fine because peers are in home/office browsers; TURN doesn’t apply to the mesh, since a peer without UDP just falls back to HTTP.
Security: segments are shared between viewers, which is fine for public streams (HLS is public today). If signed/DRM playback ships later, P2P must be disabled for that app.
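
The ~90% ceiling for 10-peer swarms is consistent with a simple model in which exactly one peer per swarm downloads each segment over HTTP and the rest get it from peers. That model is an assumption made here for illustration, not a description of how the library schedules downloads; real savings land lower, as the range above says.

```python
def p2p_ceiling(swarm_size: int) -> float:
    """Upper bound on the P2P share if one peer per swarm seeds each segment."""
    if swarm_size < 1:
        raise ValueError("swarm_size must be at least 1")
    return 1 - 1 / swarm_size


def http_share(p2p_ratio: float) -> float:
    """Fraction of bytes the origin/CDN still has to serve."""
    return 1 - p2p_ratio


print(round(p2p_ceiling(10), 2))
print(round(http_share(0.7), 2))
```

A lone viewer gets a ceiling of zero, which is why the HTTP fallback always has to be there.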

Decision matrix

Use case	Audience	Mode	Latency	Cost	Complexity
CCTV / monitoring / QC	1–20	`edge` (pure WebRTC)	~0.2 s	already covered	none — works today
Live-shopping / auction, 1→N	100–5k	`hybrid`: WebRTC for host + VIPs, HLS+CDN for the rest	0.2 s / 6–15 s	~$6–60/h CDN	low
Interactive class / webinar	50–500	`edge` (WebRTC, regional edge)	~0.2 s	one extra node	medium (live cluster)
Regional event	1k–10k	`cdn`	6–15 s	~$6–56/h (Bunny)	low
Mass event	50k–500k	`hybrid` = `cdn` + `p2p`	10–20 s	~$170/h @100k (70% P2P)	medium-high
24/7 radio / audio	thousands	`cdn` (audio is ~15x cheaper than video) or `p2p`	6–15 s	tens of $/month	low
Closed network / sovereignty	variable	`edge` (mandatory)	0.2 s / 6 s	own hardware	medium
VOD catalog	long tail	S3 + CDN (`public_url`), P2P for spikes	n/a	$/GB CDN or R2 ($0 egress)	low

Simple rule: interactive → WebRTC/edges; massive → HLS/CDN; massive and recurring → add P2P. And from the cluster reality check: never WebRTC for the mass tail.
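
The simple rule can be written as a small function. This is a sketch: what counts as "massive" is left to the caller, "interactive" is read as needing sub-second latency, and the `cdn` result for small non-interactive streams is an assumption the rule itself does not state.

```python
def choose_mode(interactive: bool, massive: bool, recurring: bool = False) -> str:
    """Pick a distribution mode from the three questions in the simple rule."""
    if massive:
        # Never WebRTC for the mass tail, even when the core is interactive.
        return "hybrid" if recurring else "cdn"
    if interactive:
        return "edge"
    return "cdn"  # assumption: small, non-interactive streams use plain HLS/CDN


print(choose_mode(interactive=True, massive=False))
print(choose_mode(interactive=False, massive=True))
print(choose_mode(interactive=False, massive=True, recurring=True))
```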

Roadmap (honest effort, small team)

Phase	What	Effort	Unlocks
F0 — CDN, no code	Bunny pull-zone with origin = current host; validate playback via CDN	1–2 days	1k–10k events, today
F1 — `distribution` config	Per-app YAML block, `cdn.base_url` wired into `playlistUrl`/`playUrl`/`/play` + UI	3–5 days	CDN self-service per app
F2 — HLS tuning	Configurable `segment_seconds`/`list_size` on egress: 15s → ~6–8s latency	3–5 days	Better live UX; groundwork for P2P
F3 — P2P	Player moves to hls.js; `p2p-media-loader-hlsjs`; self-hosted tracker sidecar; kill-switch per app	2–3 weeks	The 100k+ lever; 50–75% CDN savings
F4 — Edges serving	Live multi-node LiveKit + HLS `proxy_cache` on edges + a viewer router	2–4 weeks	Real `edge` mode; regional interactive WebRTC
F5 — 100k readiness	Origin Shield, optional push-to-S3, tracker at Aquatic scale if needed, load testing, CDN volume contracts	ongoing	100k+ events with known cost/risk
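
For F1, the core of the change is choosing which base URL goes into the playlist link. A minimal sketch, assuming a per-app config with a mode and an optional `cdn.base_url`: the dataclass, its field names and the helper are hypothetical and do not reflect the actual `distribution` schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Distribution:
    """Hypothetical per-app distribution block."""
    mode: str = "edge"                  # "edge", "cdn", "p2p" or "hybrid"
    cdn_base_url: Optional[str] = None  # stands in for cdn.base_url


def playlist_url(app: str, room: str, public_base_url: str,
                 dist: Distribution) -> str:
    """Build the public playlist URL, preferring the CDN when configured."""
    use_cdn = dist.mode in ("cdn", "hybrid") and dist.cdn_base_url
    base = dist.cdn_base_url if use_cdn else public_base_url
    return f"{base.rstrip('/')}/hls/{app}/{room}/index.m3u8"


dist = Distribution(mode="cdn", cdn_base_url="https://cdn.example.com/")
print(playlist_url("demo", "room1", "https://stream.example.com", dist))
```

Falling back to the public base URL when no CDN is configured keeps apps without a `distribution` block working unchanged.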

F0–F2 are roughly two weeks for one person and already unlock tens of thousands of viewers on CDN cost alone — the highest-value, lowest-effort slice. F3 (P2P) is the best next investment: one new container plus player work, with the risk concentrated in browser/mobile QA rather than backend complexity. F4 is the operationally expensive phase (fleet, on-call) and is not a prerequisite for mass scale — CDN + P2P get there without it; it’s justified by regional interactive latency and closed-network customers.

What not to do: WebRTC for the mass tail, a custom tracker on day one, LL-HLS and P2P on the same stream, or multi-CDN before there’s even one CDN in place.