Distribution — scaling playback

Cluster covers scaling the media plane across nodes. This page covers a different problem: once a stream needs to reach thousands or hundreds of thousands of viewers, adding more WebRTC nodes stops being the answer. This is the reality check that shapes StreamHub’s distribution design.

| Path | Cost per viewer | Cacheable by a CDN | Latency | Natural scale |
|---|---|---|---|---|
| WebRTC (SFU) | High: each viewer is a subscription the SFU serves individually (crypto/pacing CPU + NIC) | Never | ~0.2 s | hundreds–thousands per node |
| HLS served by the node | Low CPU, but the NIC is the ceiling | Yes (plain HTTP) | 6–15 s | thousands per node |
| HLS + CDN | Near zero for the origin; the CDN pays for fan-out | Yes | 6–15 s (LL-HLS ~3 s, not available today) | effectively unlimited, cost scales linearly with GB |
| HLS + P2P | Viewers serve each other; origin/CDN covers the rest | Yes, plus a viewer mesh | HLS latency + 1–2 segments | unlimited, cost scales sub-linearly |

The number that drives the whole design: 100,000 viewers at 2.5 Mbps (720p) is 250 Gbps sustained — roughly 112 TB/hour. No single node, and no cluster reasonably sized for a small team, serves that directly. The only architecture that works at that scale is: SFU for the interactive core + cacheable HLS for the mass tail + CDN and/or P2P paying for the fan-out. That’s the same reality check as Cluster, applied here to configurable per-app modes.
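The arithmetic behind that figure is easy to re-run for other audiences or bitrates. The sketch below is illustrative only: the function names are made up for this example, and it uses decimal units (1 TB = 1000 GB).

```python
def sustained_gbps(viewers: int, bitrate_mbps: float) -> float:
    """Aggregate egress in Gbps when every viewer pulls `bitrate_mbps`."""
    return viewers * bitrate_mbps / 1000


def terabytes_per_hour(gbps: float) -> float:
    """Convert a sustained rate in Gbps to decimal terabytes per hour."""
    gigabytes_per_second = gbps / 8
    return gigabytes_per_second * 3600 / 1000


gbps = sustained_gbps(100_000, 2.5)
print(f"{gbps:.0f} Gbps, {terabytes_per_hour(gbps):.1f} TB/hour")
```

For 100,000 viewers at 2.5 Mbps this prints `250 Gbps, 112.5 TB/hour`, matching the numbers above.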

Publisher ──▶ Origin SFU (LiveKit)
                 ├──▶ WebRTC (~0.2s) ──▶ Interactive viewers (edges, moderate scale)
                 └──▶ egress HLS ──▶ /hls/<app>/<room>/index.m3u8
                                          │
                                          ▼
                                     CDN (pull) ──▶ Mass viewers (10k–100k+)
                                                        │ P2P data channels
                                                        └──── 60–80% of the traffic between viewers

The three modes below — cdn, edge, p2p — are not mutually exclusive. hybrid (CDN plus P2P on top) is the mode for large events.

WebRTC viewer ceiling — the NIC, not the CPU

On the current single-node profile (8c/8GB):

| Path | Dominant limit | 1 Gbps node | 10 Gbps node (8c) |
|---|---|---|---|
| WebRTC, subscribers at 2.5 Mbps | NIC first; CPU (crypto/pacing/SFU state) caps around 1–1.5k subs on 8 cores | ~300–400 | ~1,000–1,500 (CPU-bound) |
| WebRTC, subscribers at 1 Mbps (simulcast lower layer) | same | ~800 | ~1,500 |
| Static HLS at 2.5 Mbps | NIC (CPU is trivial, it's sendfile) | ~320 | ~3,500–4,000 |

Fan-out from origin to edges is cheap (one copy of the ladder per edge); the cost is edge → viewers. Scaling without a CDN means adding nodes linearly and balancing viewers across them. For HLS specifically, edges don’t need to run their own egress (each egress is a Chrome process, 1–2 cores) — the egress runs once, on the room’s origin node, and edges just proxy_cache the HTTP output. 10,000 viewers at 2.5 Mbps is 25 Gbps — roughly 3–4 well-placed 10G nodes, and only if the audience is regional. Multi-region without a CDN means a fleet plus geo-DNS plus on-call — not a small-team operation.
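A rough capacity estimate along these lines can be sketched as follows. The 80% usable-NIC figure is an assumption chosen because it reproduces the ~320 and ~800 entries in the table; the table's 10 Gbps HLS figure implies somewhat higher utilization. The function names and the optional CPU cap parameter are illustrative, not part of StreamHub.

```python
from typing import Optional


def viewers_per_node(nic_mbps: int, bitrate_kbps: int,
                     usable_percent: int = 80,
                     cpu_cap: Optional[int] = None) -> int:
    """Viewers one node can serve: NIC-bound, optionally capped by CPU."""
    usable_kbps = nic_mbps * 1000 * usable_percent // 100
    by_nic = usable_kbps // bitrate_kbps
    return by_nic if cpu_cap is None else min(by_nic, cpu_cap)


def nodes_needed(viewers: int, per_node: int) -> int:
    """Smallest node count that covers `viewers` (ceiling division)."""
    return -(-viewers // per_node)


hls_1g = viewers_per_node(1_000, 2_500)                      # static HLS, 1 Gbps
hls_10g = viewers_per_node(10_000, 2_500)                    # static HLS, 10 Gbps
rtc_10g = viewers_per_node(10_000, 2_500, cpu_cap=1_500)     # WebRTC, CPU-bound
print(hls_1g, hls_10g, rtc_10g, nodes_needed(10_000, hls_10g))
```

With these assumptions, 10,000 HLS viewers need 4 nodes at 80% NIC utilization, consistent with the "3–4 nodes" estimate above.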

When plain edges are enough: CCTV/monitoring (tens of viewers, sub-second latency — WebRTC direct, works today), interactive classrooms/auctions/live-shopping up to a few thousand viewers, closed-network/sovereignty requirements where a CDN isn’t an option, or HLS up to roughly 2–5k regional viewers with 1–2 edges.

HLS + CDN (the default path past a few thousand viewers)

StreamHub already produces live HLS on disk with CDN-friendly headers: .m3u8 served no-cache, .ts segments served immutable with long cache, open CORS, at the public path /hls/<app>/<room>/index.m3u8. Because the playlist references segments with relative URIs, a CDN in pull mode in front of the host works with zero code changes — the player requests everything from the CDN’s domain and the relative URIs resolve against that same domain.
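The relative-URI behavior can be checked with the standard URL resolution rules. In this sketch the hostnames, app/room names and segment filenames are placeholders, and the playlist is a hand-written example rather than real egress output.

```python
from urllib.parse import urljoin

# Hand-written example playlist; segment names are hypothetical.
PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXTINF:6.0,
segment_00042.ts
#EXTINF:6.0,
segment_00043.ts
"""


def segment_urls(playlist_url: str, playlist_text: str) -> list:
    """Resolve every non-comment playlist line against the playlist's own URL."""
    lines = (line.strip() for line in playlist_text.splitlines())
    return [urljoin(playlist_url, line) for line in lines
            if line and not line.startswith("#")]


origin_urls = segment_urls("https://stream.example.com/hls/demo/lobby/index.m3u8", PLAYLIST)
cdn_urls = segment_urls("https://cdn.example.com/hls/demo/lobby/index.m3u8", PLAYLIST)
print(origin_urls[0])
print(cdn_urls[0])
```

The same playlist bytes yield origin URLs when fetched from the origin and CDN URLs when fetched through the CDN, which is why no playlist rewriting is needed.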

egress (Chrome → segments) → DATA_DIR/apps/<app>/hls
                          │
core :3020 (static mount, m3u8 no-cache, ts immutable)
                          │
        ┌─────────────────┼─────────────────┐
   origin pull       origin pull       origin pull
        ▼                 ▼                 ▼
  CDN POP (US)      CDN POP (EU)     CDN POP (LATAM)
        └─────────────────┬─────────────────┘
                          ▼
                       Viewers

Pull vs. push:

| | Pull (recommended default) | Push (upload HLS to S3, CDN in front of the bucket) |
|---|---|---|
| Setup | Create a pull-zone/distribution with origin = PUBLIC_BASE_URL. Zero code. | Upload every segment to the bucket as it's written. Needs new code. |
| Extra latency | ~0 (CDN fetches on cache miss) | +0.5–1 segment duration |
| Coupling to the node | The node must stay up and reachable, since it is the origin | Viewers decoupled from the node; it can go down and the CDN keeps serving the buffered window |
| When | Default for live | Critical 24/7 streams, a small-NIC origin, or multi-CDN |

With Origin Shield and request collapsing (both offered by CloudFront/Fastly/Bunny), a single 8c/8GB node can act as origin for a 100k-viewer event: the origin serves roughly one copy of each rendition (a few hundred kilobytes per second at 2.5 Mbps) while the CDN carries the 250 Gbps.
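Request collapsing is the reason origin load stays flat as the audience grows: concurrent misses for the same object share one origin fetch. The toy model below illustrates the idea with asyncio; it is not how any particular CDN implements it, and `CollapsingCache`, `slow_origin` and the segment path are names invented for this sketch.

```python
import asyncio


class CollapsingCache:
    """Toy shield: one origin fetch per path, shared by all concurrent requests."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._tasks = {}          # path -> in-flight or completed fetch task
        self.origin_fetches = 0

    async def get(self, path: str) -> bytes:
        task = self._tasks.get(path)
        if task is None:
            # First request for this path: start the only origin fetch.
            self.origin_fetches += 1
            task = asyncio.ensure_future(self._fetch(path))
            self._tasks[path] = task
        # Later requests await the same task instead of hitting the origin.
        return await task


async def slow_origin(path: str) -> bytes:
    await asyncio.sleep(0.01)     # simulated origin round trip
    return bytes(1000)


async def demo(viewers: int = 1000):
    cache = CollapsingCache(slow_origin)
    path = "/hls/demo/lobby/segment_00042.ts"
    bodies = await asyncio.gather(*(cache.get(path) for _ in range(viewers)))
    return cache.origin_fetches, len(bodies)


fetches, served = asyncio.run(demo())
print(fetches, served)
```

In this model 1,000 simultaneous viewer requests for one segment produce a single origin fetch. A real CDN also expires cached objects and runs many POPs, which this sketch ignores.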

CDN pricing (list price, 2026):

| CDN | $/GB (first tier) | Request fees | LL-HLS | Notes |
|---|---|---|---|---|
| Bunny | $0.005–0.01 | none | pass-through | Best $/GB of the mainstream options; pull-zone in 5 minutes; Origin Shield. Default recommendation. |
| Cloudflare | bundled flat plan | none | yes | Cheap at low volume; video on basic plans has ToS caveats. Use R2 + CDN for VOD (egress is $0). |
| CloudFront | $0.085, tiers down to ~$0.02 at volume | $0.01/10k HTTPS requests | yes | Natural fit with S3 (OAC) for VOD; HLS's high request volume adds up: a 100k event can add $300–500/h in requests alone. |
| Fastly | $0.12 | $0.0075/10k | yes, with strong request collapsing | Most expensive on paper; its strength is programmable VCL config, which is overkill to start. |

Cost of a 1-hour event (720p, ~2.5 Mbps ≈ 1.1 GB/viewer/hour, list prices):

| Audience | Data per hour | Bunny | CloudFront (blended) | CDN + P2P @70% (Bunny) |
|---|---|---|---|---|
| 1,000 | ~1.1 TB | ~$6 | ~$95 | ~$2 |
| 10,000 | ~11 TB | ~$56 | ~$900 | ~$17 |
| 100,000 | ~112 TB | ~$560 | ~$7,500–8,500 (+requests) | ~$170 |
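The Bunny and P2P columns follow from a flat per-GB price, so they can be recomputed for other audiences. This is a sketch under the same assumptions as the table (2.5 Mbps, $0.005/GB, 70% P2P offload); tiered pricing and request fees, which matter for CloudFront, are deliberately left out.

```python
# 2.5 Mbps sustained for one hour, in decimal gigabytes (about 1.1 GB).
GB_PER_VIEWER_HOUR = 2.5 * 3600 / 8 / 1000


def event_cost_usd(viewers: int, hours: float, usd_per_gb: float,
                   p2p_offload: float = 0.0) -> float:
    """CDN egress cost; `p2p_offload` is the share of bytes served by peers."""
    cdn_gb = viewers * hours * GB_PER_VIEWER_HOUR * (1 - p2p_offload)
    return cdn_gb * usd_per_gb


for audience in (1_000, 10_000, 100_000):
    cdn_only = event_cost_usd(audience, 1, 0.005)
    with_p2p = event_cost_usd(audience, 1, 0.005, p2p_offload=0.7)
    print(f"{audience:>7,} viewers: ${cdn_only:,.0f} CDN only, ${with_p2p:,.0f} with P2P")
```

The results land within rounding of the table rows, for example about $562 and $169 for 100,000 viewers.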

With a cluster in front: a single node is its own CDN origin today. Once multiple nodes exist (see Cluster), the master becomes the CDN origin, proxying /hls/<app>/<room>/* internally to whichever node owns that room (the router already knows the room→node affinity from the registry); a short proxy_cache on the master acts as an extra shield. The CDN never needs to know about the internal topology.

P2P — the lever for six-figure audiences

This is the piece that makes “hundreds of thousands of viewers without a proportional CDN bill” realistic, and StreamHub’s clearest differentiator against AntMedia (which has no P2P story). The approach follows Peer5, Streamroot and CDNBye/SwarmCloud, built on the mature open-source P2P Media Loader (Novage, Apache-2.0, TypeScript).

Each viewer downloads HLS segments from other viewers over WebRTC data channels, falling back to the CDN/origin only when the swarm doesn’t have a segment (the playlist itself is always fetched over HTTP). Peer discovery uses the WebTorrent tracker protocol over WebSocket.

Server side (light)                            Swarm per stream+rendition
┌──────────────────────────┐                   peer ◀──▶ peer
│ Origin/CDN HLS           │◀── ~20–40% ──── │    ╲      ╱
│ (fallback + playlists)   │    of bytes     │      ╲  ╱
│ WebTorrent tracker (wss) │◀── announce ────│ peer ◀──▶ peer
│ Public STUN              │  / SDP offers   │
└──────────────────────────┘

Honest numbers:

  • Bandwidth savings: the vendor claims up to 80%; the theoretical ceiling with 10-peer swarms is around 90%. In production, 50–75% is a defensible range for live with well-configured buffering; the top end needs large, homogeneous audiences. Small swarms (fewer than ~10 viewers) save close to nothing, so the HTTP fallback is mandatory and the mode is inherently hybrid.
  • Latency cost: P2P needs a buffer window for segments to circulate — roughly 1–2 segment durations on top of the base HLS latency. StreamHub’s current HLS (15s, tunable toward 6–8s) fits well; P2P is incompatible with sub-3s LL-HLS, so the two shouldn’t be pursued on the same stream.
  • Player: integrates with hls.js (and Shaka). StreamHub’s current public HLS player uses video.js/VHS, not hls.js — adopting P2P means moving that path to hls.js first.
  • Server: needs a self-hosted WebTorrent tracker (public trackers explicitly say “don’t use in production”) plus STUN (public STUN is fine — peers are in home/office browsers; TURN doesn’t apply to the mesh, a peer without UDP just falls back to HTTP).
  • Security: segments are shared between viewers, which is fine for public streams (HLS is public today). If signed/DRM playback ships later, P2P must be disabled for that app.
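The swarm-first, HTTP-fallback behavior described in these points can be sketched in a few lines. This is a conceptual model only and not the P2P Media Loader API; `fetch_segment`, the two callables and the segment names are hypothetical.

```python
from typing import Callable, Optional, Tuple


def fetch_segment(
    name: str,
    from_swarm: Callable[[str], Optional[bytes]],
    from_http: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Try the swarm first; fall back to HTTP (CDN/origin) on a miss."""
    data = from_swarm(name)
    if data is not None:
        return data, "p2p"
    return from_http(name), "http"


# Hypothetical stand-ins: peers hold two of four segments, the CDN holds all.
cdn = {f"seg_{i}.ts": bytes(1000) for i in range(1, 5)}
swarm = {name: cdn[name] for name in ("seg_1.ts", "seg_3.ts")}

sources = [fetch_segment(name, swarm.get, cdn.__getitem__)[1] for name in sorted(cdn)]
p2p_share = sources.count("p2p") / len(sources)
print(sources, p2p_share)
```

Because every miss resolves over HTTP, a viewer in an empty swarm behaves exactly like a plain HLS viewer, which is what makes the mode inherently hybrid.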

| Use case | Audience | Mode | Latency | Cost | Complexity |
|---|---|---|---|---|---|
| CCTV / monitoring / QC | 1–20 | edge (pure WebRTC) | ~0.2 s | already covered | none, works today |
| Live-shopping / auction, 1→N | 100–5k | hybrid: WebRTC for host + VIPs, HLS+CDN for the rest | 0.2 s / 6–15 s | ~$6–60/h CDN | low |
| Interactive class / webinar | 50–500 | edge (WebRTC, regional edge) | ~0.2 s | one extra node | medium (live cluster) |
| Regional event | 1k–10k | cdn | 6–15 s | ~$6–56/h (Bunny) | low |
| Mass event | 50k–500k | hybrid = cdn + p2p | 10–20 s | ~$170/h @100k (70% P2P) | medium-high |
| 24/7 radio / audio | thousands | cdn (audio is ~15x cheaper than video) or p2p | 6–15 s | tens of $/month | low |
| Closed network / sovereignty | variable | edge (mandatory) | 0.2 s / 6 s | own hardware | medium |
| VOD catalog | long tail | S3 + CDN (public_url), P2P for spikes | n/a | $/GB CDN or R2 ($0 egress) | low |

Simple rule: interactive → WebRTC/edges; massive → HLS/CDN; massive and recurring → add P2P. And from the cluster reality check: never WebRTC for the mass tail.
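As a rough illustration, the rule can be written as a small function. The 5,000-viewer threshold is an assumption standing in for "past a few thousand viewers", and `pick_modes` is a made-up helper, not part of StreamHub.

```python
def pick_modes(interactive: bool, audience: int, recurring: bool = False,
               mass_threshold: int = 5_000) -> list:
    """Return the distribution modes the simple rule suggests."""
    massive = audience >= mass_threshold
    modes = []
    if interactive:
        modes.append("edge")      # WebRTC for the interactive core only
    if massive or not interactive:
        modes.append("cdn")       # HLS + CDN for the mass tail
        if massive and recurring:
            modes.append("p2p")   # add P2P on top of the CDN
    return modes


print(pick_modes(interactive=True, audience=300))
print(pick_modes(interactive=False, audience=8_000))
print(pick_modes(interactive=True, audience=20_000))
print(pick_modes(interactive=False, audience=100_000, recurring=True))
```

Note that WebRTC never appears alone once the audience is massive: the interactive core stays on edges while the tail moves to HLS.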

| Phase | What | Effort | Unlocks |
|---|---|---|---|
| F0: CDN, no code | Bunny pull-zone with origin = current host; validate playback via CDN | 1–2 days | 1k–10k events, today |
| F1: distribution config | Per-app YAML block, cdn.base_url wired into playlistUrl / playUrl / /play + UI | 3–5 days | CDN self-service per app |
| F2: HLS tuning | Configurable segment_seconds/list_size on egress: 15s → ~6–8s latency | 3–5 days | Better live UX; groundwork for P2P |
| F3: P2P | Player moves to hls.js; p2p-media-loader-hlsjs; self-hosted tracker sidecar; kill-switch per app | 2–3 weeks | The 100k+ lever; 50–75% CDN savings |
| F4: Edges serving | Live multi-node LiveKit + HLS proxy_cache on edges + a viewer router | 2–4 weeks | Real edge mode; regional interactive WebRTC |
| F5: 100k readiness | Origin Shield, optional push-to-S3, tracker at Aquatic scale if needed, load testing, CDN volume contracts | ongoing | 100k+ events with known cost/risk |

F0–F2 are roughly two weeks for one person and already unlock tens of thousands of viewers on CDN cost alone — the highest-value, lowest-effort slice. F3 (P2P) is the best next investment: one new container plus player work, with the risk concentrated in browser/mobile QA rather than backend complexity. F4 is the operationally expensive phase (fleet, on-call) and is not a prerequisite for mass scale — CDN + P2P get there without it; it’s justified by regional interactive latency and closed-network customers.
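To make F1 concrete, here is one way the cdn.base_url setting could feed the playlist URL. Only the /hls/<app>/<room>/index.m3u8 path, the mode names and cdn.base_url come from this page; the `DistributionConfig` shape, field names and hostnames are assumptions for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DistributionConfig:
    """Hypothetical per-app distribution block."""
    mode: str = "edge"                    # "edge", "cdn", "p2p" or "hybrid"
    cdn_base_url: Optional[str] = None    # corresponds to cdn.base_url


def playlist_url(public_base_url: str, app: str, room: str,
                 dist: DistributionConfig) -> str:
    """Use the CDN host when a CDN mode is configured, else the node itself."""
    use_cdn = dist.mode in ("cdn", "hybrid") and dist.cdn_base_url
    base = dist.cdn_base_url if use_cdn else public_base_url
    return f"{base.rstrip('/')}/hls/{app}/{room}/index.m3u8"


direct = playlist_url("https://stream.example.com", "demo", "lobby",
                      DistributionConfig())
via_cdn = playlist_url("https://stream.example.com", "demo", "lobby",
                       DistributionConfig(mode="cdn",
                                          cdn_base_url="https://cdn.example.com/"))
print(direct)
print(via_cdn)
```

Only the host changes between the two URLs; the path stays identical, which is what lets a pull-mode CDN sit in front of the node without touching the egress output.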

What not to do: WebRTC for the mass tail, a custom tracker on day one, LL-HLS and P2P on the same stream, or multi-CDN before there’s even one CDN in place.