AWS EC2 testing
Three time-boxed proofs of concept on real AWS EC2 (us-east-1), run to certify per-size
cluster capacity, measure NVIDIA T4 transcode/CV performance, and validate a full S3 recording
round-trip against real AWS S3. Total cost: ~$0.58 across all three. Full methodology,
exact CLI commands, and raw results live in the repo at
streamhub-docs/operations/AWS-POC.md; this page is the condensed how-to.
Spinning up StreamHub on EC2
- Resolve the AMI live; don’t hardcode an AMI ID (they’re region- and rotation-specific):

  ```bash
  AMI_ID=$(aws ssm get-parameters --region us-east-1 \
    --names /aws/service/canonical/ubuntu/server/24.04/stable/current/amd64/hvm/ebs-gp3/ami-id \
    --query 'Parameters[0].Value' --output text)
  ```

- Security group. Open the same ports the installer preflights: `22`, `80`, `443`, `1935` (RTMP), `7880`/`7881` (LiveKit signaling), `8080` (WHIP), `7882/udp` (WebRTC media). Keep `6379` (Redis, cluster coordination) private only; a self-referencing security group rule (source = the SG itself) is enough for nodes to reach each other’s Redis without ever exposing it publicly.

- No Elastic IPs. Use the auto-assigned public IP (`--associate-public-ip-address`); it dies with the instance, so nothing outlives teardown by accident.

- Install with the published one-liner:

  ```bash
  # origin: non-interactive, cluster-ready
  curl -fsSL https://www.streamhub.studio/install.sh | sudo bash -s -- \
    --non-interactive --no-tls \
    --domain <origin-public-ip> \
    --cluster-redis-bind <origin-private-ip>

  # each edge: join by token
  curl -fsSL https://www.streamhub.studio/install.sh | sudo bash -s -- \
    --join --master-token <clt_...> \
    --master-ip <origin-private-ip> --master-url http://<origin-private-ip> \
    --node-name <edge-name>
  ```

  See Quick install for every origin flag and Join a cluster for the day-1 edge flow in detail.
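The security group rules described above can be sketched as a small script. This is a minimal sketch, not the commands used in the PoC: the `sg_rule_commands` helper, the `SG_ID` variable, the `sg-EXAMPLE` placeholder, and the open `0.0.0.0/0` CIDR are assumptions for illustration (SSH in particular is usually restricted to a known address). It only prints the `aws ec2 authorize-security-group-ingress` calls so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Print (not run) the ingress rules for a StreamHub security group.
# The group ID and the 0.0.0.0/0 default CIDR are illustrative assumptions.
sg_rule_commands() {
  local sg_id="$1" cidr="${2:-0.0.0.0/0}" port

  # Public TCP ports: SSH, HTTP, HTTPS, RTMP, LiveKit signaling (7880/7881), WHIP.
  for port in 22 80 443 1935 7880 7881 8080; do
    echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol tcp --port $port --cidr $cidr"
  done

  # WebRTC media over UDP.
  echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol udp --port 7882 --cidr $cidr"

  # Redis: self-referencing rule, reachable only from members of the same group.
  echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol tcp --port 6379 --source-group $sg_id"
}

sg_rule_commands "${SG_ID:-sg-EXAMPLE}"
```

Once the printed commands look right, they can be applied by piping the output to `bash`.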
Cluster sizing — certified per instance size
Certified by actually loading each size to its ceiling (5-node cluster: 1 origin +
t3.small/t3.medium/t3.large/c5.large edges), not read off the spec sheet:
| Instance | vCPU / RAM | Reliable concurrent RTMP ingest | Room-composite HLS/recording? |
|---|---|---|---|
| `t3.small` | 2 / 2 GB | ~5–6 sessions; collapses at 12 (loadavg 35) | No |
| `t3.medium`, `t3.large`, `c5.large` | 2 / 4–8 GB | not pushed to collapse | No (all 2 vCPU) |
| `c5.xlarge` | 4 / 8 GB | not the bottleneck | Yes, exactly 1 concurrent (~1.5 GB RSS, 3 of 4 vCPU) |
Room-composite egress (HLS-live and Chrome-based recording) needs ≥4 vCPU — LiveKit egress
refuses below that (minimumCpu: 4). None of the 2-vCPU edge sizes above can serve composite
HLS or recording regardless of RAM; use track/track-composite (ffmpeg, no Chrome — see
Capacity planning) for small edges instead. Pure WebRTC room-serving
is cheap everywhere: LiveKit sat at ~5% CPU serving 10 viewers on a t3.small-hosted room, at
13–15 fps.
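
The ≥4 vCPU rule can be checked on a node before routing composite jobs to it. A minimal sketch, assuming a Linux host with `nproc`; the `supports_room_composite` helper and the `MIN_COMPOSITE_VCPU` variable are hypothetical names, not part of StreamHub or LiveKit. The threshold simply mirrors the `minimumCpu: 4` limit quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a node has enough vCPUs for room-composite egress.
MIN_COMPOSITE_VCPU=4   # mirrors the minimumCpu: 4 limit quoted above

supports_room_composite() {
  local vcpus="${1:-$(nproc)}"   # default to this host's vCPU count
  [ "$vcpus" -ge "$MIN_COMPOSITE_VCPU" ]
}

if supports_room_composite; then
  echo "room-composite egress: ok"
else
  echo "room-composite egress: use track/track-composite instead"
fi
```

Passing a count explicitly, for example `supports_room_composite 2`, lets the same check be run against a planned instance size rather than the current host.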
At the cluster level, LiveKit’s own load-based allocator placed 15 simultaneous streams correctly across all 5 nodes. A hard-killed edge holding 7 rooms recovered playback in under 2 minutes, but publishers ingesting through that edge died and did not fail over, because ingest is pinned to the node where the RTMP/WHIP session was opened.
GPU transcode — NVIDIA T4 (g4dn.xlarge, spot)
Stock Ubuntu 24.04 ffmpeg already ships h264_nvenc/hevc_nvenc/av1_nvenc once the driver
is installed (nvidia-driver-580 from the stock repo). GPU passthrough to containers needs
nvidia-container-toolkit + gpus: all (a ready-to-uncomment override in docker-compose.yml)
— confirm with GET /api/v1/system/gpu (see the
Transcoding / GPU section of server config).
| Workload | CPU | GPU (T4) | Gain |
|---|---|---|---|
| 1080p→720p transcode (single job) | 2.12x realtime (libx264) | 4.6x realtime (NVENC) | ~2.2x |
| 1080p→720p transcode, 10 concurrent jobs | — | each ≥1.67x realtime at 42% GPU util | ceiling was CPU-side software decode, not the GPU — extrapolated ~16 jobs with full on-GPU decode+encode |
| deface/CenterFace @640×360 (face blur) | 52.1 ms/frame | 12.0 ms/frame | ~4.3x |
| YOLOv8n inference | 59.4 ms/frame | 8.9 ms/frame | ~6.7x |
CUDA execution was verified active (not silently falling back), and CPU fallback was verified graceful when CUDA is unavailable.
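
A quick way to confirm the NVENC encoders are present is to filter ffmpeg's encoder list. A minimal sketch: `nvenc_encoders` is a hypothetical helper that reads `ffmpeg -hide_banner -encoders` output on stdin and prints the NVENC encoder names. It only shows what the ffmpeg build includes, not whether the driver and GPU can actually run each encoder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ffmpeg -encoders` output on stdin, print NVENC encoder names.
nvenc_encoders() {
  # Encoder rows look like " V....D h264_nvenc   NVIDIA NVENC H.264 encoder";
  # the second whitespace-separated field is the encoder name.
  awk '$2 ~ /_nvenc$/ { print $2 }'
}

# Usage on a real host (only runs if ffmpeg is installed):
if command -v ffmpeg >/dev/null 2>&1; then
  ffmpeg -hide_banner -encoders 2>/dev/null | nvenc_encoders
fi
```

An empty result means the ffmpeg build has no NVENC encoders compiled in, which is a different failure from a missing driver.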
S3 recording round-trip
Validated against real AWS S3 with a bucket-scoped IAM user (inline policy naming only that bucket, never account root keys):
```bash
# scoped IAM: create user + inline policy limited to one bucket, then one access key
aws iam create-user --user-name streamhub-poc-s3
aws iam put-user-policy --user-name streamhub-poc-s3 \
  --policy-name streamhub-poc-bucket-only --policy-document file://policy.json
aws iam create-access-key --user-name streamhub-poc-s3

# point the app at it
curl -X PUT "$BASE/apps/live/s3" -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"provider":"aws","bucket":"<bucket>","region":"us-east-1","endpoint":"","key":"...","secret":"..."}'
```

RTMP publish → POST /recording/start → stop → VOD reaches ready → object present in the
bucket → presigned URL returns 200 → valid MP4 (H.264 + AAC). GET /apps/:app/s3 never
echoes the key/secret back. Full details on the S3 config schema are in
the s3 field reference in the per-app config.yaml docs.
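
The `policy.json` referenced above is not reproduced on this page. A minimal sketch of a bucket-only inline policy follows. The `write_bucket_policy` helper, the `streamhub-poc-example` bucket name, and the action list (`s3:ListBucket`, `s3:GetObject`, `s3:PutObject`, `s3:DeleteObject`) are assumptions; the actions StreamHub actually needs may differ, so check AWS-POC.md for the policy used in the PoC.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write an IAM policy scoped to a single bucket.
# The action list is an assumption, not the PoC's actual policy.
write_bucket_policy() {
  local bucket="$1" out="${2:-policy.json}"
  cat > "$out" <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

write_bucket_policy "${BUCKET:-streamhub-poc-example}"
```

Bucket-level actions such as `s3:ListBucket` attach to the bucket ARN, while object-level actions attach to the `/*` ARN, which is why the sketch uses two statements.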
Teardown discipline
Every PoC resource is tagged Project=streamhub-poc at creation, deleted in reverse order
immediately after results are captured, and confirmed gone with an audit query — this is what
keeps repeated EC2 testing cheap and leak-free:
```bash
aws ec2 terminate-instances --instance-ids <id...>   # DeleteOnTermination:true on the root volume
# wait until the instances are gone, otherwise the security group still has dependents
aws ec2 wait instance-terminated --instance-ids <id...>
aws ec2 delete-security-group --group-id <sg-id>
aws ec2 delete-key-pair --key-name streamhub-poc
aws s3 rm s3://<bucket> --recursive && aws s3 rb s3://<bucket>
aws iam delete-access-key --user-name streamhub-poc-s3 --access-key-id <akid>
aws iam delete-user-policy --user-name streamhub-poc-s3 --policy-name streamhub-poc-bucket-only
aws iam delete-user --user-name streamhub-poc-s3

# audit: every query below must return empty
aws ec2 describe-instances --filters "Name=tag:Project,Values=streamhub-poc" \
  "Name=instance-state-name,Values=pending,running,stopping,stopped"
aws ec2 describe-security-groups --filters "Name=tag:Project,Values=streamhub-poc"
aws s3api list-buckets --query "Buckets[?starts_with(Name,'streamhub-poc')].Name"
aws iam list-users --query "Users[?starts_with(UserName,'streamhub-poc')].UserName"
```

No Elastic IPs were allocated in any of the three PoCs, so there’s no separate IP-release step.
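
The rule that every audit query must return empty can be enforced mechanically. A minimal sketch, assuming each audit query is written with `--query` and `--output text` so that a clean account prints nothing (or `None`); `assert_empty` is a hypothetical helper, not part of the AWS CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command and fail if it prints anything.
assert_empty() {
  local label="$1"; shift
  local out
  out="$("$@")" || { echo "FAIL $label: command error" >&2; return 1; }
  # Strip whitespace; the AWS CLI prints "None" for a null result in text mode.
  out="$(printf '%s' "$out" | tr -d '[:space:]')"
  if [ -z "$out" ] || [ "$out" = "None" ]; then
    echo "ok   $label"
  else
    echo "LEAK $label: $out" >&2
    return 1
  fi
}

# Example wiring (requires AWS credentials, so left commented out):
# assert_empty instances aws ec2 describe-instances \
#   --filters "Name=tag:Project,Values=streamhub-poc" \
#             "Name=instance-state-name,Values=pending,running,stopping,stopped" \
#   --query 'Reservations[].Instances[].InstanceId' --output text
```

Because the helper returns non-zero on any leftover resource, chaining the four audit calls with `&&` makes a teardown script fail loudly instead of leaving a leak unnoticed.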