AWS EC2 testing

Three time-boxed proofs of concept (PoCs) on real AWS EC2 (us-east-1), run to certify per-size cluster capacity, measure NVIDIA T4 transcode/CV performance, and validate a full S3 recording round-trip against real AWS S3. Total cost: ~$0.58 across all three. Full methodology, exact CLI commands, and raw results live in the repo at streamhub-docs/operations/AWS-POC.md; this page is the condensed how-to.

Spinning up StreamHub on EC2

Resolve the AMI live — don’t hardcode an AMI ID (they’re region- and rotation-specific):

AMI_ID=$(aws ssm get-parameters --region us-east-1 \
  --names /aws/service/canonical/ubuntu/server/24.04/stable/current/amd64/hvm/ebs-gp3/ami-id \
  --query 'Parameters[0].Value' --output text)

Security group — open the same ports the installer preflights: 22, 80, 443, 1935 (RTMP), 7880/7881 (LiveKit signaling), 8080 (WHIP), 7882/udp (WebRTC media). Keep 6379 (Redis, cluster coordination) private only — a self-referencing security group rule (source = the SG itself) is enough for nodes to reach each other’s Redis without ever exposing it publicly.
No Elastic IPs. Use the auto-assigned public IP (--associate-public-ip-address) — it dies with the instance, so nothing outlives teardown by accident.
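
The two rules above can be sketched as a pair of helper functions. This is a sketch under assumptions, not the commands used in the PoCs: the function names, the 0.0.0.0/0 default CIDR, and the key-pair argument are illustrative, and it assumes a default VPC so no subnet needs to be named.

```shell
# Sketch only. Function names and the 0.0.0.0/0 default are assumptions;
# narrow the CIDR (at least for SSH) where you can.

# Open the ports the installer preflights, plus a self-referencing Redis rule.
open_streamhub_ports() {
  sg_id="$1"
  cidr="${2:-0.0.0.0/0}"
  # Public TCP: SSH, HTTP, HTTPS, RTMP, LiveKit signaling, WHIP
  for port in 22 80 443 1935 7880 7881 8080; do
    aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
      --protocol tcp --port "$port" --cidr "$cidr"
  done
  # WebRTC media (UDP)
  aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
    --protocol udp --port 7882 --cidr "$cidr"
  # Redis: the source is the group itself, so only cluster members can connect
  aws ec2 authorize-security-group-ingress --group-id "$sg_id" \
    --protocol tcp --port 6379 --source-group "$sg_id"
}

# Launch one node with an auto-assigned public IP and the PoC tag.
# Prints the new instance ID.
launch_node() {
  ami_id="$1"; instance_type="$2"; sg_id="$3"; key_name="$4"
  aws ec2 run-instances --image-id "$ami_id" --instance-type "$instance_type" \
    --key-name "$key_name" --security-group-ids "$sg_id" \
    --associate-public-ip-address \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Project,Value=streamhub-poc}]' \
    --query 'Instances[0].InstanceId' --output text
}
```

Call `open_streamhub_ports "$SG_ID"` once after creating the group, then `launch_node "$AMI_ID" t3.small "$SG_ID" streamhub-poc` for each node.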

Install with the published one-liner:

# origin — non-interactive, cluster-ready
curl -fsSL https://www.streamhub.studio/install.sh | sudo bash -s -- \
  --non-interactive --no-tls \
  --domain <origin-public-ip> \
  --cluster-redis-bind <origin-private-ip>

# each edge — join by token
curl -fsSL https://www.streamhub.studio/install.sh | sudo bash -s -- \
  --join --master-token <clt_...> \
  --master-ip <origin-private-ip> --master-url http://<origin-private-ip> \
  --node-name <edge-name>

See Quick install for every origin flag and Join a cluster for the day-1 edge flow in detail.
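
The `<origin-public-ip>` and `<origin-private-ip>` placeholders can be filled from the EC2 API rather than copied from the console. A minimal sketch, assuming you kept the instance ID from launch; `instance_ip` is a hypothetical helper name.

```shell
# Print one address field for an instance.
# $2 is PublicIpAddress or PrivateIpAddress.
instance_ip() {
  aws ec2 describe-instances --instance-ids "$1" \
    --query "Reservations[0].Instances[0].$2" --output text
}
```

For example, `ORIGIN_PRIVATE_IP=$(instance_ip "$ORIGIN_ID" PrivateIpAddress)` gives the value to pass to `--cluster-redis-bind` on the origin and `--master-ip` on each edge.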

Cluster sizing — certified per instance size

Certified by actually loading each size to its ceiling (5-node cluster: 1 origin + t3.small/t3.medium/t3.large/c5.large edges), not read off the spec sheet:

Instance	vCPU / RAM	Reliable concurrent RTMP ingest	Room-composite HLS/recording?
`t3.small`	2 / 2 GB	~5–6 sessions; collapses at 12 (loadavg 35)	No
`t3.medium`, `t3.large`, `c5.large`	2 / 4–8 GB	not pushed to collapse	No — all 2 vCPU
`c5.xlarge`	4 / 8 GB	not the bottleneck	Yes — exactly 1 concurrent (~1.5 GB RSS, 3 of 4 vCPU)

Room-composite egress (HLS-live and Chrome-based recording) needs ≥4 vCPU — LiveKit egress refuses below that (minimumCpu: 4). None of the 2-vCPU edge sizes above can serve composite HLS or recording regardless of RAM; use track/track-composite (ffmpeg, no Chrome — see Capacity planning) for small edges instead. Pure WebRTC room-serving is cheap everywhere: LiveKit sat at ~5% CPU serving 10 viewers on a t3.small-hosted room, at 13–15 fps.
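
The 4-vCPU floor is easy to check on a node before choosing an egress mode. A minimal sketch, assuming `nproc` reports the same CPU count that LiveKit egress sees; the function name is hypothetical.

```shell
# Succeeds when the vCPU count meets the minimumCpu: 4 floor quoted above.
# Pass a count explicitly, or omit it to read the local machine via nproc.
can_run_room_composite() {
  vcpus="${1:-$(nproc)}"
  [ "$vcpus" -ge 4 ]
}

if can_run_room_composite; then
  echo "room-composite OK"
else
  echo "use track/track-composite"
fi
```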

At the cluster level: 15 simultaneous streams placed correctly across all 5 nodes by LiveKit’s own load-based allocator; a hard-killed edge holding 7 rooms recovered playback in under 2 minutes — but publishers ingesting through that edge died and did not fail over (ingest is pinned to the node it opened the RTMP/WHIP session on).

GPU transcode — NVIDIA T4 (`g4dn.xlarge`, spot)

Stock Ubuntu 24.04 ffmpeg already ships h264_nvenc/hevc_nvenc/av1_nvenc once the driver is installed (nvidia-driver-580 from the stock repo). GPU passthrough to containers needs nvidia-container-toolkit + gpus: all (a ready-to-uncomment override in docker-compose.yml) — confirm with GET /api/v1/system/gpu (see the Transcoding / GPU section of server config).
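
Before relying on the GPU path, it is worth confirming on the host that ffmpeg actually lists the NVENC encoders. A minimal sketch; the helper names are hypothetical, and it only checks that the encoders are compiled in, not that the driver is healthy (run `nvidia-smi` for that).

```shell
# Succeeds if the local ffmpeg build lists the given encoder (e.g. h264_nvenc).
has_encoder() {
  ffmpeg -hide_banner -encoders 2>/dev/null | grep -q -- "$1"
}

# Report each NVENC encoder named above as available or missing.
report_nvenc() {
  for enc in h264_nvenc hevc_nvenc av1_nvenc; do
    if has_encoder "$enc"; then
      echo "$enc: available"
    else
      echo "$enc: missing"
    fi
  done
}
```

Run `report_nvenc` on the host after installing the driver, then query GET /api/v1/system/gpu to confirm the container sees the GPU as well.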

Workload	CPU	GPU (T4)	Gain
1080p→720p transcode (single job)	2.12x realtime (libx264)	4.6x realtime (NVENC)	~2.2x
1080p→720p transcode, 10 concurrent jobs	—	each ≥1.67x realtime at 42% GPU util	ceiling was CPU-side software decode, not the GPU — extrapolated ~16 jobs with full on-GPU decode+encode
deface/CenterFace @640×360 (face blur)	52.1 ms/frame	12.0 ms/frame	~4.3x
YOLOv8n inference	59.4 ms/frame	8.9 ms/frame	~6.7x

CUDA execution was verified active (not silently falling back), and CPU fallback was verified graceful when CUDA is unavailable.
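
The Gain column is just the ratio of the two measurements, and the per-frame times convert directly to a sustainable frame rate. The arithmetic, sketched with two hypothetical helpers:

```shell
# ratio A B -> A/B to one decimal place
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'; }
# fps MS -> frames per second sustained at MS milliseconds per frame
fps() { awk -v ms="$1" 'BEGIN { printf "%.0f\n", 1000 / ms }'; }

ratio 4.6 2.12    # transcode: NVENC vs libx264
ratio 52.1 12.0   # deface/CenterFace: CPU vs T4
ratio 59.4 8.9    # YOLOv8n: CPU vs T4
fps 12.0          # deface on the T4
fps 8.9           # YOLOv8n on the T4
```

This prints 2.2, 4.3, 6.7, 83, and 112: the three gains from the table, and roughly 83 fps and 112 fps for face blur and YOLOv8n on the T4.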

S3 recording round-trip

Validated against real AWS S3 with a bucket-scoped IAM user (inline policy naming only that bucket — never account root keys):

# scoped IAM: create user + inline policy limited to one bucket, then one access key
aws iam create-user --user-name streamhub-poc-s3
aws iam put-user-policy --user-name streamhub-poc-s3 \
  --policy-name streamhub-poc-bucket-only --policy-document file://policy.json
aws iam create-access-key --user-name streamhub-poc-s3

# point the app at it
curl -X PUT "$BASE/apps/live/s3" -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"provider":"aws","bucket":"<bucket>","region":"us-east-1","endpoint":"","key":"...","secret":"..."}'
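
The policy.json referenced above is not reproduced on this page. A minimal sketch of a bucket-scoped policy follows; the action list is an assumption (list, put, get, delete) and should be checked against what the app actually calls, and `write_bucket_policy` is a hypothetical helper.

```shell
# Emit an IAM policy that names exactly one bucket.
# Assumption: the app needs ListBucket on the bucket and Put/Get/Delete on its objects.
write_bucket_policy() {
  bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}
```

Generate the file with `write_bucket_policy <bucket> > policy.json` before running `put-user-policy`.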

RTMP publish → POST /recording/start → stop → VOD reaches ready → object present in the bucket → presigned URL returns 200 → valid MP4 (H.264 + AAC). GET /apps/:app/s3 never echoes the key/secret back. Full details on the S3 config schema in the s3 field reference in the per-app config.yaml docs.
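
The last three checks in that chain (object present, presigned URL returns 200, valid H.264 + AAC) can be reproduced from the CLI. This is a sketch under assumptions: it presigns with `aws s3 presign` as a stand-in for the app's own presigned URL, the object key is whatever the recording was stored under, and `verify_recording` is a hypothetical name.

```shell
# verify_recording BUCKET KEY
# Checks that the object exists, that a presigned GET returns 200,
# and that ffprobe reports both an h264 and an aac stream.
verify_recording() {
  bucket="$1"; key="$2"
  aws s3api head-object --bucket "$bucket" --key "$key" >/dev/null || return 1
  url=$(aws s3 presign "s3://$bucket/$key" --expires-in 300) || return 1
  status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
  [ "$status" = "200" ] || { echo "presigned URL returned $status"; return 1; }
  codecs=$(ffprobe -v error -show_entries stream=codec_name -of csv=p=0 "$url" | tr '\n' ' ')
  case "$codecs" in
    *h264*aac*|*aac*h264*) echo "ok: $codecs" ;;
    *) echo "unexpected codecs: $codecs"; return 1 ;;
  esac
}
```

Run it with credentials that can read the bucket, for example `verify_recording <bucket> <object-key>`.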

Teardown discipline

Every PoC resource is tagged Project=streamhub-poc at creation, deleted in reverse order immediately after results are captured, and confirmed gone with an audit query — this is what keeps repeated EC2 testing cheap and leak-free:

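aws ec2 terminate-instances --instance-ids <id...>      # DeleteOnTermination:true on the root volume
aws ec2 wait instance-terminated --instance-ids <id...> # the SG can't be deleted while instances still reference it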
aws ec2 delete-security-group --group-id <sg-id>
aws ec2 delete-key-pair --key-name streamhub-poc
aws s3 rm s3://<bucket> --recursive && aws s3 rb s3://<bucket>
aws iam delete-access-key --user-name streamhub-poc-s3 --access-key-id <akid>
aws iam delete-user-policy --user-name streamhub-poc-s3 --policy-name streamhub-poc-bucket-only
aws iam delete-user --user-name streamhub-poc-s3

# audit — every query below must return empty
aws ec2 describe-instances --filters "Name=tag:Project,Values=streamhub-poc" \
  "Name=instance-state-name,Values=pending,running,stopping,stopped"
aws ec2 describe-security-groups --filters "Name=tag:Project,Values=streamhub-poc"
aws s3api list-buckets --query "Buckets[?starts_with(Name,'streamhub-poc')].Name"
aws iam list-users --query "Users[?starts_with(UserName,'streamhub-poc')].UserName"

No Elastic IPs were allocated in any of the three PoCs, so there’s no separate IP-release step.
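
The audit is easier to trust when it fails loudly instead of relying on someone reading JSON. A minimal sketch that wraps the same four queries, adding `--query` and `--output text` so an empty result is an empty string; the function names are hypothetical.

```shell
# audit_empty LABEL COMMAND...
# Runs the command and reports a leak if it prints anything.
audit_empty() {
  label="$1"; shift
  found=$("$@") || { echo "ERROR: $label: query failed"; return 2; }
  if [ -n "$found" ] && [ "$found" != "None" ]; then
    echo "LEAK: $label: $found"
    return 1
  fi
  echo "clean: $label"
}

# Runs all four audit queries; returns non-zero if any of them found something.
run_audit() {
  rc=0
  audit_empty instances aws ec2 describe-instances \
    --filters "Name=tag:Project,Values=streamhub-poc" \
              "Name=instance-state-name,Values=pending,running,stopping,stopped" \
    --query 'Reservations[].Instances[].InstanceId' --output text || rc=1
  audit_empty security-groups aws ec2 describe-security-groups \
    --filters "Name=tag:Project,Values=streamhub-poc" \
    --query 'SecurityGroups[].GroupId' --output text || rc=1
  audit_empty buckets aws s3api list-buckets \
    --query "Buckets[?starts_with(Name,'streamhub-poc')].Name" --output text || rc=1
  audit_empty iam-users aws iam list-users \
    --query "Users[?starts_with(UserName,'streamhub-poc')].UserName" --output text || rc=1
  return "$rc"
}
```

`run_audit && echo "teardown verified"` then gives a single pass/fail line at the end of each PoC.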