Technical Architecture

SALAM ChatApp

A super-app combining chat, reels, livestream, stories, marketplace and crypto — designed for 500K+ users across Middle East & Southeast Asia.

💬 Chat + Calls 🎬 Reels 📖 Stories 📡 Livestream 🔒 E2EE Security 💰 Salam Pay 🤖 Native Android (Kotlin) 🍎 Native iOS (Swift)
What we're building

5 Core Modules

Each module is independently scalable — like having separate apps that share one identity and one login. Built as two platform-optimized native apps: Android in Kotlin and iOS in Swift, each tuned for maximum performance on its platform.

💬
Chat
Real-time 1-to-1 and group messaging for up to 256 members, with voice & video calls — all fully end-to-end encrypted. Built on WebSocket + WebRTC.
100K concurrent users
🎬
Reels
Short videos across 5 discovery tabs: Spotlight (nearby creators), Explore, Friends, Following, and a machine-learning Suggested feed.
50K concurrent viewers
📖
Stories
24-hour expiring posts with reactions, replies via chat, and viewer tracking, in the style of Instagram and WhatsApp Stories.
40K concurrent viewers
📡
Livestream
Live streaming with real-time gifts, viewer engagement, multi-camera switching, and fan monetization. Like TikTok Live — up to 1 million viewers per stream.
10K concurrent streams
📷
Camera Tools
AR filters, dual-camera (front + rear simultaneously), night mode, beauty effects, and face tracking — the creative engine powering all content on the platform.
On-device · Native SDK
📱 Native Mobile Apps: Android (Kotlin) + iOS (Swift). Two platform-optimized codebases, each built with the native SDK. Android uses Kotlin + Jetpack Compose; iOS uses Swift + SwiftUI. Native apps give the lowest-latency camera access, on-device media processing, hardware-accelerated video, and smooth 60–120 fps scrolling, a deliberate engineering choice for a media-heavy platform.
📱 Platform-Optimised Native Pipelines
Two independent native apps, each tuned to its platform's hardware and frameworks — converging on one shared backend, with the correct push channel per OS.
🤖 Android · Kotlin
Jetpack Compose · CameraX · ExoPlayer · NDK
🎨 UI Layer: Jetpack Compose
📷 Capture: CameraX
🎞️ Media Engine: ExoPlayer · NDK
🔌 Realtime: OkHttp WebSocket
📦 Distribution: Play Store · AAB
🍎 iOS · Swift
SwiftUI · AVFoundation · Metal · CallKit
🎨 UI Layer: SwiftUI
📷 Capture: AVFoundation
🎞️ Media Engine: AVPlayer · Metal
🔌 Realtime: URLSession WebSocket
📦 Distribution: App Store · TestFlight
⬇️ Shared Backend — API Gateway · WebSocket · Kafka · Redis · MongoDB
Both native clients speak the same authenticated, versioned API — one backend, two first-class front ends.
🤖
FCM — Android push
Firebase Cloud Messaging · data + notification
🍎
APNS — iOS push
Apple Push Notification service · token-based
Live Chat · Interactive

How Chat Works

When you tap "Send", your message travels through 8 steps, typically in under 100 milliseconds (the committed target is under 200 ms for online recipients). It is encrypted end to end, so not even we can read it.

✉️ A Message's Journey — Tap Send to Their Screen
📱 1
You Type
& tap Send
🚦 2
Gateway
Verifies identity
🔒 3
Encrypt
Only they can read
💾 4
Stored
Database saves it
🔴 5
Check Online?
Are they connected?
6
Deliver Live
Via open connection
🔔 7
Push Notify
If they're offline
8
Delivered
Double tick appears
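The server-side part of this journey can be sketched as follows. This is a minimal illustration, not the production service: the class name `ChatServer`, its fields and the return strings are hypothetical, in-memory lists stand in for the database, sockets and push providers, and the message is assumed to arrive already encrypted on the sender's device, so the server only handles opaque bytes.

```python
from dataclasses import dataclass, field


@dataclass
class ChatServer:
    """Hypothetical in-memory stand-in for the gateway + chat service."""
    online: set = field(default_factory=set)    # recipients with an open socket
    stored: list = field(default_factory=list)  # stand-in for the message database
    live: list = field(default_factory=list)    # frames sent over open sockets
    pushes: list = field(default_factory=list)  # push notifications requested

    def handle(self, sender: str, recipient: str,
               ciphertext: bytes, token_ok: bool) -> str:
        if not token_ok:                         # gateway verifies identity
            return "rejected"
        # Persist the opaque ciphertext; the server never sees plaintext.
        self.stored.append((sender, recipient, ciphertext))
        if recipient in self.online:             # is the recipient connected?
            self.live.append((recipient, ciphertext))  # deliver over the socket
            return "delivered"
        self.pushes.append(recipient)            # otherwise request a push
        return "push_sent"
```

For example, with `ChatServer(online={"fatima"})`, a message to `"fatima"` is delivered live, while a message to a user who is not in `online` results in a push request.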
🔒 End-to-End Encryption — Nobody Can Read Your Messages
Not even our engineers. The message is locked on your device before it leaves. Only the recipient's phone holds the key.
👤
Ahmed (Sender)
Private Key: on his phone only
Message is locked with Fatima's public key before it ever leaves his device. Only Fatima's private key can unlock it.
Locked with
Fatima's key
🔐→
x9Km#P2…Qr7!
🖥️
Our Servers
⚠️ We only store scrambled text. Zero ability to read messages — by design.
Only Fatima
can unlock
→🔓
x9Km#P2…Qr7!
👩
Fatima (Receiver)
Private Key: on her phone only
Her phone silently unlocks the message using her private key. Completely automatic — no user action needed.
⚡ How We Handle 100,000 People Chatting Simultaneously
Like a massive telephone switchboard — automated, instant, and split across 50 server pods.
🏢
50 Server Pods
Each pod holds 2,000 live connections — auto-scales to 500 pods under load
🔀
Redis Message Bus
Routes messages between pods instantly. 1 million events per second.
🍪
Sticky Sessions
Your native app always reconnects to the same pod for consistency
💗
Heartbeat Every 30s
Keeps connection alive, detects silent disconnects within 90 seconds
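Two of the figures above follow from simple arithmetic, sketched here. The per-pod connection count and the 30-second heartbeat come from the text; treating the 90-second detection window as three missed heartbeats is an assumption, and the function names are illustrative.

```python
import math

CONNECTIONS_PER_POD = 2_000   # from the text
HEARTBEAT_INTERVAL_S = 30     # from the text
MISSED_BEATS_ALLOWED = 3      # assumption: 3 x 30 s gives the 90 s window


def pods_needed(concurrent_users: int) -> int:
    """Pods required if each pod holds CONNECTIONS_PER_POD live sockets."""
    return math.ceil(concurrent_users / CONNECTIONS_PER_POD)


def is_silently_disconnected(last_heartbeat_s: float, now_s: float) -> bool:
    """True once a client has been quiet for the whole detection window."""
    window = HEARTBEAT_INTERVAL_S * MISSED_BEATS_ALLOWED
    return now_s - last_heartbeat_s >= window
```

`pods_needed(100_000)` gives the 50 pods quoted above, and `pods_needed(1_000_000)` gives the 500-pod ceiling.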
Live System · Interactive

Live Chat Simulation

A working model of our real-time backend. Type a message and watch it travel — encrypted — through the gateway, message queue, Redis fan-out and database, then arrive on the other device with live delivery receipts. No backend required; every event is simulated faithfully.

Connected
2
socket sessions
Messages/sec
0
throughput
Avg Latency
11ms
p50 round-trip
Queue Depth
0
Kafka backlog
Active Pods
4
auto-scaled
Delivered
0
since reset
Connected Devices — try typing & sending
🧑
Ahmed
online · pod-7
👩
Fatima
online · pod-3
🔒 E2E Encrypted · Delivery Receipts · ⌨️ Debounced Typing · 🔁 Auto-Retry
Event-Driven Backend — nodes light up in real time
📱 Client · Native iOS / Android · TLS 1.3
🚦 API Gateway · JWT verify · AUTH ✓
⚖️ Load Balancer · NGINX least-conn · ROUTE
Socket.IO Pod · WS frame parse · RATE-LIM
🔀 Redis Pub/Sub · Cross-pod fan-out · BROADCAST
📨 Kafka Queue · Async writes · QUEUED
🍃 MongoDB · Sharded persist · SAVED
🔎 Elasticsearch · Full-text index · INDEXED
📲 Delivery · Recipient socket · PUSHED
📈Horizontal Scale 🧩Microservices 📨Message Queue 🛟Fault Tolerant
🧠 AI Enhancement Layer
Every message stream is observed by lightweight AI services running alongside the chat pipeline.
🛡️ AI Moderation
99.2% clean
Toxicity & abuse scoring per message
🚫 Spam Detection
0 flagged
Real-time pattern + URL analysis
📊 Anomaly Detection
Nominal
Traffic spike & fraud watch
💬 Support Assistant
Standby
Auto-reply & intent routing
🔐 Group Chat, Reactions & End-to-End Encryption
A full production messaging feature set — multi-user rooms with live presence, reactions, edit/delete, reply threads and a step-by-step view of how every message is encrypted on-device before it ever leaves the phone.
👥
Salam Builders · group
4 online · 12 members
🔄 synced
3 devices
😀Reactions
↩️Reply threads
✏️Edit message
🗑️Delete message
📌Pinned messages
📎Media & files
🎤Voice notes
✅✅Read receipts
📱Multi-device sync
🔁Retry failed
End-to-End Encryption (Signal-style)
📝
Plaintext
🔑
Session Key
🔒
AES-256-GCM
📡
Transfer
🔓
Decrypt
🟢 Recipient public key
🔑 Ephemeral session key
🔴 Private key (on-device)
Keys are generated per session via X3DH + Double Ratchet. The server only ever sees ciphertext, so neither it nor our engineers can read messages. Only the recipient's device holds the private key needed to decrypt.
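To make the "ephemeral session key" idea concrete, here is a minimal sketch of only the symmetric-key chain used inside a Double Ratchet: each step derives a fresh 32-byte message key (suitable as an AES-256-GCM key) and a new chain key, so one leaked message key does not reveal the others. The X3DH handshake and the Diffie-Hellman ratchet are omitted, the function names are illustrative, and a production app should rely on a vetted protocol library rather than code like this.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Advance the chain one step; return (next_chain_key, message_key)."""
    # Separate constant inputs keep the two derived keys independent.
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def message_keys(initial_chain_key: bytes, count: int) -> list[bytes]:
    """Derive `count` successive per-message keys from a shared chain key."""
    keys = []
    chain_key = initial_chain_key
    for _ in range(count):
        chain_key, message_key = ratchet_step(chain_key)
        keys.append(message_key)
    return keys
```

Both devices start from the same chain key, so they derive the same sequence of message keys without ever sending a key over the network.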
Live System · Interactive

Voice & Video Calls

A faithful simulation of our WebRTC calling stack — signalling, ICE candidate gathering through STUN, TURN relay fallback, encrypted media and adaptive bitrate. Send an encrypted voice note, place a peer-to-peer call, and watch real-time packets flow with live connection-quality indicators.

RTT Latency
round-trip
Jitter
buffer variance
Packet Loss
0%
recovered via FEC
Bitrate
adaptive
Quality
Idle
MOS estimate
Codec
negotiated
Peer Connection — WebRTC
1080p
🧑
Ahmed
Caller · iOS (Swift)
camera off
1080p
👩
Fatima
Callee · Android (Kotlin)
camera off
🧊ICE — Connectivity Establishment
📍 STUN · public IP
🔀 ICE · srflx candidate
🛰️ TURN · relay fallback
Direct P2P is attempted first via STUN-discovered candidates. If a symmetric NAT blocks the path, media falls back transparently to an encrypted TURN relay, so the call still connects.
Encrypted Voice Note
Tap “Send Encrypted Voice Note” to record →
📶Adaptive Media
resolution
target bitrate
frame rate
The encoder watches RTT, jitter and loss every RTCP cycle and steps resolution up or down (1080→720→480→240) to keep the call smooth, the same kind of adaptive congestion control used by mainstream calling apps.
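The stepping behaviour can be sketched as a small state function. The resolution ladder comes from the text; the loss and RTT thresholds below are hypothetical values chosen for illustration, and real WebRTC stacks rely on continuous bandwidth estimation rather than fixed cut-offs.

```python
LADDER = [1080, 720, 480, 240]  # from the text, best to worst

# Hypothetical thresholds, for illustration only.
BAD_LOSS_PCT = 5.0
BAD_RTT_MS = 300.0
GOOD_LOSS_PCT = 1.0
GOOD_RTT_MS = 150.0


def next_resolution(current: int, loss_pct: float, rtt_ms: float) -> int:
    """Pick the resolution for the next interval from the latest stats."""
    i = LADDER.index(current)
    if loss_pct > BAD_LOSS_PCT or rtt_ms > BAD_RTT_MS:
        i = min(i + 1, len(LADDER) - 1)   # network is struggling: step down
    elif loss_pct < GOOD_LOSS_PCT and rtt_ms < GOOD_RTT_MS:
        i = max(i - 1, 0)                 # network is healthy: step up
    return LADDER[i]                      # otherwise hold steady
```

Moving one rung at a time, with a neutral band between the "bad" and "good" thresholds, avoids oscillating between resolutions on a borderline connection.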
WebRTC Signalling Log
🔗WebRTC 📍STUN / ICE 🛰️TURN Relay Fallback 🔒DTLS-SRTP Encryption 🎚️Opus / VP8 / H.264 📶Adaptive Bitrate 🧮Jitter Buffer + FEC 🔁Auto-Reconnect
Live System · Interactive

Voice & Video Rooms

Discord-style live audio rooms and Zoom-style video conferencing, powered by an SFU (Selective Forwarding Unit). Each participant uploads one stream; the media server fans it out to everyone — so a 50-person room scales without melting anyone's phone. Try raising a hand, muting speakers, and adding participants.

Participants
0
in room
Active Speakers
0
detected
SFU Upstream
0
streams in
SFU Downstream
0
streams out
Egress
0
kbps fan-out
Media Servers
1
auto-scaled
Live Audio Room — “Salam Builders”
SFU Media Server
🛰️
Selective Forwarding Unit
routes 1 upstream → N downstreams
in 0 out 0 relays 0
Why an SFU, not a mesh? In a full-mesh call every device sends to every other device — N×(N-1) streams. With an SFU each device sends just one stream and the server forwards the right quality layer (simulcast) to each viewer based on their bandwidth. This is how rooms scale past a handful of people.
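The stream counts behind that comparison, as a short sketch. The function names are illustrative, and the SFU figure assumes every participant receives every other participant's stream, which is an upper bound: active-speaker detection and simulcast usually forward fewer or smaller streams.

```python
def mesh_streams(n: int) -> dict[str, int]:
    """Full mesh: every device uploads one stream to each other device."""
    others = max(n - 1, 0)
    return {"uploads_per_device": others, "total_streams": n * others}


def sfu_streams(n: int) -> dict[str, int]:
    """SFU: one upload per device; the server does the fan-out."""
    others = max(n - 1, 0)
    return {
        "uploads_per_device": 1 if n > 0 else 0,
        "server_in": n,
        "server_out": n * others,  # upper bound: everyone receives everyone
    }
```

For the 50-person room mentioned above, a mesh would need 49 uploads from every phone, while the SFU needs one upload per phone and moves the remaining fan-out to the media server.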
🛰️SFU Architecture 🎚️Simulcast Layers 🔊Active-Speaker Detection 🛰️TURN Relay
Room Synchronisation Log
Modules 5 · 6 · 7

Video, Stories & Livestream

Every video goes through a pipeline that compresses it, packages it in several qualities and delivers the one that best fits each viewer's device and connection speed.

🎬 From Your Camera to 50,000 Screens
The full upload journey — from recording in the native app to a viewer watching in HD.
📱
1. Record
Native Camera (CameraX / AVFoundation)
☁️
2. Upload
S3 Multipart (5MB chunks)
⚙️
3. Transcode
FFmpeg: 4 qualities
✂️
4. Segment
HLS: 4-second chunks
🌍
5. CDN Edge
Cloudflare / CloudFront
👀
6. Watch
Adaptive quality
💡 Why 4 resolutions? The native video player (ExoPlayer on Android, AVPlayer on iOS) auto-selects a rendition to match the connection: for example 240p on slow WiFi, 480p on 4G, 720p on fast LTE, 1080p on broadband. This keeps buffering to a minimum and is called Adaptive Bitrate Streaming (ABR), the same technology Netflix uses.
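A simplified view of what the player's ABR logic decides is sketched below. ExoPlayer and AVPlayer implement this internally, so this is not their algorithm; the bitrates are the ones listed in the transcode ladder later in this document, and the 20% headroom factor is an assumption.

```python
# (height, Mbps) pairs from the transcode ladder listed in this document.
RENDITIONS = [(240, 0.4), (480, 1.2), (720, 2.8), (1080, 5.0)]
SAFETY_FACTOR = 0.8  # assumption: keep 20% of measured bandwidth in reserve


def pick_rendition(measured_mbps: float) -> int:
    """Return the tallest rendition whose bitrate fits the bandwidth budget."""
    budget = measured_mbps * SAFETY_FACTOR
    best = RENDITIONS[0][0]          # always fall back to the lowest quality
    for height, mbps in RENDITIONS:
        if mbps <= budget:
            best = height
    return best
```

Because HLS cuts every rendition at the same segment boundaries, the player can switch between them at any segment without restarting playback.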
📱 The 5 Reels Tabs — Each with Its Own Algorithm
Tap a tab to understand how each feed is powered differently.
📍 Spotlight
🔥 Explore
👥 Friends
➕ Following
✨ Suggested
📍
Spotlight — Local Creators Near You

Shows videos from creators within 50 km of your location, like a digital bulletin board for your neighbourhood. The native app sends your GPS coordinates, and the backend runs a radius query against a geospatial database (PostGIS). The cached result refreshes every 5 minutes.

PostGIS Geo Query · 50 km radius · 5 min cache TTL
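The radius filter itself can be illustrated in plain Python with the haversine formula. This is only a sketch of the idea: in PostGIS the same check would normally be an indexed distance query, and the function names and the tuple layout here are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0
SPOTLIGHT_RADIUS_KM = 50.0  # from the text


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_creators(viewer: tuple[float, float],
                    creators: list[tuple[str, float, float]]) -> list[str]:
    """Return ids of creators within the Spotlight radius of the viewer."""
    lat, lon = viewer
    return [cid for cid, clat, clon in creators
            if haversine_km(lat, lon, clat, clon) <= SPOTLIGHT_RADIUS_KM]
```

For a viewer in central Dubai, a creator in Sharjah (roughly 20 km away) is included, while one in Abu Dhabi (well over 100 km away) is not.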
🔥
Explore — What's Trending Globally Right Now

Shows videos going viral — from creators you don't follow yet. Ranked by engagement speed in the last 60 minutes. If a video gets 10,000 likes in an hour, it rises here. Refreshes every 2 minutes. No personalisation — pure trending content.

Viral Velocity Score · Refreshes every 2 min
👥
Friends — Your Mutual Connections' Videos

Only videos from people you've both connected with (mutual follow). Shows newest first — no algorithm. Pre-loaded into Redis cache when your friend posts, so the native feed scrolls instantly without a database query.

Chronological · Redis cached per user
Following — Creators You Subscribe To

Videos from accounts you follow (one-directional). When a creator posts, their video ID is instantly pushed to every follower's feed list in Redis — under 10ms. Accounts with over 10K followers use a smarter pull-on-read system to prevent server overload.

Fan-out on Write · 15 min TTL cache
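The hybrid of push (fan-out on write) and pull (fan-out on read) can be sketched with in-memory dictionaries standing in for the Redis feed lists. The class and method names are hypothetical, the 10K threshold is from the text, and the sketch ignores ordering, TTLs and pagination.

```python
from collections import defaultdict


class FollowingFeeds:
    """In-memory stand-in for per-follower feed lists."""

    def __init__(self, write_limit: int = 10_000):
        self.write_limit = write_limit        # above this, switch to pull
        self.followers = defaultdict(set)     # creator -> follower ids
        self.following = defaultdict(set)     # follower -> creator ids
        self.pushed = defaultdict(list)       # follower -> pushed video ids
        self.posts = defaultdict(list)        # creator -> own video ids

    def follow(self, follower: str, creator: str) -> None:
        self.followers[creator].add(follower)
        self.following[follower].add(creator)

    def post(self, creator: str, video_id: str) -> None:
        self.posts[creator].append(video_id)
        if len(self.followers[creator]) <= self.write_limit:
            # Small account: push the id into every follower's list now.
            for follower in self.followers[creator]:
                self.pushed[follower].append(video_id)

    def read(self, follower: str) -> list[str]:
        feed = list(self.pushed[follower])
        for creator in self.following[follower]:
            if len(self.followers[creator]) > self.write_limit:
                # Large account: pull its posts at read time instead.
                feed.extend(self.posts[creator])
        return list(dict.fromkeys(feed))      # de-duplicate, keep order
```

The write cost for a large account stays constant no matter how many followers it has; the price is a little extra work each time one of those followers opens the feed.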
Suggested — AI Learns What You Love

The "For You" page. An ML model studies your watch time, replays, skips and likes. Users who watch 80%+ of cooking videos get more cooking. Skipping under 3 seconds is a strong negative signal. Pre-computed every 15 minutes per user and served from Redis in under 50ms.

AI / ML Ranking · LightGBM Model · Pre-computed 15 min
📡 How Livestream Reaches 1 Million Viewers
Your phone → the internet → everyone's screen, in under 8 seconds.
📱 Streamer's Phone
📡 RTMP Ingest (SRS)
⚙️ FFmpeg Transcoder
☁️ S3 Storage
🌍 CDN Edge Servers
👁️ 1M+ Viewers
RTMP Protocol
The standard ingest protocol for live streaming, supported by most broadcast encoders. Reliable, with low ingest latency.
FFmpeg Transcoder
Transcodes the live feed into 3 quality levels in parallel, emitting a new segment every 2 seconds.
CDN Delivery
Edge servers in Dubai & Bahrain serve viewers, so the origin is shielded from viewer traffic.
End-to-End Latency
4–8 seconds HLS. Future LL-HLS upgrade brings it under 3 seconds.
Live System · Interactive

Reels Upload Architecture Simulation

Watch a short video travel through a production-grade, distributed pipeline — signed URLs, chunked multipart upload with retry, queue-based job dispatch, parallel transcoding across a worker pool, AI moderation, CDN distribution and sharded persistence — all visualised live.

Uploads/min
0
ingest rate
Queue Depth
0
SQS / Kafka jobs
Workers Busy
0/6
transcode pool
Transcode
0
frames/sec
CDN Hit Rate
97%
edge cache
Published
0
reels live
Client Upload — chunked & secure
🎬
reel_sunset_4k.mp4
42 MB · 0:28 · awaiting signed URL
Multipart upload (9 parts of up to 5 MB) · 0%
🔑Signed Upload URL 🧩Chunked Multipart 🔁Per-Chunk Retry ⏱️Rate Limited
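How a file is split for a chunked multipart upload can be sketched as below, using the 5 MB part size given in the upload pipeline earlier (S3 also requires at least 5 MiB for every part except the last). The function name and tuple layout are illustrative; the actual upload, retry and checksum calls are omitted.

```python
MB = 1024 * 1024
PART_SIZE = 5 * MB  # from the text; also S3's minimum for non-final parts


def plan_parts(file_size: int,
               part_size: int = PART_SIZE) -> list[tuple[int, int, int]]:
    """Return (part_number, start_byte, length) per part, numbered from 1."""
    parts = []
    offset = 0
    number = 1
    while offset < file_size:
        length = min(part_size, file_size - offset)  # last part may be short
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts
```

For the 42 MB example file this yields 9 parts: eight full 5 MB parts and a final 2 MB part. Because each part has a fixed byte range, a failed part can be retried on its own without restarting the upload.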
Distributed Worker Pool
Sharded Storage & Metadata DB
Multi-Region CDN Distribution
Async Processing Pipeline
🔑
Authorize & Sign
JWT validate → issue pre-signed S3 URL · 5 min TTL
idle
☁️
Chunked Upload → S3
9 parts · resumable · integrity checksum per chunk
idle
📨
Enqueue Job
Kafka topic video.process · partitioned by reelId
idle
🏷️
Metadata Extract
ffprobe → duration, codec, resolution, audio track
idle
⚙️
Transcode Ladder
FFmpeg → adaptive bitrate renditions (HLS)
240p
0.4 Mbps
480p
1.2 Mbps
720p
2.8 Mbps
1080p
5.0 Mbps
idle
🖼️
Thumbnail & Preview
Sprite sheet + animated WebP scrub preview
idle
🛡️
AI Moderation
Frame sampling → NSFW / violence / classification
idle
🌍
CDN Distribution
Push renditions to multi-region edge caches
idle
Persist & Publish
Sharded DB write → fan-out to recommendation feed
idle
👷Background Workers 🔀Distributed Processing 🛟Failover + Retry 🗄️DB Sharding
🧠 AI Enhancement Layer
Models run inline with the pipeline and on the published feed to keep content safe, discoverable and optimised.
🛡️ Content Moderation
Passed
NSFW / violence frame analysis
🏷️ Auto Classification
Topic tags & scene labels
✨ Recommendation
Indexed
Embedding → For-You ranking
📈 Predictive Scaling
+0 pods
Forecasts worker demand
🎚️ Smart Quality
Per-title
Optimises bitrate per scene
🚨 Anomaly Watch
Nominal
Abuse & spam-upload detection
🏛️ Enterprise-Grade Foundations — Built Into Every Flow
The same guarantees apply across chat and media: secure, observable, and resilient by default.
🔐JWT / Session Validation 🔒Encryption in Transit & at Rest ⏱️Rate Limiting 🚪API Gateway ⚖️Load Balancing 📈Auto-Scaling 📨Queue Systems 🧩Microservices 📊Monitoring & Observability 📜Centralised Logging 🛟Fault Tolerance 🌐Multi-Region DR 🗄️Caching Strategy
Live System · Interactive

Live Infrastructure

The beating heart of the backend, simulated in real time: an Apache Kafka event bus moving millions of events through partitioned topics, a Redis cluster serving hot data in microseconds, and a live operations dashboard streaming node-level activity — exactly what our on-call engineers watch.

Events/sec
0
Kafka throughput
Consumer Lag
0
unprocessed
Cache Hit Ratio
98.6%
Redis
Redis Ops/sec
0
in-memory
p99 Latency
7ms
end-to-end
Healthy Nodes
8 / 8
cluster
Kafka Event Bus — producers · topics · partitions · consumer groups
Producers
💬Chat Service
🎬Reels Service
🟢Presence Service
🔔API Gateway
Topics & Partitions
📨 chat.message.created
P0
P1
P2
🎞️ video.upload.completed
P0
P1
P2
👤 user.online.updated
P0
P1
P2
🔔 notification.dispatch.requested
P0
P1
P2
💀 Dead-Letter Queue · 0 events · auto-retry w/ backoff
Consumer Groups
📤 delivery-svc · lag 0
🌐 fanout-svc · lag 0
🔔 notification-svc · lag 0
📊 analytics-svc · lag 0
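Partitioning by key is what keeps events for one conversation or reel in order: every event with the same key lands on the same partition. A minimal sketch follows, with the three partitions shown above; `zlib.crc32` is a stand-in hash chosen for illustration (Kafka's default partitioner uses a different hash), and the key format is hypothetical.

```python
import zlib

NUM_PARTITIONS = 3  # P0..P2, as shown above


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition index, stably across processes."""
    # crc32 is a stand-in here; Kafka's own partitioner uses another hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions
```

Calling `partition_for("conv-42")` always returns the same index, so a single consumer sees that conversation's events in the order they were produced, while different conversations spread across all partitions.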
Redis — in-memory cache & pub/sub
📱
App Request
GET presence
Redis
microsecond lookup
🗄️
MongoDB
fallback on miss
cache hit · 98.6%
🔑 Hot Keys (TTL)
presence:ahmed · 30s
unread:fatima
feed:nearby:geo · 300s
session:jwt:7f33600s
🧩 Cluster (replicated · sharded)
M0 · primary
M1 · primary
M2 · primary
R · replica
Presence, unread counters, sessions and geo-feeds live in RAM — that's why the app feels instant. On a node failure a replica is promoted automatically; cold data falls back to MongoDB and re-warms the cache.
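The "cache first, database on miss, then re-warm" flow is commonly called cache-aside. A minimal sketch follows, with a dictionary standing in for Redis and a callback standing in for the MongoDB query; the class name, the injectable clock and the hit/miss counters are illustrative additions.

```python
import time


class CacheAside:
    """Read-through helper: serve from cache, fall back to the loader."""

    def __init__(self, load_from_db, ttl_s: float, clock=time.monotonic):
        self._load = load_from_db   # called only on a miss or expiry
        self._ttl = ttl_s
        self._clock = clock         # injectable so tests can control time
        self._store = {}            # key -> (value, expires_at); Redis stand-in
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        self.misses += 1
        value = self._load(key)                       # fall back to the database
        self._store[key] = (value, now + self._ttl)   # re-warm the cache
        return value
```

With a 30-second TTL, as on the `presence:ahmed` key above, repeated presence lookups inside that window never reach the database.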
Backend Operations — live node activity
API Gateway
0 rps
JWT validate · rate-limit
WebSocket
2 conns
socket fan-out
Redis
0 ops
cache · pub/sub
Kafka
0 ev
event bus
Media Worker
idle
transcode
Notification
0 push
FCM · APNS
🌊Kafka Event Streaming 🧱Partitioned Topics 👥Consumer Groups 💀Dead-Letter Queue Redis Cluster 📡Pub/Sub Fan-out 🔁Automatic Failover 📊Live Observability
Under the hood

Data Architecture

The schemas that power the platform — designed for scale from day one with proper indexes, foreign-key relationships, sharding and read/write separation. Tap any table to light up its relationships. PK = primary key · FK = foreign key · IDX = index.

💬 Chat System
🎬 Reels / Video
🟢 Presence
📡 Livestream
👤 users · shard: user_id
PK · id · uuid
IDX · phone · varchar
display_name · text
public_key · bytea
created_at · timestamptz
🔗 1→N messages · ⚡ hash-sharded
🧵 conversations · shard: conv_id
PK · id · uuid
type · enum(dm,group)
member_count · int
IDX · last_msg_at · timestamptz
🔗 1→N messages · 📌 256 members max
📨 messages · part: monthly
PK · id · uuid
FK · conversation_id · uuid
FK · sender_id · uuid
body_encrypted · bytea
reply_to · uuid?
IDX · created_at · timestamptz
🔗 conv + sender · 🗓️ time-partitioned
😀 message_reactions
PK · id · uuid
FK · message_id · uuid
FK · user_id · uuid
emoji · varchar(8)
🔗 message + user
📎 attachments
PK · id · uuid
FK · message_id · uuid
cdn_url · text
mime / size · varchar / bigint
🔗 message · ☁️ S3 + CDN
delivery_status
PK · id · uuid
FK · message_id · uuid
FK · user_id · uuid
IDX · state · sent / delivered / read
🔗 message + user
✍️Write Path
All writes go to the primary of the relevant shard. Hot tables (messages, video_views, comments) are time-partitioned so old partitions can be archived cheaply. Writes also publish a Kafka event for async fan-out.
📖Read Path
Reads are served from replicas and the Redis cache first. Presence, unread counts and feeds almost never touch Postgres/Mongo. This read/write split lets each side scale independently.
🧩Sharding
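Large tables are hash-sharded by their natural key (user_id, conversation_id, video_id, stream_id) so rows, and therefore load, spread evenly across the cluster. A single hot user or viral video still maps to one shard, which is why hot reads are served from replicas and Redis rather than from the shard primary.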
The Engine Room

Infrastructure & Scalability

The invisible foundation everything runs on. Each layer is independently scalable — adding more users means adding more pods, not rebuilding the system.

🏗️ Full System Architecture — Every Layer Connected
Hover any component to learn what it does in plain English.
📱 Client Layer — What Users Touch
iOS App (Swift)
Android App (Kotlin)
Admin Web Panel (React)
🚦 Gateway Layer — Traffic Control
Kong API Gateway
NGINX Load Balancer
WebSocket (Socket.IO)
WebRTC (Calls)
⚙️ Backend Services — The Brains (NestJS / Node.js)
Auth Service
Chat Service
Presence Service
Media Service
Reels Service
Livestream Service
Notification Service
Recommendation Service
Moderation Service
📨 Event Bus — The Nervous System
Apache Kafka (Event Bus)
BullMQ Job Queue
FCM + APNS (Push)
🗄️ Data Layer — Where Everything Is Stored
MongoDB (Messages)
PostgreSQL (Structured Data)
Redis Cluster (Speed Layer)
Elasticsearch (Search)
Amazon S3 (Media Storage)
🌍 Delivery Layer — Getting Content to Users Fast
Cloudflare (Middle East)
CloudFront (SEA Region)
SRS (Livestream Server)
AWS Route53 (DNS Routing)
💡 Hover any component for a plain-English explanation
⚡ The 3-Layer Speed System — Why the App Feels Instant
Like checking your pocket before the drawer before the warehouse. Always use the fastest source first.
L1
App Memory (In-Process)
Login keys, feature flags, static config, held in an in-process LRU cache inside each server's RAM. Zero network hops.
< 0.1ms
L2
Redis Cluster (Shared Memory)
Online status, unread counts, pre-computed feeds, session tokens, rate limits. Shared across all server pods. 6-node cluster — if one node fails, others take over automatically.
< 1ms
L3
CDN Edge (Cloudflare / CloudFront)
Videos, images, HLS segments, thumbnails. Cached at 300+ global edge locations. The server in Dubai serves Middle East users — not a server in Singapore.
< 30ms
Trust & Safety

Security & Compliance

Every message, user, and piece of content passes through multiple safety layers. We comply with Middle East data laws — and can't read your messages even if we wanted to.

🛡️ User Trust Score — Detecting Bad Actors Automatically
Every account starts at 50/100. Good behaviour raises it. Bad behaviour lowers it. The score controls what you can do — invisibly.
✅ Score Increases
Phone verified +20
Profile complete +10
Account age +0.1/day
KYC verified +20
❌ Score Decreases
Reports confirmed −10
AI flags content −5
Failed payments −10
Spam behavior max −25
🔐 Score Effects
0–20: Read-only, no posts
21–40: Limited messaging
41–70: Full features
71–100: Boosted reach + unlocks
🤖 AI Moderation
All content auto-scanned
Arabic NLP model built-in
AWS Rekognition for images
Real-time livestream frames
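The scoring rules above can be written down as a small function. The point values, the starting score and the tier bands are from the text; the event names are hypothetical labels, the spam penalty (given only as a maximum) is left out, and clamping to 0–100 is an assumption.

```python
START_SCORE = 50.0          # from the text
AGE_BONUS_PER_DAY = 0.1     # from the text

# Event names are illustrative; point values are from the text.
EVENT_POINTS = {
    "phone_verified": 20,
    "profile_complete": 10,
    "kyc_verified": 20,
    "report_confirmed": -10,
    "ai_flagged_content": -5,
    "failed_payment": -10,
}


def trust_score(events: list[str], account_age_days: int = 0) -> float:
    """Start at 50, apply each event, add the age bonus, clamp to 0..100."""
    score = START_SCORE + AGE_BONUS_PER_DAY * account_age_days
    score += sum(EVENT_POINTS[e] for e in events)
    return max(0.0, min(100.0, score))      # clamping is an assumption


def tier(score: float) -> str:
    """Map a score to the access tier listed under Score Effects."""
    if score <= 20:
        return "read-only"
    if score <= 40:
        return "limited messaging"
    if score <= 70:
        return "full features"
    return "boosted reach"
```

A brand-new account scores 50 and gets full features; verifying a phone number and completing KYC lifts it to 90 and into the boosted tier.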
⚖️ Legal Compliance — Built Into the Architecture
We don't bolt compliance on later. It's designed in from day one.
🇸🇦 Saudi PDPL — User Data Must Stay in the Region

Saudi Arabia's Personal Data Protection Law requires that data about Saudi users stays within Saudi Arabia or countries with equivalent protection. Our target deployment for these users is AWS Bahrain (me-south-1), the nearest in-region AWS data centre, so that their data stays in the region. An AWS region in Saudi Arabia is planned, and we intend to migrate once it launches.

🔏 Right to Erasure — "Delete My Account" Really Deletes Everything

When a user requests deletion, we immediately hide the account (soft-delete) then run a background job to permanently erase all their data across every database within 30 days. For encrypted messages, deleting the encryption key makes ciphertext permanently unreadable — no need to hunt down every message byte across every shard.

📋 Audit Logs — Every Admin Action is Permanently Recorded

Every time an admin views user data, moderates content, or changes system settings, it's written to an append-only immutable audit log stored in Kafka and archived to S3. These logs cannot be altered — not even by our engineers. Required for ISO 27001 and SOC 2 Type II certification in Phase 3.

🌍 Region Strategy — Singapore vs Bahrain Decision

Current Phase 1 infrastructure is in AWS Singapore (lower cost, simpler). If the primary user base is Middle East, we must deploy to AWS Bahrain before launch — the round-trip from Saudi Arabia to Singapore is 150–200ms, which noticeably degrades real-time chat quality. Phase 2 adds Bahrain with active-active routing: Middle East users automatically connect to Bahrain, Southeast Asia users to Singapore.

Execution Plan

Implementation Roadmap

Three phases over 24 months — from working native Android + iOS MVPs to a 500K user platform with full ML, compliance and global CDN.

Phase 1 — Months 0–4
Build the Foundation (MVP)
Native Android (Kotlin) + iOS (Swift) apps — login via phone OTP, Google, QR code, user profiles
1-to-1 chat + group chat (50 members) + Firebase push notifications
Video upload pipeline (S3 + FFmpeg) + basic Following tab only
24-hour Stories creation, viewing and auto-expiry
Livestream MVP: RTMP ingest + HLS delivery + basic viewer chat
End-to-end encryption (Signal Protocol) for all chat messages
🎯 1K concurrent users · 500 streams · AWS Singapore monolith
Phase 2 — Months 4–12
Scale Up & Expand to Middle East
Extract Chat & Presence into independent microservices + Apache Kafka event bus
Launch all 5 Reels tabs including AI-powered Suggested feed
Deploy AWS Bahrain region + Cloudflare CDN for Middle East users
Scale group chat to 256 members
Livestream gifts with Salam Pay wallet integration
Full AI content moderation pipeline (Arabic NLP + AWS Rekognition)
🎯 50K concurrent users · 5K streams · Multi-CDN active
Phase 3 — Months 12–24
Production-Grade at Global Scale
Full microservices with Istio service mesh for automatic traffic management
Advanced LightGBM ML recommendation + A/B testing framework
LL-HLS: reduce livestream latency from 8s to under 3s globally
ISO 27001 + SOC 2 Type II compliance certifications
GPU-accelerated AR filters and premium creator tools
DRM content protection for paid livestream events
🎯 500K concurrent users · 50K streams · p99 latency <150ms
⚠️ Top Risks & How We Prevent Them
Known risks with concrete mitigation strategies — not discovered later, addressed now.
1
AWS Singapore too far from Saudi Arabia — 150–200ms latency on real-time chat
💡 Deploy to AWS Bahrain before Middle East marketing launch. Route53 latency routing sends ME users to Bahrain, SEA users to Singapore automatically.
CRITICAL
2
User loses their device → encryption keys lost → all messages permanently unreadable
💡 Encrypted key backup to cloud storage (same approach as WhatsApp's Google Drive backup), protected by a user-chosen passphrase we never see.
CRITICAL
3
Upload surge: 50,000 simultaneous uploads overwhelm the transcoding queue
💡 Kubernetes autoscaler watches the queue-depth metric and spins up AWS spot-instance worker pods (up to 80% cheaper than on-demand) within 2 minutes of a queue spike.
HIGH
4
Celebrity with 5 million followers posts — fan-out to 5M feeds causes server overload
💡 Accounts above 10K followers use "fan-out on read": we don't pre-push to 5M feeds. Followers fetch the content themselves when they open the app, a common strategy on large social platforms.
HIGH
5
Arabic content moderation is inaccurate — harmful content reaches users
💡 Train a custom Arabic NLP model on regional dialect data. Use human review queue for uncertain cases. OpenAI Moderation API as secondary verification layer.
MEDIUM
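The queue-depth autoscaling in risk 3 reduces to one calculation, sketched here. All four parameters are hypothetical tuning values rather than figures from this plan; a real deployment would feed the queue-depth metric to the cluster autoscaler instead of calling a function like this directly.

```python
import math


def desired_workers(queue_depth: int, jobs_per_worker: int,
                    min_workers: int, max_workers: int) -> int:
    """Workers needed to drain the queue, bounded by the pool limits."""
    wanted = math.ceil(queue_depth / jobs_per_worker)
    return max(min_workers, min(max_workers, wanted))
```

With a target of 25 queued jobs per worker, a backlog of 1,000 jobs asks for 40 workers, while a spike of 50,000 jobs is capped at whatever `max_workers` allows, which protects the budget and downstream storage.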
📊 Performance Commitments (SLOs)
Measurable targets we commit to — with automated alerts when they're breached.
<200ms
Message delivery (online)
Chat Module
<5s
Push notification (offline)
Chat Module
99.9%
Chat uptime / month
≤44 min downtime
<1.5s
Video start time (p75)
Reels Module
<150ms
Feed API response
Reels Module
<8s
Stream viewer latency
Livestream (HLS)
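The downtime figure attached to the 99.9% uptime target can be checked with a one-line calculation. The function name and the 30-day default are illustrative; a longer month gives a slightly larger budget.

```python
def downtime_budget_minutes(availability_pct: float, days: float = 30.0) -> float:
    """Minutes of allowed downtime in a period at the given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)
```

At 99.9% over a 30-day month the budget is about 43.2 minutes, and over a 31-day month about 44.6 minutes, which is where the roughly 44-minute figure above comes from.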