Design Twitter / X

System design for a Twitter-like social media platform handling tweets, timelines, and real-time updates at scale.

4 min read · 2026-03-22 · hard · system-design, twitter, social-media, fan-out

Requirements

Functional Requirements

  1. Post tweets (text, images, videos)
  2. Follow/unfollow users
  3. Home timeline — see tweets from followed users
  4. User timeline — see a user's tweets
  5. Search tweets and users
  6. Like and retweet

Non-Functional Requirements

  • Scale: 500M users, 200M DAU
  • Availability: 99.99% uptime
  • Latency: Timeline loads < 200ms
  • Throughput: ~12K tweets/sec on average (1B tweets/day ÷ 86,400 s); provision for peaks several times higher

Estimation

  • 200M DAU × 5 tweets/day = 1B tweets/day
  • Average tweet size: ~300 bytes
  • Storage: 1B × 300B = 300GB/day (text only)
  • With media: ~5TB/day
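The arithmetic above can be checked with a few lines of Python. This is only a back-of-envelope sketch: the inputs are the article's own assumptions, and the per-second and per-year figures are derived values that the text does not state.

```python
# Back-of-envelope numbers from the estimation above.
DAU = 200_000_000               # daily active users
TWEETS_PER_USER_PER_DAY = 5     # assumed average from the estimate
AVG_TWEET_BYTES = 300           # text only
SECONDS_PER_DAY = 86_400

tweets_per_day = DAU * TWEETS_PER_USER_PER_DAY
avg_tweets_per_sec = tweets_per_day / SECONDS_PER_DAY
text_gb_per_day = tweets_per_day * AVG_TWEET_BYTES / 1e9
text_tb_per_year = text_gb_per_day * 365 / 1000

print(f"tweets/day:      {tweets_per_day:,}")
print(f"avg tweets/sec:  {avg_tweets_per_sec:,.0f}")
print(f"text GB/day:     {text_gb_per_day:,.0f}")
print(f"text TB/year:    {text_tb_per_year:,.1f}")
```

The average write rate comes out near 12K tweets/sec, which is a useful sanity check against any peak-throughput target.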

High-Level Architecture

                    ┌──────────────┐
  Clients ─────────→│ API Gateway  │
                    └──────┬───────┘
            ┌──────────────┼──────────────┐
            ▼              ▼              ▼
      ┌──────────┐  ┌──────────┐  ┌──────────┐
      │ Tweet    │  │ Timeline │  │ Search   │
      │ Service  │  │ Service  │  │ Service  │
      └────┬─────┘  └────┬─────┘  └────┬─────┘
           │              │              │
      ┌────▼─────┐  ┌────▼─────┐  ┌────▼─────┐
      │ Tweet DB │  │ Timeline │  │ Search   │
      │ (MySQL)  │  │ Cache    │  │ Index    │
      └──────────┘  │ (Redis)  │  │ (Elastic)│
                    └──────────┘  └──────────┘

The Fan-Out Problem

The core challenge is timeline generation. When you open Twitter, you see tweets from everyone you follow, sorted by time.

Fan-Out on Write (Push Model)

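When a user tweets, push the tweet ID to every follower's precomputed timeline. Reads are fast because the timeline is already built, but each tweet costs one write per follower, which becomes expensive for accounts with millions of followers.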
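A minimal in-memory sketch of the push model may help. The names `followers`, `timelines`, `follow`, `post_tweet`, and `home_timeline` are hypothetical, and plain Python dictionaries stand in for the real follower graph and timeline cache.

```python
from collections import defaultdict, deque

MAX_TIMELINE_LEN = 800  # same cap this design uses for cached timelines

followers = defaultdict(set)  # author_id -> set of follower ids
timelines = defaultdict(lambda: deque(maxlen=MAX_TIMELINE_LEN))  # newest first


def follow(follower_id, author_id):
    followers[author_id].add(follower_id)


def post_tweet(author_id, tweet_id):
    """Push the tweet ID onto every follower's timeline.

    Returns the number of timeline writes, which equals the follower count.
    """
    for follower_id in followers[author_id]:
        # appendleft keeps newest first; a full deque drops its oldest entry.
        timelines[follower_id].appendleft(tweet_id)
    return len(followers[author_id])


def home_timeline(user_id, limit=20):
    """Reading is a cheap slice of the precomputed list."""
    return list(timelines[user_id])[:limit]
```

The return value of `post_tweet` makes the write cost visible: one tweet from an account with N followers triggers N timeline writes.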

Fan-Out on Read (Pull Model)

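When a user opens their timeline, fetch recent tweets from every followed user at that moment and merge them by time. Writes are cheap because a tweet is stored once, but each read touches many accounts, so timeline loads are slower.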
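The pull model can be sketched the same way. Here `following`, `user_tweets`, `post_tweet`, and `home_timeline` are hypothetical names, and the sketch assumes each author's tweets arrive with increasing timestamps so that every per-author list stays sorted newest first.

```python
import heapq
from collections import defaultdict
from itertools import islice

following = defaultdict(set)      # user_id -> set of followed author ids
user_tweets = defaultdict(list)   # author_id -> [(created_at, tweet_id)], newest first


def post_tweet(author_id, tweet_id, created_at):
    """A write touches only the author's own list."""
    user_tweets[author_id].insert(0, (created_at, tweet_id))


def home_timeline(user_id, limit=20):
    """Merge every followed author's tweets at read time."""
    per_author = [user_tweets[author] for author in following[user_id]]
    # Inputs are sorted newest first, so merge in descending order.
    merged = heapq.merge(*per_author, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]
```

The read cost grows with the number of followed accounts, which is the mirror image of the write cost in the push model.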

Hybrid Approach (Twitter's Actual Design)

Best of Both Worlds

  • Normal users (< 10K followers): Fan-out on write
  • Celebrities (≥ 10K followers): Fan-out on read

When loading a timeline, merge the user's precomputed timeline with live queries for recent tweets from the celebrities they follow.
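The merge step might look like the following sketch. The function names are hypothetical, the 10K threshold is the illustrative cutoff from the list above (an account with exactly 10K followers is treated as a celebrity here, which is an assumption), and both inputs are assumed to be lists of `(created_at, tweet_id)` pairs sorted newest first.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 10_000  # illustrative cutoff from the list above


def uses_fan_out_on_write(follower_count):
    """Accounts below the threshold push tweets to followers at write time."""
    return follower_count < CELEBRITY_THRESHOLD


def hybrid_timeline(precomputed, celebrity_feeds, limit=20):
    """Merge the cached timeline with tweets pulled from followed celebrities.

    precomputed:     [(created_at, tweet_id)], newest first (from the cache)
    celebrity_feeds: one such list per followed celebrity (queried live)
    """
    merged = heapq.merge(precomputed, *celebrity_feeds, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]
```

Because every input is already sorted, the merge only needs to look at the head of each list, so the read-time cost depends on how many celebrities a user follows rather than on how many accounts they follow in total.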

Data Storage

Tweet Storage (MySQL/PostgreSQL)

  Column       Type
  tweet_id     BIGINT (Snowflake ID)
  user_id      BIGINT
  content      VARCHAR(280)
  media_urls   JSON
  created_at   TIMESTAMP

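Sharded by user_id, so all of a user's tweets live on one shard and a user timeline is a single-shard query. Very active accounts can still create hot shards.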
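One way to route by user_id is a stable hash modulo the shard count, sketched below. The shard count and the function name `shard_for_user` are hypothetical, and hashing before the modulo is an assumption chosen so that structured IDs do not cluster on a few shards.

```python
import hashlib

NUM_SHARDS = 64  # hypothetical shard count


def shard_for_user(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user_id to a shard index in [0, num_shards).

    A stable hash (not Python's built-in hash()) keeps the mapping identical
    across processes and restarts.
    """
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo routing remaps most users whenever `num_shards` changes; consistent hashing or a lookup directory are common alternatives when resharding is expected.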

Timeline Cache (Redis)

Each user's home timeline is a sorted set in Redis:

  • Key: timeline:{user_id}
  • Value: tweet IDs scored by creation time, trimmed to the most recent 800
  • TTL: 7 days
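The key layout above maps onto a handful of sorted-set commands. The sketch below assumes a redis-py style `client` object exposing `zadd`, `zremrangebyrank`, `expire`, and `zrevrange`; the function names and the millisecond score are illustrative choices rather than part of the original design.

```python
TIMELINE_MAX_LEN = 800
TIMELINE_TTL_SECONDS = 7 * 24 * 3600  # 7 days


def timeline_key(user_id):
    return f"timeline:{user_id}"


def push_to_timeline(client, user_id, tweet_id, created_at_ms,
                     max_len=TIMELINE_MAX_LEN):
    """Add one tweet ID to a user's cached home timeline."""
    key = timeline_key(user_id)
    # ZADD: the score is the creation time, so the set stays time-ordered.
    client.zadd(key, {str(tweet_id): created_at_ms})
    # ZREMRANGEBYRANK: ranks ascend by score, so this drops everything
    # except the newest max_len entries.
    client.zremrangebyrank(key, 0, -(max_len + 1))
    # EXPIRE: refresh the TTL so active timelines stay cached.
    client.expire(key, TIMELINE_TTL_SECONDS)


def read_timeline(client, user_id, limit=20):
    """ZREVRANGE returns the highest scores (newest tweets) first."""
    return client.zrevrange(timeline_key(user_id), 0, limit - 1)
```

With a real redis-py client, members come back as bytes unless the client was created with `decode_responses=True`. In production the three write commands would typically be sent in one pipeline to save round trips.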

Media Storage

  • Object Storage (S3) for images and videos
  • CDN for serving media globally with low latency

ID Generation

Snowflake IDs

Twitter created Snowflake to generate unique, time-sortable IDs across distributed systems without coordination. A 64-bit ID contains: unused sign bit (1 bit) + timestamp in milliseconds since a custom epoch (41 bits) + datacenter ID (5 bits) + machine ID (5 bits) + sequence number (12 bits).
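The bit layout can be made concrete with a small packing and unpacking sketch. The function names are hypothetical, and the epoch constant is assumed to match the value used in Twitter's open-source Snowflake; any fixed epoch works as long as every generator agrees on it.

```python
TWITTER_EPOCH_MS = 1288834974657  # assumed custom epoch (ms since Unix epoch)

TIMESTAMP_BITS = 41
DATACENTER_BITS = 5
MACHINE_BITS = 5
SEQUENCE_BITS = 12

MACHINE_SHIFT = SEQUENCE_BITS                          # 12
DATACENTER_SHIFT = MACHINE_SHIFT + MACHINE_BITS        # 17
TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS   # 22


def _check(name, value, bits):
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name} must fit in {bits} bits, got {value}")


def make_snowflake(timestamp_ms, datacenter_id, machine_id, sequence):
    """Pack the four fields into one integer that fits in a signed 64-bit value."""
    elapsed = timestamp_ms - TWITTER_EPOCH_MS
    _check("timestamp", elapsed, TIMESTAMP_BITS)
    _check("datacenter_id", datacenter_id, DATACENTER_BITS)
    _check("machine_id", machine_id, MACHINE_BITS)
    _check("sequence", sequence, SEQUENCE_BITS)
    return ((elapsed << TIMESTAMP_SHIFT)
            | (datacenter_id << DATACENTER_SHIFT)
            | (machine_id << MACHINE_SHIFT)
            | sequence)


def parse_snowflake(snowflake_id):
    """Recover the fields from an ID."""
    return {
        "timestamp_ms": (snowflake_id >> TIMESTAMP_SHIFT) + TWITTER_EPOCH_MS,
        "datacenter_id": (snowflake_id >> DATACENTER_SHIFT) & ((1 << DATACENTER_BITS) - 1),
        "machine_id": (snowflake_id >> MACHINE_SHIFT) & ((1 << MACHINE_BITS) - 1),
        "sequence": snowflake_id & ((1 << SEQUENCE_BITS) - 1),
    }
```

Because the timestamp occupies the highest bits, sorting by ID also sorts by creation time to millisecond resolution. A real generator would additionally track the last timestamp per machine and increment the sequence for IDs issued within the same millisecond.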

Key Trade-offs

  Decision           Choice          Reasoning
  Fan-out strategy   Hybrid          Balances write cost vs read latency
  Database           MySQL + Redis   Proven at scale, Redis for fast timelines
  Search             Elasticsearch   Full-text search, real-time indexing
  Media storage      S3 + CDN        Cost-effective, globally distributed
  ID generation      Snowflake       Time-sortable, no coordination needed

Don't Forget

In the interview, always discuss: rate limiting, spam detection, content moderation, and how you'd handle trending topics.
