Scalability

Understanding horizontal vs vertical scaling, and how to design systems that handle growth gracefully.

3 min read | 2026-03-22 | easy | scalability, fundamentals, system-design

What is Scalability?

Scalability is the ability of a system to handle increased load by adding resources. A scalable system can grow to accommodate more users, data, or transactions without sacrificing performance.

Key Insight

Scalability isn't just about handling more traffic — it's about doing so cost-effectively while maintaining reliability and performance.

Types of Scaling

Vertical Scaling (Scale Up)

Adding more power to an existing machine — more CPU, RAM, or storage.

Aspect      Details
Pros        Simple, no code changes needed
Cons        Hardware limits, single point of failure
Use When    Small to medium loads, databases

Horizontal Scaling (Scale Out)

Adding more machines to distribute the load.

Aspect      Details
Pros        Near-infinite scaling, fault tolerant
Cons        Complex distributed systems, data consistency
Use When    Web servers, stateless services
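
As a rough illustration of distributing load across machines, the sketch below hands out requests to a pool of backends in round-robin order. The `RoundRobinBalancer` class and the backend names are hypothetical and only for illustration; real load balancers also account for health checks and uneven request costs.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hypothetical balancer: hands out backends in rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._cycle = itertools.cycle(self._backends)

    def next_backend(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._cycle)


# Assumed backend names, for illustration only.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = Counter(balancer.next_backend() for _ in range(9))
```

Because the backends hold no per-client state, adding a fourth machine would only mean adding one more name to the pool.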

Rule of Thumb

Prefer horizontal scaling for stateless services and vertical scaling for databases — until you need sharding.
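
To make the sharding step more concrete, here is a minimal sketch of hash-based shard routing. The function name `shard_for`, the shard count, and the user IDs are assumptions chosen for the example, not part of any particular database.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then take the remainder.
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical user IDs routed across 4 shards.
routes = {uid: shard_for(uid, 4) for uid in ["user-1", "user-2", "user-3"]}
```

A stable hash is used instead of Python's built-in `hash()`, which is randomized per process for strings. Note that plain modulo routing remaps most keys when the shard count changes; consistent hashing is a common way to reduce that movement.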

Scaling Strategies

1. Stateless Architecture

Design services to be stateless so any instance can handle any request:

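import redis

# Assumed connection settings; adjust for your environment
store = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Bad: sessions live in process memory
class InMemoryServer:
    sessions = {}  # Lost on restart, invisible to other instances

# Good: sessions live in an external store shared by all instances
class Server:
    def get_session(self, session_id):
        return store.get(session_id)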

2. Database Scaling

3. Caching

Add caching layers to reduce database load:

Client → CDN → App Cache (Redis) → Database

Cache Invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton. Always have a clear invalidation strategy.

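One possible invalidation strategy is to delete the cached entry whenever the underlying record is written. The sketch below uses plain dictionaries as stand-ins for the cache and the database; the key and values are made up for illustration.

```python
cache = {}                      # Stand-in for a shared cache such as Redis
database = {"user:1": "Ada"}    # Stand-in for the source of truth


def read(key):
    # Populate the cache on a miss, then serve from it.
    if key not in cache:
        cache[key] = database[key]
    return cache[key]


def write(key, value):
    # Update the source of truth first, then drop the cached copy
    # so the next read fetches the fresh value.
    database[key] = value
    cache.pop(key, None)


before = read("user:1")
write("user:1", "Grace")
after = read("user:1")
```

Delete-on-write is only one option; expiry times and write-through updates are common alternatives, and the right choice depends on how much staleness the application can tolerate.
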
Metrics to Monitor

When thinking about scalability, track these key metrics:

  • Throughput: Requests per second the system handles
  • Latency: p50, p95, p99 response times
  • Error Rate: Percentage of failed requests
  • Resource Utilization: CPU, memory, disk, network usage
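
As a small worked example, the first three metrics can be derived from a window of request records. The `summarize` function, the record shape, and the sample data are assumptions for illustration; resource utilization is left out because it comes from the host rather than from request logs.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(requests, window_seconds):
    """requests: list of (latency_ms, succeeded) tuples seen in the window."""
    latencies = sorted(latency for latency, _ in requests)
    failures = sum(1 for _, ok in requests if not ok)
    return {
        "throughput_rps": len(requests) / window_seconds,
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "error_rate_pct": 100 * failures / len(requests),
    }


# Made-up sample: 100 requests over 10 seconds, latencies 1..100 ms, 2 failures.
sample = [(ms, ms not in (50, 100)) for ms in range(1, 101)]
stats = summarize(sample, window_seconds=10)
```

Tail percentiles such as p99 matter because an average can look healthy while a small share of users sees very slow responses.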

Key Takeaway

Start simple and scale incrementally. As Donald Knuth put it, "premature optimization is the root of all evil", but know your scaling options ahead of time.
