System Design 101: How to Think About Scalable Systems
System design can feel overwhelming at first. There are dozens of technologies, patterns, and trade-offs to consider. But at its core, system design is about answering one question: How do you build something that works reliably at scale?
Here’s a framework for thinking through system design problems.
Start With Requirements
Before drawing any boxes, clarify what you’re building:
- Functional requirements — What does the system do? (e.g., “Users can post tweets and follow other users”)
- Non-functional requirements — How well does it need to perform? (e.g., “Support 100M daily active users with < 200ms latency”)
- Scale estimates — Back-of-the-envelope math. How many requests per second? How much storage per year?
Getting these right saves you from over-engineering or under-provisioning.
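As a rough illustration of the back-of-the-envelope math, the sketch below turns a few inputs into requests per second and storage per year. Only the 100M daily active users figure comes from the example above; the per-user request count, write count, payload size, and peak factor are invented assumptions you would replace with your own numbers.

```python
SECONDS_PER_DAY = 86_400

def estimate(daily_active_users, requests_per_user_per_day,
             writes_per_user_per_day, bytes_per_write, peak_factor):
    """Return rough traffic and storage estimates from assumed inputs."""
    avg_rps = daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY
    # Traffic is never flat, so size for a multiple of the average.
    peak_rps = avg_rps * peak_factor
    storage_bytes_per_year = (
        daily_active_users * writes_per_user_per_day * bytes_per_write * 365
    )
    return {
        "avg_rps": avg_rps,
        "peak_rps": peak_rps,
        "storage_tb_per_year": storage_bytes_per_year / 1e12,
    }

# All inputs except the user count are illustrative assumptions.
result = estimate(
    daily_active_users=100_000_000,
    requests_per_user_per_day=20,
    writes_per_user_per_day=2,
    bytes_per_write=500,
    peak_factor=3,
)
```

With these assumed inputs the average comes out around 23,000 requests per second and storage at roughly 36.5 TB per year. The precision does not matter; the order of magnitude is what tells you whether one database will do or whether you need to plan for sharding.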
The Building Blocks
Every large-scale system is composed of a few fundamental components:
Load Balancers
Distribute incoming requests across multiple servers. Without them, a single server becomes a bottleneck and a single point of failure.
- Layer 4 (transport, e.g., TCP) — fast, routes based on IP address and port without inspecting the request
- Layer 7 (HTTP) — smarter, can route based on URL path, headers, cookies
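To make the Layer 7 idea concrete, here is a minimal sketch of path-based routing with round-robin selection inside each server pool. The class names, server names, and the `/api/` prefix are hypothetical; real load balancers also handle health checks, connection draining, and much more.

```python
import itertools

class RoundRobinPool:
    """Hands out servers from a fixed list in rotation."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def next_server(self):
        return next(self._cycle)

class PathRouter:
    """Picks a pool by URL path prefix, the kind of decision a Layer 7 balancer can make."""
    def __init__(self, routes, default_pool):
        # Check longer prefixes first so the most specific route wins.
        self._routes = sorted(routes, key=lambda r: len(r[0]), reverse=True)
        self._default = default_pool

    def route(self, path):
        for prefix, pool in self._routes:
            if path.startswith(prefix):
                return pool.next_server()
        return self._default.next_server()

# Hypothetical server names for illustration.
api_pool = RoundRobinPool(["api-1", "api-2"])
web_pool = RoundRobinPool(["web-1"])
router = PathRouter([("/api/", api_pool)], default_pool=web_pool)
```

A Layer 4 balancer could do the round-robin part but not the prefix match, because it never looks at the HTTP request.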
Application Servers
Stateless services that handle business logic. The key word is stateless — any server can handle any request. Session state goes in a cache or database, not in server memory.
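The following sketch shows what stateless means in practice: two server instances share an external session store, so a session created on one is readable on the other. `SessionStore` is an in-memory stand-in for something like Redis, and all class and method names here are assumptions made for illustration.

```python
import uuid

class SessionStore:
    """In-memory stand-in for an external store such as a cache or database."""
    def __init__(self):
        self._data = {}

    def put(self, session_id, session):
        self._data[session_id] = session

    def get(self, session_id):
        return self._data.get(session_id)

class AppServer:
    """Keeps no session data of its own; everything lives in the shared store."""
    def __init__(self, name, store):
        self.name = name
        self._store = store

    def login(self, user):
        session_id = uuid.uuid4().hex
        self._store.put(session_id, {"user": user})
        return session_id

    def whoami(self, session_id):
        session = self._store.get(session_id)
        return session["user"] if session else None

store = SessionStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)
session_id = server_a.login("alice")   # login handled by server A
user_seen_by_b = server_b.whoami(session_id)  # follow-up request lands on server B
```

Because neither server holds the session in its own memory, the load balancer is free to send each request anywhere, and a server can be restarted without logging users out.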
Databases
The heart of most systems. Key decision: SQL vs NoSQL.
| Factor | SQL (PostgreSQL, MySQL) | NoSQL (MongoDB, Cassandra) |
|---|---|---|
| Schema | Rigid, structured | Flexible, schemaless |
| Scaling | Vertical (mostly) | Horizontal (built-in) |
| Consistency | Strong (ACID) | Often eventual or tunable (BASE) |
| Best for | Relationships, transactions | High write throughput, flexible data |
In practice, many systems use both — SQL for transactional data, NoSQL for analytics or session storage.
Caching
Often the quickest way to improve read performance. Cache frequently accessed data in memory using Redis or Memcached.
Common strategies:
- Cache-aside — App checks cache first, falls back to DB, then populates cache
- Write-through — App writes to the cache, and the write is persisted to the DB synchronously before it is acknowledged
- Write-behind — App writes to cache, which async-writes to DB
Cache invalidation is famously one of the hardest problems in CS. Set TTLs according to how much staleness each kind of data can tolerate, and invalidate or update cache entries when the underlying data changes.
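A minimal sketch of the cache-aside strategy with a TTL might look like the following. The `CacheAside` class, the `load_from_db` callback, and the injectable `clock` are hypothetical names chosen for this example; a real deployment would use a shared cache such as Redis rather than a local dictionary.

```python
import time

class CacheAside:
    """Check the cache first, fall back to the DB on a miss, then populate the cache."""
    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]                      # fresh hit
        value = self._load(key)                  # miss or expired entry
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Call this after writing to the DB so readers do not see the old value."""
        self._entries.pop(key, None)
```

The TTL bounds how long a stale value can survive if an invalidation is missed, which is why the two mechanisms are usually combined.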
Message Queues
Decouple producers and consumers with queues like Kafka or RabbitMQ. Instead of service A calling service B directly, A publishes a message, and B processes it asynchronously.
Use cases:
- Email/notification sending
- Order processing pipelines
- Event-driven architectures
- Log aggregation
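The decoupling described above can be sketched inside a single process with Python's standard `queue` and `threading` modules. This is only an analogy for a broker like Kafka or RabbitMQ: the function name, the `None` sentinel, and the "sent email" message format are assumptions for illustration.

```python
import queue
import threading

def run_pipeline(recipients):
    """Producer enqueues work; a separate consumer thread processes it asynchronously."""
    q = queue.Queue()
    processed = []

    def consumer():
        while True:
            message = q.get()
            try:
                if message is None:      # sentinel: producer is done
                    return
                processed.append(f"sent email to {message}")
            finally:
                q.task_done()

    worker = threading.Thread(target=consumer)
    worker.start()

    for recipient in recipients:
        q.put(recipient)                 # returns immediately; no waiting on the consumer
    q.put(None)

    q.join()                             # wait until every message has been handled
    worker.join()
    return processed
```

The producer never calls the email logic directly. If the consumer is slow or briefly unavailable, messages wait in the queue instead of failing the producer's request.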
Key Design Patterns
Database Sharding
Split data across multiple databases by a shard key (e.g., user ID). Each shard holds a subset of the data. This enables horizontal scaling of your database layer.
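One simple way to map a shard key to a shard is to hash it and take the result modulo the number of shards, as in the sketch below. The function name and the choice of SHA-256 are assumptions for illustration; the important property is that the hash is stable across processes, which Python's built-in `hash()` is not for strings.

```python
import hashlib

def shard_for(user_id, num_shards):
    """Map a user ID to a shard index in [0, num_shards) using a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# The same key always maps to the same shard.
shard_index = shard_for(12345, num_shards=4)
```

A caveat with plain modulo sharding: changing `num_shards` remaps most keys, which means moving most of the data. Schemes such as consistent hashing exist to reduce that movement.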
Read Replicas
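Route read queries to replica databases and keep the primary for writes. This works well when reads vastly outnumber writes, which is common (think Twitter’s timeline). Replication is usually asynchronous, so a replica can lag slightly behind the primary and a read may return stale data.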
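A deliberately naive sketch of read/write splitting follows. It sends statements that start with `SELECT` to replicas in rotation and everything else to the primary. The `ReplicaRouter` class and the connection names are hypothetical, and a real router would also need to keep transactions and locking reads such as `SELECT ... FOR UPDATE` on the primary.

```python
import itertools

class ReplicaRouter:
    """Chooses a database target per statement: replicas for reads, primary for writes."""
    def __init__(self, primary, replicas):
        self._primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql):
        words = sql.split()
        is_read = bool(words) and words[0].upper() == "SELECT"
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self._primary

# Hypothetical connection names for illustration.
db_router = ReplicaRouter("primary", ["replica-1", "replica-2"])
```

When a user must see their own write immediately, a common workaround is to send that user's next few reads to the primary as well.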
Rate Limiting
Protect your system from abuse and accidental overload. Common algorithms: token bucket, sliding window counter, leaky bucket. Rate limits are typically enforced at the API gateway, before requests reach your application servers.
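Of the algorithms listed, the token bucket is probably the easiest to sketch. Tokens refill at a steady rate up to a fixed capacity, and each request spends one. The class below is a single-process illustration with assumed parameter names; a gateway serving many instances would keep the bucket state in a shared store.

```python
import time

class TokenBucket:
    """Allows bursts up to `capacity` while enforcing an average of `rate_per_sec`."""
    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self._rate = rate_per_sec
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)   # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that has passed, never exceeding capacity.
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

In use, you would keep one bucket per client (for example per API key) and reject the request, usually with HTTP 429, whenever `allow()` returns `False`.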
Circuit Breaker
If a downstream service is failing, stop calling it temporarily instead of letting failures cascade through the system. Libraries like Resilience4j (Java) make this easy.
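To show the idea without any library, here is a minimal circuit breaker sketch. It is not how Resilience4j is implemented; the class, the exception, and the default thresholds are assumptions for illustration. After a set number of consecutive failures the breaker opens and rejects calls immediately; once a timeout passes it lets one trial call through.

```python
import time

class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None   # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: half-open, so allow this one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures in a row, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Callers typically catch `CircuitOpenError` and return a fallback, such as cached data or a degraded response, rather than waiting on a service that is known to be down.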
A Simple Design Template
When approaching any system design problem, follow this structure:
- Requirements — functional and non-functional
- High-level design — major components and their interactions
- Deep dives — database schema, API design, caching strategy
- Trade-offs — what you’d change at 10x or 100x scale
- Bottlenecks — identify single points of failure and how to mitigate
Keep Learning
System design is a skill that grows with experience. Start by studying real-world architectures — how does Netflix handle streaming? How does Uber match riders to drivers? How does Google index the web?
Every system you study adds patterns to your toolkit. The goal isn’t to memorize solutions — it’s to build intuition.