Sharding

Sharding is a horizontal scaling strategy that distributes data across multiple independent database servers (shards). Each shard holds a subset of the total data and operates as an autonomous database. Unlike partitioning (which splits data within a single instance), sharding splits data across machines, enabling virtually unlimited growth in storage and throughput.

Why Shard

A single server has hit its CPU, memory, or disk ceiling (vertical scaling limit reached)
Write throughput exceeds what one machine can handle
Data must be geographically distributed for latency or compliance (GDPR, data residency)
Read/write ratio demands more parallelism than replicas alone can provide

Horizontal Sharding

The most common form. Rows are distributed across shards based on a shard key. Each shard has the same schema but holds different rows.

graph TD
    App[Application] --> Router[Shard Router / Proxy]
    Router --> S1[(Shard 1<br/>users A-F)]
    Router --> S2[(Shard 2<br/>users G-M)]
    Router --> S3[(Shard 3<br/>users N-S)]
    Router --> S4[(Shard 4<br/>users T-Z)]

    style S1 fill:#e1f5fe
    style S2 fill:#e8f5e9
    style S3 fill:#fff3e0
    style S4 fill:#fce4ec

Shard Key Selection

The shard key is the single most critical design decision. It determines which shard holds each row.

Requirement	Why It Matters
High cardinality	Many distinct values → even distribution across shards
Query alignment	Most queries should target a single shard (avoid scatter-gather)
Write distribution	Avoid hotspots where one shard gets disproportionate writes
Stability	Shard key should rarely change (changing it means moving the row)

Good shard keys: tenant_id (multi-tenant SaaS), user_id (social apps), region (geo-distributed)

Bad shard keys: created_at (all new writes hit the latest shard), status (low cardinality), auto-increment id (sequential writes create hotspots)

Horizontal Sharding Strategies

Range-Based Sharding

Rows are assigned to shards by key ranges.

graph LR
    subgraph Shard_1["Shard 1"]
        R1["user_id 1 – 1,000,000"]
    end
    subgraph Shard_2["Shard 2"]
        R2["user_id 1,000,001 – 2,000,000"]
    end
    subgraph Shard_3["Shard 3"]
        R3["user_id 2,000,001 – 3,000,000"]
    end

Pros	Cons
Range queries stay on one shard	Uneven distribution if data isn’t uniform
Simple to understand and implement	New data clusters on the “latest” shard (hotspot)
Easy rebalancing by splitting ranges	Requires manual range management

Hash-Based Sharding

shard_id = hash(shard_key) mod N

graph TD
    K1["user_id = 42"] -->|"hash(42) mod 4 = 2"| S3["Shard 2"]
    K2["user_id = 99"] -->|"hash(99) mod 4 = 3"| S4["Shard 3"]
    K3["user_id = 7"] -->|"hash(7) mod 4 = 3"| S4
    K4["user_id = 200"] -->|"hash(200) mod 4 = 0"| S1["Shard 0"]

Pros	Cons
Even distribution regardless of key patterns	Range queries must hit all shards
No hotspots from sequential keys	Adding/removing shards remaps almost all keys (unless consistent hashing)
Simple routing logic	Harder to reason about data locality

Directory-Based Sharding

A centralized Lookup Table maps each key (or key range) to a specific shard.

graph TD
    App[Application] --> Dir[(Shard Directory<br/>key → shard mapping)]
    Dir --> S1[(Shard 1)]
    Dir --> S2[(Shard 2)]
    Dir --> S3[(Shard 3)]

Pros	Cons
Maximum flexibility — any key can go anywhere	Directory is a single point of failure
Easy rebalancing (update the directory)	Extra lookup on every query (latency)
Supports complex placement policies	Directory itself must be highly available

Geographic Sharding

Shard by user region for data locality and compliance.

graph TD
    subgraph US["US Region"]
        US_DB[(US Shard<br/>us-east-1)]
    end
    subgraph EU["EU Region"]
        EU_DB[(EU Shard<br/>eu-west-1)]
    end
    subgraph APAC["APAC Region"]
        APAC_DB[(APAC Shard<br/>ap-southeast-1)]
    end
    US_User[US User] --> US_DB
    EU_User[EU User] --> EU_DB
    APAC_User[APAC User] --> APAC_DB

Ideal for GDPR compliance (EU data stays in EU) and reducing latency to end users.

Vertical Sharding

Instead of splitting rows, vertical sharding splits tables (or groups of tables) across different servers. Each shard has a different schema serving different functional areas.

graph TD
    App[Application] --> Auth[(Auth Shard<br/>users, sessions, roles)]
    App --> Orders[(Orders Shard<br/>orders, order_items, payments)]
    App --> Catalog[(Catalog Shard<br/>products, categories, reviews)]
    App --> Analytics[(Analytics Shard<br/>events, metrics, logs)]

    style Auth fill:#e1f5fe
    style Orders fill:#e8f5e9
    style Catalog fill:#fff3e0
    style Analytics fill:#fce4ec

When Vertical Sharding Makes Sense

Functional decomposition — Each microservice owns its own data store
Different access patterns — Auth needs fast reads, Analytics needs bulk writes
Different DBMS per shard — Auth on PostgreSQL, Catalog on MongoDB, Analytics on ClickHouse
Team ownership — Each team manages its own shard independently

Vertical + Horizontal Combined

Real-world systems often combine both. First split by domain (vertical), then shard large tables within each domain (horizontal):

graph TD
    App[Application Layer] --> Auth[Auth Service]
    App --> Orders[Orders Service]

    Auth --> Auth_DB[(Auth DB<br/>single instance)]

    Orders --> Router[Shard Router]
    Router --> OS1[(Orders Shard 1<br/>tenant A-M)]
    Router --> OS2[(Orders Shard 2<br/>tenant N-Z)]

Consistent Hashing

Standard modular hashing (hash(key) mod N) breaks when N changes — nearly every key gets remapped. Consistent hashing solves this by mapping both nodes and keys onto a hash ring.

graph TD
    subgraph Hash_Ring["Hash Ring (0 — 2^32)"]
        direction TB
        N1["Node A<br/>position 1000"]
        N2["Node B<br/>position 4000"]
        N3["Node C<br/>position 7000"]
    end

    K1["Key X<br/>hash = 2500"] -.->|"next node clockwise"| N2
    K2["Key Y<br/>hash = 5500"] -.->|"next node clockwise"| N3
    K3["Key Z<br/>hash = 8500"] -.->|"wraps around"| N1

When a node is added, only keys between the new node and its predecessor are remapped (~1/N of total keys). When a node is removed, its keys shift to the next node clockwise. Used by Cassandra, DynamoDB, Redis Cluster, and Amazon S3.

Virtual nodes further improve balance — each physical node owns multiple positions on the ring, smoothing out distribution.

Implementation Phases

A structured approach to deploying sharding:

graph LR
    A[1. Data Modeling] --> B[2. Shard Creation]
    B --> C[3. Data Migration]
    C --> D[4. Query Routing]
    D --> E[5. Shard Management]
    E --> F[6. Monitoring]

Phase	Activities
1. Data Modeling	Choose shard key based on access patterns, cardinality, write distribution
2. Shard Creation	Provision physical/virtual/container databases per shard
3. Data Migration	Scripts and tools to redistribute existing data across shards
4. Query Routing	Deploy middleware/proxy layer that routes queries to the correct shard
5. Shard Management	Add/remove shards, rebalance data, handle shard failures
6. Monitoring	Track uneven shard sizes, high latencies, hardware failures; alert and auto-rebalance

Challenges

Cross-Shard Queries

Joins across shards require a scatter-gather pattern — query all shards, merge results at the application or proxy layer. Dramatically slower than single-shard queries.

Mitigation: Denormalize data so common queries stay within one shard. Co-locate related data on the same shard (e.g., user + user’s orders on the same shard).

Cross-Shard Transactions

ACID transactions spanning multiple shards require Two-Phase Commit (2PC) or application-level coordination (Saga pattern). Both add latency and complexity.

Rebalancing

When a shard becomes a hotspot or storage fills up, data must be migrated to new shards. Strategies:

Consistent hashing — Minimizes data movement
Dynamic splitting — Split a shard into two when it exceeds a threshold (like Cassandra’s virtual nodes)
Online migration — Dual-write to old and new shard, then cut over

Schema Changes

DDL must be coordinated across all shards — ALTER TABLE on 50 shards simultaneously. Tools like gh-ost (GitHub) and pt-online-schema-change (Percona) help.

Global Secondary Indexes

Each shard has its own local indexes. A query filtering on a non-shard-key column must hit every shard. Solutions:

Global index table — Separate table mapping secondary key → shard + primary key
Broadcast queries — Accept the fan-out cost for rare queries
Denormalize — Copy the indexed attribute into a secondary Lookup Table sharded differently

Sharding in Practice

System	Sharding Model
Cassandra	Automatic consistent hashing with virtual nodes
MongoDB	Built-in range or hash sharding with config servers
PostgreSQL	Manual (Citus extension for automatic), or application-level
MySQL	Vitess (YouTube’s sharding proxy), ProxySQL, or application-level
CockroachDB	Automatic range-based splitting with Raft consensus
Amazon DynamoDB	Automatic hash-based partitioning, fully managed
Redis Cluster	Hash slots (16384 slots distributed across nodes)

Decision Framework

flowchart TD
    A[Is one server at capacity?] -->|No| B[Use partitioning or read replicas]
    A -->|Yes| C{What's the bottleneck?}
    C -->|Write throughput| D[Horizontal sharding by write-heavy key]
    C -->|Storage| E[Horizontal sharding by data volume]
    C -->|Different workloads| F[Vertical sharding by domain]
    C -->|Latency / compliance| G[Geographic sharding]
    D --> H{Need cross-shard joins?}
    E --> H
    H -->|Rarely| I[Hash-based sharding]
    H -->|Often| J[Range-based or directory sharding<br/>co-locate related data]

Quartz 5

Explorer

Sharding

Sharding

Why Shard

Horizontal Sharding

Shard Key Selection

Horizontal Sharding Strategies

Range-Based Sharding

Hash-Based Sharding

Directory-Based Sharding

Geographic Sharding

Vertical Sharding

When Vertical Sharding Makes Sense

Vertical + Horizontal Combined

Consistent Hashing

Implementation Phases

Challenges

Cross-Shard Queries

Cross-Shard Transactions

Rebalancing

Schema Changes

Global Secondary Indexes

Sharding in Practice

Decision Framework

Sources

Graph View

Table of Contents

Backlinks