Partitioning and Replication
- Partitioning: Divides a dataset into smaller subsets, with each record belonging to exactly one partition.
- Replication: Stores copies of each partition on multiple nodes; it often works in conjunction with partitioning to ensure fault tolerance.
- These processes are mostly independent but complement each other.
Partitioning Key-Value Data
The goal is to distribute data evenly across nodes to avoid skewed partitions and hot spots.
Strategies
- Key-Range Partitioning:
  - Data is sorted by key and split into contiguous ranges.
  - Well suited to range queries.
  - Challenges:
    - Partition boundaries must be adjusted continually as data grows.
    - Sequential access patterns (e.g., timestamp keys) create hot spots.
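As a sketch of how key-range lookup works (the boundary values and keys below are made up): assigning a key is a binary search over sorted partition boundaries, and a range scan only touches the contiguous partitions covering the range.

```python
import bisect

# Hypothetical boundaries for 4 partitions:
#   partition 0: keys < "f"
#   partition i (1..2): boundaries[i-1] <= key < boundaries[i]
#   partition 3: keys >= "s"
boundaries = ["f", "m", "s"]

def partition_for(key: str) -> int:
    """Binary search: index of the partition that owns `key`."""
    return bisect.bisect_right(boundaries, key)

def partitions_for_range(lo: str, hi: str) -> range:
    """A range scan touches only the contiguous partitions covering [lo, hi]."""
    return range(partition_for(lo), partition_for(hi) + 1)
```

This locality is exactly what makes range queries cheap here, and also what concentrates sequential writes onto a single partition.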
- Hash-Based Partitioning:
  - Uses a hash function (e.g., MD5 or Fowler–Noll–Vo) to assign keys to partitions.
  - Ensures a roughly uniform distribution of data.
  - Advantages:
    - Handles skewed data effectively.
    - Supports partitioning of compound primary keys.
  - Limitations:
    - Range queries on the key become scatter/gather operations, since adjacent keys land in unrelated partitions (e.g., MongoDB's hash-based sharding must send range queries to all partitions).
    - Extreme read/write skew on a single key still requires application-level handling.
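A minimal hash-partitioning sketch (the partition count and key names are illustrative; MD5 is used only for its even spread, not for security):

```python
import hashlib

NUM_PARTITIONS = 8  # illustrative

def hash_partition(key: str) -> int:
    """Map a key to a partition via the first 8 bytes of its MD5 digest."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS
```

The flip side of the even spread: adjacent keys (`user1`, `user2`, …) end up in unrelated partitions, which is exactly why range queries over the key must scatter to every partition.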
Technologies
- Key-Range Partitioning: Bigtable, HBase, RethinkDB, MongoDB (pre-v2.4).
- Hash-Based Partitioning: Cassandra, Voldemort, Couchbase, MongoDB (hash-sharding).
Partitioning Secondary Indexes
Secondary indexes complicate partitioning as they don’t map neatly to partitions.
Techniques
- Partition-Local Secondary Indexes (document-partitioned):
  - Each partition maintains its own index covering only the data it contains.
  - Challenges:
    - Queries on the index must be scattered to all partitions and the results gathered, which is expensive and prone to tail-latency amplification.
- Global Secondary Indexes (term-partitioned):
  - The index itself is partitioned by the indexed terms rather than by document.
  - Efficient for reads, but writes become more complex: a single document write may touch several index partitions.
  - Index updates are often asynchronous for performance.
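To make the scatter/gather cost of a partition-local index concrete, here is a toy sketch (the documents, partitions, and `color` field are all invented):

```python
# Two partitions of a hypothetical document store.
partitions = [
    {"doc1": {"color": "red"}, "doc2": {"color": "green"}},
    {"doc3": {"color": "red"}, "doc4": {"color": "blue"}},
]

def build_local_index(part: dict) -> dict:
    """Partition-local secondary index on `color`: it only knows about
    documents stored in this one partition."""
    idx: dict = {}
    for doc_id, doc in part.items():
        idx.setdefault(doc["color"], []).append(doc_id)
    return idx

def query_by_color(color: str) -> list:
    # Scatter/gather: every partition must be consulted, then results merged.
    return sorted(doc_id
                  for part in partitions
                  for doc_id in build_local_index(part).get(color, []))
```

A global (term-partitioned) index would instead store the whole `"red"` posting list in one index partition, so this read hits a single partition — at the cost of updating remote index partitions on every write.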
Rebalancing Partitions
Over time, data rebalancing is necessary to distribute load evenly across nodes.
Requirements
- Fair load distribution after rebalancing.
- Continued acceptance of reads/writes during the process.
- Minimal data movement between nodes.
Methods
- Hash Mod N:
  - Inefficient: when the number of nodes N changes, most keys must be relocated.
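The problem is easy to demonstrate (key names and node counts below are arbitrary): going from 10 to 11 nodes under hash mod N relocates roughly N/(N+1) ≈ 91% of all keys.

```python
import hashlib

def node_for(key: str, num_nodes: int) -> int:
    """Naive placement: hash the key, then take it mod the node count."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return h % num_nodes

keys = [f"key{i}" for i in range(10000)]
# Count how many keys land on a different node after adding an 11th node.
moved = sum(node_for(k, 10) != node_for(k, 11) for k in keys)
```

A key stays put only when its hash gives the same residue mod 10 and mod 11, which happens for about 1 in 11 keys — hence the other rebalancing schemes below.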
- Fixed Number of Partitions:
  - Pre-define many more partitions than nodes (e.g., 1,000 partitions for 10 nodes).
  - Whole partitions are reassigned to nodes as the cluster scales; the key-to-partition mapping never changes.
  - Technologies: Riak, Elasticsearch, Couchbase, Voldemort.
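A toy sketch of the idea (the partition count, round-robin layout, and steal-the-first-N policy are simplifications; real systems balance the steals across nodes): adding a node moves only whole partitions, while the key-to-partition mapping stays fixed.

```python
import hashlib

NUM_PARTITIONS = 1000  # fixed for the cluster's lifetime

def partition_of(key: str) -> int:
    """Stable key -> partition mapping; never affected by rebalancing."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return h % NUM_PARTITIONS

# Initial cluster: 10 nodes, partitions assigned round-robin.
assignment = {p: p % 10 for p in range(NUM_PARTITIONS)}

def add_node(assignment: dict, new_node: int) -> list:
    """Give the new node a fair share of whole partitions.
    Only the stolen partitions' data moves between nodes."""
    num_nodes = len(set(assignment.values())) + 1  # assumes new_node is new
    share = len(assignment) // num_nodes
    stolen = list(assignment)[:share]  # simplification: steal the first `share`
    for p in stolen:
        assignment[p] = new_node
    return stolen

moved = add_node(assignment, new_node=10)
```

Contrast with hash mod N: only 1000 // 11 = 90 partitions (about 9% of the data) move, and no key ever changes partition.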
- Dynamic Partitioning:
  - Partitions split when they exceed a size threshold and can merge when they shrink below one.
  - The number of partitions adapts to the total data volume.
  - Splitting works like the top level of a B-tree.
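A toy sketch of the split mechanism (the threshold and key counting are artificial — real systems split on bytes stored, not number of keys):

```python
SPLIT_THRESHOLD = 4  # keys per partition; unrealistically small, for illustration

class Partition:
    def __init__(self, lo: str, hi: str):
        self.lo, self.hi = lo, hi  # half-open key range [lo, hi)
        self.keys: list = []

# One partition initially covers the whole key space.
partitions = [Partition("", "\uffff")]

def insert(key: str) -> None:
    # Find the owning partition by its key range.
    part = next(p for p in partitions if p.lo <= key < p.hi)
    part.keys.append(key)
    if len(part.keys) > SPLIT_THRESHOLD:
        # Grown too big: split at the median key, like a B-tree node split.
        part.keys.sort()
        mid = part.keys[len(part.keys) // 2]
        left, right = Partition(part.lo, mid), Partition(mid, part.hi)
        for k in part.keys:
            (left if k < mid else right).keys.append(k)
        i = partitions.index(part)
        partitions[i:i + 1] = [left, right]
```

Note the empty-database caveat this implies: until enough data arrives, everything sits in one partition on one node, which is why some systems pre-split.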
- Proportional Partitioning:
  - A fixed number of partitions per node.
  - A new node splits a fixed number of randomly chosen existing partitions and takes half of each.
  - Random choice risks uneven splits.
- Manual Intervention:
  - While automation exists, human oversight helps prevent unpredictable behavior.
Request Routing
Mapping keys to partitions and their hosting nodes is critical for efficient routing.
Approaches
- Node-Forwarding:
  - Clients connect to any node, which forwards requests to the correct node if needed.
- Routing Tier:
  - A partition-aware load balancer directs requests to the appropriate node.
- Partition-Aware Clients:
  - Clients directly connect to the correct node, eliminating intermediaries.
Coordination Services
- ZooKeeper:
  - Manages metadata (e.g., partition-to-node mappings).
  - Tracks cluster changes, such as node additions or partition reassignments.
  - Notifies clients, routing tiers, or nodes of updates.