Partitioning and Replication
- Partitioning: Divides a dataset into smaller subsets, with each record belonging to exactly one partition.
- Replication: Stores copies of each partition on multiple nodes; it often works in conjunction with partitioning to ensure fault tolerance.
- These processes are mostly independent but complement each other.
Partitioning Key-Value Data
The goal is to distribute data evenly across nodes to avoid skewed partitions and hot spots.
Strategies
- Key-Range Partitioning:
  - Data is sorted by key and split into contiguous ranges.
  - Well suited to range queries.
  - Challenges:
    - Partition boundaries must be adjusted continually as data grows.
    - Sequential access patterns (e.g., timestamp keys) create hot spots.
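As a sketch of how key-range lookup works (the boundary values and keys below are made up): assigning a key is a binary search over sorted partition boundaries, and a range scan only touches the contiguous partitions covering the range.

```python
import bisect

# Hypothetical boundaries for 4 partitions:
#   partition 0: keys < "f"
#   partition i (1..2): boundaries[i-1] <= key < boundaries[i]
#   partition 3: keys >= "s"
boundaries = ["f", "m", "s"]

def partition_for(key: str) -> int:
    """Binary search: index of the partition that owns `key`."""
    return bisect.bisect_right(boundaries, key)

def partitions_for_range(lo: str, hi: str) -> range:
    """A range scan touches only the contiguous partitions covering [lo, hi]."""
    return range(partition_for(lo), partition_for(hi) + 1)
```

This locality is exactly what makes range queries cheap here, and also what concentrates sequential writes onto a single partition.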
- Hash-Based Partitioning:
  - Uses a hash function (e.g., MD5 or Fowler–Noll–Vo) to assign keys to partitions.
  - Ensures a roughly uniform distribution of data.
  - Advantages:
    - Handles skewed data effectively.
    - Supports partitioning of compound primary keys.
  - Limitations:
    - Range queries on the key become scatter/gather operations, since adjacent keys land in unrelated partitions (e.g., MongoDB's hash-based sharding must send range queries to all partitions).
    - Extreme read/write skew on a single key still requires application-level handling.
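A minimal hash-partitioning sketch (the partition count and key names are illustrative; MD5 is used only for its even spread, not for security):

```python
import hashlib

NUM_PARTITIONS = 8  # illustrative

def hash_partition(key: str) -> int:
    """Map a key to a partition via the first 8 bytes of its MD5 digest."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS
```

The flip side of the even spread: adjacent keys (`user1`, `user2`, …) end up in unrelated partitions, which is exactly why range queries over the key must scatter to every partition.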
Technologies
- Key-Range Partitioning: Bigtable, HBase, RethinkDB, MongoDB (pre-v2.4).
- Hash-Based Partitioning: Cassandra, Voldemort, Couchbase, MongoDB (hash-sharding).
Partitioning Secondary Indexes
Secondary indexes complicate partitioning as they don’t map neatly to partitions.
Techniques
- Partition-Local Secondary Indexes (document-partitioned):
  - Each partition maintains its own index covering only the data it contains.
  - Challenges:
    - Queries on the index must be scattered to all partitions and the results gathered, which is expensive and prone to tail-latency amplification.
- Global Secondary Indexes (term-partitioned):
  - The index itself is partitioned by the indexed terms rather than by document.
  - Efficient for reads, but writes become more complex: a single document write may touch several index partitions.
  - Index updates are often asynchronous for performance.
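To make the scatter/gather cost of a partition-local index concrete, here is a toy sketch (the documents, partitions, and `color` field are all invented):

```python
# Two partitions of a hypothetical document store.
partitions = [
    {"doc1": {"color": "red"}, "doc2": {"color": "green"}},
    {"doc3": {"color": "red"}, "doc4": {"color": "blue"}},
]

def build_local_index(part: dict) -> dict:
    """Partition-local secondary index on `color`: it only knows about
    documents stored in this one partition."""
    idx: dict = {}
    for doc_id, doc in part.items():
        idx.setdefault(doc["color"], []).append(doc_id)
    return idx

def query_by_color(color: str) -> list:
    # Scatter/gather: every partition must be consulted, then results merged.
    return sorted(doc_id
                  for part in partitions
                  for doc_id in build_local_index(part).get(color, []))
```

A global (term-partitioned) index would instead store the whole `"red"` posting list in one index partition, so this read hits a single partition — at the cost of updating remote index partitions on every write.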
Rebalancing Partitions
Over time, data rebalancing is necessary to distribute load evenly across nodes.
Requirements
- Fair load distribution after rebalancing.
- Continued acceptance of reads/writes during the process.
- Minimal data movement between nodes.
Methods
- Hash Mod N:
  - Inefficient: when the number of nodes N changes, most keys must be relocated.
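The problem is easy to demonstrate (key names and node counts below are arbitrary): going from 10 to 11 nodes under hash mod N relocates roughly N/(N+1) ≈ 91% of all keys.

```python
import hashlib

def node_for(key: str, num_nodes: int) -> int:
    """Naive placement: hash the key, then take it mod the node count."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return h % num_nodes

keys = [f"key{i}" for i in range(10000)]
# Count how many keys land on a different node after adding an 11th node.
moved = sum(node_for(k, 10) != node_for(k, 11) for k in keys)
```

A key stays put only when its hash gives the same residue mod 10 and mod 11, which happens for about 1 in 11 keys — hence the other rebalancing schemes below.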
- Fixed Number of Partitions:
  - Pre-define many more partitions than nodes (e.g., 1,000 partitions for 10 nodes).
  - Whole partitions are reassigned to nodes as the cluster scales; the key-to-partition mapping never changes.
  - Technologies: Riak, Elasticsearch, Couchbase, Voldemort.
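A toy sketch of the idea (the partition count, round-robin layout, and steal-the-first-N policy are simplifications; real systems balance the steals across nodes): adding a node moves only whole partitions, while the key-to-partition mapping stays fixed.

```python
import hashlib

NUM_PARTITIONS = 1000  # fixed for the cluster's lifetime

def partition_of(key: str) -> int:
    """Stable key -> partition mapping; never affected by rebalancing."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return h % NUM_PARTITIONS

# Initial cluster: 10 nodes, partitions assigned round-robin.
assignment = {p: p % 10 for p in range(NUM_PARTITIONS)}

def add_node(assignment: dict, new_node: int) -> list:
    """Give the new node a fair share of whole partitions.
    Only the stolen partitions' data moves between nodes."""
    num_nodes = len(set(assignment.values())) + 1  # assumes new_node is new
    share = len(assignment) // num_nodes
    stolen = list(assignment)[:share]  # simplification: steal the first `share`
    for p in stolen:
        assignment[p] = new_node
    return stolen

moved = add_node(assignment, new_node=10)
```

Contrast with hash mod N: only 1000 // 11 = 90 partitions (about 9% of the data) move, and no key ever changes partition.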
- Dynamic Partitioning:
  - Partitions split when they exceed a size threshold and can merge when they shrink below one.
  - The number of partitions adapts to the total data volume.
  - Splitting works like the top level of a B-tree.
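A toy sketch of the split mechanism (the threshold and key counting are artificial — real systems split on bytes stored, not number of keys):

```python
SPLIT_THRESHOLD = 4  # keys per partition; unrealistically small, for illustration

class Partition:
    def __init__(self, lo: str, hi: str):
        self.lo, self.hi = lo, hi  # half-open key range [lo, hi)
        self.keys: list = []

# One partition initially covers the whole key space.
partitions = [Partition("", "\uffff")]

def insert(key: str) -> None:
    # Find the owning partition by its key range.
    part = next(p for p in partitions if p.lo <= key < p.hi)
    part.keys.append(key)
    if len(part.keys) > SPLIT_THRESHOLD:
        # Grown too big: split at the median key, like a B-tree node split.
        part.keys.sort()
        mid = part.keys[len(part.keys) // 2]
        left, right = Partition(part.lo, mid), Partition(mid, part.hi)
        for k in part.keys:
            (left if k < mid else right).keys.append(k)
        i = partitions.index(part)
        partitions[i:i + 1] = [left, right]
```

Note the empty-database caveat this implies: until enough data arrives, everything sits in one partition on one node, which is why some systems pre-split.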
- Proportional Partitioning:
  - A fixed number of partitions per node.
  - A new node splits a fixed number of randomly chosen existing partitions and takes half of each.
  - Random choice risks uneven splits.
- Manual Intervention:
  - While automation exists, human oversight helps prevent unpredictable behavior.
Request Routing
Mapping keys to partitions and their hosting nodes is critical for efficient routing.
Approaches
- Node-Forwarding:
  - Clients connect to any node, which forwards requests to the correct node if needed.
- Routing Tier:
  - A partition-aware load balancer directs requests to the appropriate node.
- Partition-Aware Clients:
  - Clients directly connect to the correct node, eliminating intermediaries.
Coordination Services
- ZooKeeper:
  - Manages metadata (e.g., partition-to-node mappings).
  - Tracks cluster changes, such as node additions or partition reassignments.
  - Notifies clients, routing tiers, or nodes of updates.