Designing Data-Intensive Applications - 3. Storage and Retrieval

Open Table Of Contents

Data Structures in Databases

A log is an append-only sequence of records used by many databases.
- Challenges: concurrency control, disk space management, error handling, and crash recovery.
Benefits of append-only design:
- Sequential writes are faster than random writes.
- Simplifies concurrency and crash recovery.
- Merging segments avoids file fragmentation.

Key-value stores resemble hash tables.
Example: Bitcask (default storage engine in Riak).
- Uses in-memory hash maps to map keys to byte offsets in log files.
- Requires all keys to fit in RAM, but values can reside on disk.
Disk space management:
- Use segmented logs and periodically perform compaction to remove duplicate keys and merge segments.
- Compaction ensures only the latest updates are kept, optimizing storage and lookup performance.

SSTables (Sorted String Tables): Key-value pairs sorted by key.
- Efficient merging via algorithms like mergesort.
- Requires only partial in-memory indexes.
- Group records into compressed blocks for efficient storage.
LSM-Trees (Log-Structured Merge Trees):
- Uses in-memory balanced trees (e.g., red-black trees) to handle writes.
- Periodically writes the in-memory tree to disk as SSTable files.
- Background compaction merges and optimizes segments.
- Examples: LevelDB, RocksDB (leveled compaction); HBase, Cassandra (size-tiered compaction).
- Bloom filters help optimize lookups for non-existent keys.

Widely used indexing structure.
Key-value pairs are stored in fixed-size pages, enabling efficient lookups and range queries.
Operations:
- Updates and inserts modify pages and split them if needed.
- Crash recovery handled via Write-Ahead Logs (WAL).
Comparison with LSM-Trees:
- Faster reads due to single-location indexing.
- Predictable performance, but lower write throughput and higher fragmentation.

Enable efficient querying on non-primary fields.
Techniques:
- Store lists of matching row identifiers for each indexed value.
- Append unique row identifiers to each index entry.

Fuzzy querying capabilities (e.g., synonyms, grammatical variations).
Example: Lucene (used by Elasticsearch and Solr).
- Term dictionary stored using SSTable-like structures.

Entire dataset resides in memory for low-latency access.
Examples:
- Memcached (cache-focused, weak durability).
- VoltDB, MemSQL (durable in-memory databases).
Benefits:
- Avoid overhead of disk I/O.
- Enable complex data models.

Optimized for analytical queries.
Stores data by columns instead of rows, enabling efficient querying and compression.
Techniques:
- Bitmap encoding for filtering and aggregation.
- Materialized views for caching query results (e.g., OLAP cubes).

Contribute to this article here.