How Airbnb Developed a Key-Value Store for Managing Petabytes of Data

prachi kothiyal

2 weeks ago

Airbnb’s operations hinge not only on raw data but also on derived data, which is essential for personalizing user experiences. Derived data refers to information generated from extensive offline datasets processed by tools like Apache Spark or real-time event streams from systems such as Apache Kafka. This type of data plays a critical role in tailoring services according to user activity and preferences.

However, the efficient access and management of this data present unique challenges. The underlying system must be exceptionally reliable to guarantee uninterrupted service. It should also provide high availability, allowing data access without delays, and handle scalability to accommodate the increasing data demands of a platform as expansive as Airbnb. Additionally, low latency is vital since users expect immediate responses without lag.

To address these challenges, Airbnb developed Mussel, a key-value store specifically designed to ensure timely retrieval of the necessary data. This article delves into the architecture of Mussel and explores how Airbnb engineered this key-value store to manage petabytes of data effectively.

Jump to

Evolution of Derived Data Storage at Airbnb

Mussel was not the first solution employed by Airbnb for storing derived data; rather, it represents the culmination of several earlier attempts. Below is an overview of the key stages in this evolutionary process:

Stage 1: Unified Read-Only Key-Value Store

Initially, Airbnb faced numerous technical hurdles in managing derived data effectively. Existing tools such as MySQL, HBase, and RocksDB fell short in meeting critical requirements including:

Handling petabytes of data
Enabling rapid bulk uploads
Providing low-latency access for swift responses
Supporting multi-tenancy for simultaneous use by multiple teams

In response to these challenges, the engineering team created HFileService in 2015. This custom solution utilized HFile, a foundational component for HBase based on Google’s SSTable technology.

The architecture operated as follows:

Scalability Through Sharding and Replication: Data was segmented into smaller units or “shards” based on primary keys, distributing the load across multiple servers and facilitating scalability.
Manual Partition Management with Zookeeper: Zookeeper managed shard locations on servers but required manual updates, complicating scalability as demand increased.
Daily Batch Updates Using Hadoop: Data updates occurred offline through daily Hadoop jobs that transformed raw data into HFile format for cloud storage (S3). Servers would then download their assigned shards from S3 during these updates.

Despite resolving several issues, HFileService had notable limitations:

It was exclusively read-only; real-time updates were not possible.
The reliance on daily processing cycles rendered it inflexible for rapidly changing data needs.

Stage 2: Real-Time and Derived Data Store (Nebula)

In its second phase, Airbnb introduced Nebula to bridge the gap between batch-processed and real-time data access.

Nebula incorporated several enhancements over HFileService:

Combining Batch and Real-Time Data: Nebula utilized HFileService for offline data while integrating DynamoDB for real-time updates.
Timestamp-Based Versioning: Each piece of data received a timestamp to ensure consistency during queries by merging real-time and offline data accordingly.
Minimizing Online Merge Operations: To reduce latency during queries, Nebula implemented daily Spark jobs that pre-merged real-time updates with batch snapshots.

However, Nebula also faced challenges:

High Maintenance Overhead: Managing two systems added complexity in ensuring synchronization between DynamoDB and HFileService.
Inefficient Merging Process: The daily merging job became slow as datasets expanded.
Scalability Challenges: As data grew, scaling Nebula became cumbersome due to its manual maintenance requirements.

Mussel Architecture

In 2018, Airbnb’s engineering team developed Mussel to overcome the limitations encountered with previous systems. Its architecture was specifically designed for enhanced scalability and performance.

Key Features of Mussel’s Architecture

Partition Management with Apache Helix: Mussel increased shard numbers from 8 to 1024 to accommodate growing data needs. Apache Helix automated shard management, dynamically balancing server loads without requiring manual intervention.
Leaderless Replication with Kafka: Utilizing Kafka as a write-ahead log ensured consistent recording and replication of updates across shards while allowing any node holding a shard replica to handle read requests.
Unified Storage Engine with HRegion: Mussel replaced DynamoDB by extending HFileService to manage both real-time and batch data within a unified framework using HRegion from HBase. This facilitated advanced query support and efficient organization of data through LSM Trees and MemStore.
Bulk Load Support: Two types of bulk load pipelines were established from the data warehouse to Mussel via Airflow jobs—Merge Type and Replace Type—to optimize the loading process by only importing incremental changes rather than reloading entire datasets daily.

Adoption and Performance of Mussel

Mussel has become integral to Airbnb’s infrastructure, supporting various services reliant on key-value storage with impressive performance metrics:

It manages approximately 130TB across 4,000 tables in production.
With over 99.9% availability, it provides minimal downtime for services.
Mussel handles over 800,000 read queries per second (QPS) while supporting 35,000 write QPS.
The system achieves an average P95 read latency below 8 milliseconds.

Conclusion

Mussel exemplifies the evolution towards a robust key-value store capable of addressing significant challenges such as scalability and low latency. Its impressive performance metrics highlight its critical role in enabling high-performance data-driven services at Airbnb. Looking ahead, the engineering team remains committed to enhancing Mussel further to support advanced use cases like read-after-write consistency and auto-scaling capabilities.

Read more such articles from our Newsletter here.