Zoolander: efficient latency management in NoSQL stores

  • Authors:
  • Aniket Chakrabarti;Christopher Stewart;Daiyi Yang;Rean Griffith

  • Affiliations:
  • The Ohio State University;The Ohio State University;StumbleUpon.com;VMware

  • Venue:
  • Proceedings of the Posters and Demo Track
  • Year:
  • 2012

Abstract

NoSQL stores expose narrow APIs for data access, e.g., get(key) or put(key, val). While these APIs often give up strong consistency and transactions, they can scale throughput under intense workloads. Widely used stores, e.g., Apache Zookeeper, Cassandra, and Memcached, have been shown to sustain 10^10 accesses per day in the face of workload shifts, software faults, and performance bugs. However, providing low latency for every access remains challenging. Latency, unlike throughput, quickly yields diminishing returns under scale-out approaches, making it important to choose the most efficient approach. Further, DNS timeouts, garbage collection, and other rare system events can hold resources from time to time [3]. These events hardly impact throughput, but they can substantially increase latency for some accesses. Internet services that access large volumes of data under tight response time demands need NoSQL stores that provide low latency all the time. Figure 1 depicts such services and their demands. We describe them below.

1. Old-school services, such as e-commerce websites, are increasingly using NoSQL stores instead of databases. In these embarrassingly parallel services, end-user requests access data independently, but each request must complete quickly. Slow accesses translate to unhappy end users.

2. Map-reduce services spawn parallel worker nodes that compute local results and forward them to reducers to produce the final output. Here, the term "service" reflects a growing trend where jobs are expected to complete within response time targets. We note two reasons for this trend. First, jobs that run on pay-as-you-go clouds cost less if they finish within 1-hour leasing intervals. Second, jobs that finish quickly offer qualitative business advantages, e.g., real-time Twitter analysis.

3. Scientific computing as a service is an emerging workload on public clouds [2]. In the past, these workloads ran only on private, custom hardware, but public clouds can offer performance-to-cost efficiency. The challenge is matching the absolute performance of private hardware. These workloads use barriers and synchronization heavily. One slow data access can delay the completion of a barrier and ultimately delay the entire workload.
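The claim that rare events hardly move throughput yet inflate tail latency can be illustrated with a toy simulation. This sketch is our own and is not from the paper: the function `access_latency_ms`, the 200 ms stall, and the 0.5% hiccup rate are all hypothetical parameters chosen only to make the mean/tail gap visible.

```python
import random

random.seed(1)

def access_latency_ms(hiccup_prob=0.005):
    """Hypothetical model of one get(key): ~1 ms normally, but a rare
    event (e.g., a GC pause or DNS timeout) adds a large stall."""
    latency = random.uniform(0.5, 1.5)
    if random.random() < hiccup_prob:
        latency += 200.0  # assumed stall duration; purely illustrative
    return latency

# Simulate 100,000 independent accesses.
samples = sorted(access_latency_ms() for _ in range(100_000))
mean = sum(samples) / len(samples)
p999 = samples[int(0.999 * len(samples))]

# The mean (a throughput-style aggregate) stays near 1-2 ms,
# while the 99.9th percentile jumps by two orders of magnitude.
print(f"mean: {mean:.2f} ms, 99.9th percentile: {p999:.2f} ms")
```

Under these assumed numbers, average latency barely moves, but the 99.9th percentile captures the full stall, which is why per-access latency targets are much harder to meet than throughput targets.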