On the expressiveness and trade-offs of large scale tuple stores

Authors:
Ricardo Vilaça;Francisco Cruz;Rui Oliveira
Affiliations:
Computer Science and Technology Center, Universidade do Minho, Braga, Portugal;Computer Science and Technology Center, Universidade do Minho, Braga, Portugal;Computer Science and Technology Center, Universidade do Minho, Braga, Portugal
Venue:
OTM'10 Proceedings of the 2010 international conference on On the move to meaningful internet systems: Part II
Year:
2010

Abstract

Massive-scale distributed computing is a challenge at our doorstep. The current exponential growth of data calls for massive-scale storage and processing capabilities. Several major Internet players have acknowledged this by embracing the cloud computing model and offering first-generation distributed tuple stores. Having all started from similar requirements, these systems ended up providing a similar service: a simple tuple store interface that allows applications to insert, query, and remove individual elements. Furthermore, while availability is commonly assumed to be sustained by the massive scale itself, data consistency and freshness are usually severely hindered. As a result, these services focus on a specific, narrow trade-off between consistency, availability, performance, scale, and migration cost that is much less attractive for common business needs. In this paper we introduce DataDroplets, a novel tuple store that shifts the current trade-off towards the needs of common business users, providing additional consistency guarantees and higher-level data processing primitives that smooth the migration path for existing applications. We present a detailed comparison between DataDroplets and existing systems regarding their data model, architecture, and trade-offs. Preliminary results of the system's performance under a realistic workload are also presented.