Inspired by Google's BigTable, a variety of scalable, semi-structured, weak-semantic table stores have been developed and optimized for different priorities such as query speed, ingest speed, availability, and interactivity. As these systems mature, performance benchmarking will advance from measuring the rate of simple workloads to understanding and debugging the performance of advanced features, such as ingest speed-up techniques and function shipping of filters from clients to servers. This paper describes YCSB++, a set of extensions to the Yahoo! Cloud Serving Benchmark (YCSB) that improve performance understanding and debugging of these advanced features. YCSB++ includes multi-tester coordination for increased load and eventual consistency measurement; multi-phase workloads to quantify the consequences of work deferment and the benefits of anticipatory configuration optimization such as B-tree pre-splitting or bulk loading; and abstract APIs for explicit incorporation of advanced features in benchmark tests. To enhance performance debugging, we customized an existing cluster monitoring tool to gather the internal statistics of YCSB++, table stores, system services like HDFS, and operating systems, and to offer easy post-test correlation and reporting of performance behaviors. YCSB++ features are illustrated in case studies of two BigTable-like table stores, Apache HBase and Accumulo, developed to emphasize high ingest rates and fine-grained security.
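The eventual consistency measurement mentioned above can be illustrated with a small sketch. It is not YCSB++ code: the `DelayedStore` class, its `visibility_delay` parameter, and the `read_after_write_lag` function are hypothetical names invented for this example, and a toy in-memory store with a logical clock stands in for a real table store and for separate, coordinated tester processes. The idea shown is only the general one: one tester writes a key, another polls for it, and the elapsed time until the value becomes visible is recorded as the read-after-write lag.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DelayedStore:
    """Toy store (hypothetical): a write becomes readable only after
    `visibility_delay` ticks of a logical clock."""
    visibility_delay: int
    now: int = 0
    _pending: List[Tuple[int, str, str]] = field(default_factory=list)
    _visible: Dict[str, str] = field(default_factory=dict)

    def _apply_ready(self) -> None:
        # Move every pending write whose visibility time has arrived.
        ready = [p for p in self._pending if p[0] <= self.now]
        self._pending = [p for p in self._pending if p[0] > self.now]
        for _, key, value in ready:
            self._visible[key] = value

    def tick(self) -> None:
        # Advance the logical clock by one step.
        self.now += 1
        self._apply_ready()

    def put(self, key: str, value: str) -> None:
        # Buffer the write; it is applied once its delay has elapsed.
        self._pending.append((self.now + self.visibility_delay, key, value))
        self._apply_ready()

    def get(self, key: str) -> Optional[str]:
        return self._visible.get(key)


def read_after_write_lag(store: DelayedStore, key: str, value: str,
                         max_ticks: int = 1000) -> Optional[int]:
    """Write as the 'producer', poll as the 'consumer', and return the
    number of ticks until the write is visible (None if never seen)."""
    written_at = store.now
    store.put(key, value)
    for _ in range(max_ticks + 1):
        if store.get(key) == value:
            return store.now - written_at
        store.tick()
    return None


if __name__ == "__main__":
    lag = read_after_write_lag(DelayedStore(visibility_delay=3), "row1", "v1")
    print("observed read-after-write lag (ticks):", lag)
```

In a real benchmark the producer and consumer would be separate client processes using wall-clock timestamps, so clock skew and polling interval would bound the accuracy of the measured lag; the logical clock here sidesteps both for the sake of a deterministic example.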