Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication
A cloud storage system with persistence typically performs well only under either a read-heavy or a write-heavy workload, because there is a trade-off between read-optimized and write-optimized designs. This trade-off is dominated by the storage engine, the software component that manages data stored in memory and on disk. With an adequate software design a storage engine can be made pluggable, though today's cloud storage systems are not always modular. We developed a modular cloud storage system called MyCassandra to demonstrate that a cloud storage system can be made either read-optimized or write-optimized through a modular design. Various storage engines can be plugged into MyCassandra, and the chosen engine determines the workloads under which the system performs well. With MyCassandra we showed that such a modular design enables a cloud storage system to adapt to workloads. We also propose a method for building a cloud storage system that performs well under both read-heavy and write-heavy workloads. A heterogeneous cluster is built from MyCassandra nodes with different storage engines: read-optimized, write-optimized, and in-memory (read-and-write-optimized). A query is routed to the nodes that process it efficiently, while the cluster maintains consistency among data replicas with a quorum protocol. The cluster showed performance comparable to the original Cassandra for write-heavy workloads and considerably better performance for read-heavy workloads. With a read-only workload, read latency was 90.4% lower than Cassandra's and throughput was 11.00 times as high.
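The routing idea in the abstract can be illustrated with a minimal sketch. It is not MyCassandra's implementation: the class and function names, the per-operation preference order, and the timestamp-based conflict resolution are all assumptions made for illustration. It models three replicas with different engines and a quorum of two, so that every read quorum overlaps every write quorum (R + W > N).

```python
from dataclasses import dataclass, field

# Hypothetical engine labels; the abstract names the three kinds but no API.
READ_OPT, WRITE_OPT, IN_MEMORY = "read-optimized", "write-optimized", "in-memory"

# Assumed preference order per operation: earlier engines are contacted first.
PREFERENCE = {
    "read": [IN_MEMORY, READ_OPT, WRITE_OPT],
    "write": [IN_MEMORY, WRITE_OPT, READ_OPT],
}


@dataclass
class Replica:
    name: str
    engine: str
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def put(self, key, timestamp, value):
        # Keep only the newest version (last-write-wins, an assumption).
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def get(self, key):
        return self.store.get(key)


def pick_quorum(replicas, op, quorum):
    """Choose the `quorum` replicas whose engines best suit the operation."""
    order = PREFERENCE[op]
    ranked = sorted(replicas, key=lambda r: order.index(r.engine))
    return ranked[:quorum]


def write(replicas, key, timestamp, value, quorum):
    # Simplification: only the quorum is updated here. A real system would
    # also propagate the write to the remaining replicas in the background.
    targets = pick_quorum(replicas, "write", quorum)
    for replica in targets:
        replica.put(key, timestamp, value)
    return [replica.name for replica in targets]


def read(replicas, key, quorum):
    # Ask the preferred replicas and return the newest version among them.
    targets = pick_quorum(replicas, "read", quorum)
    versions = [v for v in (r.get(key) for r in targets) if v is not None]
    if not versions:
        return None
    return max(versions, key=lambda v: v[0])[1]
```

In this sketch a write is acknowledged by the in-memory and write-optimized replicas, while a read consults the in-memory and read-optimized replicas. The in-memory node sits in both quorums, which is what lets the read observe the latest write even though the read-optimized replica has not yet received it.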