Improving file access performance is a major challenge in real-time cloud services. In this paper, we analyze the preconditions for addressing this problem with respect to the requirements, hardware, software, and network environments in the cloud. We then describe the design and implementation of a novel distributed layered cache system built on top of the Hadoop Distributed File System (HDFS), named the HDFS-based Distributed Cache System (HDCache). The cache system consists of a client library and multiple cache services. The cache services are designed with three access layers: an in-memory cache, a snapshot of the local disk, and the actual disk view as provided by HDFS. Files loaded from HDFS are cached in shared memory, which can be accessed directly by the client library. Multiple applications integrated with the client library can access a cache service simultaneously. Cache services are organized in a P2P style using a distributed hash table. Every cached file has three replicas on different cache service nodes in order to improve robustness and alleviate the workload. Experimental results show that the cache system can store files of widely varying sizes and achieves millisecond-level access latency in highly concurrent environments.
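The replica placement and layered lookup described above can be illustrated with a minimal sketch. It is not the HDCache implementation: the abstract does not specify the hash table scheme, so a consistent-hashing ring is assumed here, and the names `HashRing`, `replicas`, `read_file`, and `load_from_hdfs` are hypothetical. The in-memory layer and the local-disk snapshot are modeled as plain dictionaries, and HDFS as a caller-supplied function.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hashing ring over cache service nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node is placed at several virtual points to even out the load.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]
        self._node_count = len(set(nodes))

    def replicas(self, path, count=3):
        """Return up to `count` distinct nodes responsible for `path`."""
        want = min(count, self._node_count)
        if want <= 0:
            return []
        start = bisect.bisect_right(self._keys, _hash(path))
        chosen = []
        # Walk clockwise from the file's position, skipping repeated nodes.
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == want:
                    break
        return chosen


def read_file(path, memory, snapshot, load_from_hdfs):
    """Look a file up layer by layer: memory, local snapshot, then HDFS."""
    if path in memory:
        return memory[path], "memory"
    if path in snapshot:
        data = snapshot[path]
        memory[path] = data  # promote to the in-memory layer
        return data, "snapshot"
    data = load_from_hdfs(path)  # slowest path: fetch from HDFS
    snapshot[path] = data
    memory[path] = data
    return data, "hdfs"
```

In this sketch a client would call `HashRing(nodes).replicas(path)` to find the three cache services holding a file, and each service would answer through `read_file`, so that a node failure or a cold memory layer degrades latency rather than availability.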