The file-bundle caching problem arises frequently in scientific applications in which a job processes several files concurrently. Consider a host system in a data grid that maintains a disk cache for servicing jobs, where each job requests a set of files (a file-bundle) and can be serviced only if all of its requested files are present in the disk cache. Files must therefore be admitted into the cache, and replaced, as file-bundles rather than individually. We show that traditional caching algorithms based on file popularity measures do not perform well in this setting, since they may hold in cache combinations of files that are not relevant to the jobs being requested. We present and analyze a new caching algorithm for maximizing the throughput of jobs and minimizing data replacement costs at such data-grid hosts. We tested the new algorithm using a disk cache simulation model under a wide range of conditions, including different file request distributions, cache sizes, and file size distributions. The results show significant improvement over traditional caching algorithms.
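The weakness of popularity-based caching described above can be illustrated with a small sketch. It is not the algorithm presented in the paper. It is a toy comparison under stated assumptions: every file has unit size, the workload is a fixed table of bundle request counts, and the names `jobs_served`, `popularity_fill`, and `bundle_greedy_fill` are hypothetical helpers introduced only for this example. The greedy heuristic simply admits whole bundles in order of requests gained per newly cached file.

```python
from collections import Counter

# Hypothetical workload: each bundle (set of files) maps to its request count.
workload = {
    frozenset({"a", "x"}): 2,
    frozenset({"a", "y"}): 2,
    frozenset({"b", "c"}): 3,
}


def jobs_served(cache, workload):
    """Count requests whose whole bundle is resident in the cache."""
    return sum(count for bundle, count in workload.items() if bundle <= cache)


def popularity_fill(workload, capacity):
    """Fill the cache with the individually most requested files."""
    popularity = Counter()
    for bundle, count in workload.items():
        for name in bundle:
            popularity[name] += count
    # Break ties by file name so the result is deterministic.
    ranked = sorted(popularity, key=lambda name: (-popularity[name], name))
    return frozenset(ranked[:capacity])


def bundle_greedy_fill(workload, capacity):
    """Admit whole bundles, best requests-per-new-file ratio first.

    Assumption: unit file sizes, so capacity is a number of files.
    """
    cache = set()
    while True:
        free = capacity - len(cache)
        best, best_key = None, None
        for bundle, count in workload.items():
            extra = len(bundle - cache)
            if extra == 0 or extra > free:
                continue  # already covered, or does not fit
            key = (count / extra, sorted(bundle))
            if best_key is None or key > best_key:
                best, best_key = bundle, key
        if best is None:
            return frozenset(cache)
        cache |= best


if __name__ == "__main__":
    for fill in (popularity_fill, bundle_greedy_fill):
        cache = fill(workload, capacity=2)
        print(fill.__name__, sorted(cache), jobs_served(cache, workload))
```

With a capacity of two files, the popularity-based fill keeps `a` and `b`, the two most requested files, yet no job needs exactly that pair, so no request can be serviced. The bundle-aware fill keeps `b` and `c` and services the three requests for that bundle.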