Optimal File-Bundle Caching Algorithms for Data-Grids
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
File caching in data intensive scientific applications on data-grids
DMG 2005 Proceedings of the First VLDB conference on Data Management in Grids
Caching techniques have been used widely to bridge the performance gaps of storage hierarchies in computing systems. Little is known about the impact of caching policies on the response times of jobs that access and process very large files in data grids, particularly when data and computations on the data have to be co-located on the same host. In data-intensive applications that access large data files over wide-area network environments, such as data grids, the combination of policies for job servicing (or scheduling), caching, and cache replacement can significantly impact the performance of grid jobs. We present some preliminary results of a simulation study that combines an admission policy with a cache replacement policy when servicing jobs submitted to a storage resource manager. The results show that, in comparison to a first-come, first-served policy, the response times of jobs are significantly improved, for practical limits of disk cache sizes, when the jobs that are back-logged to access the same files are taken into consideration in scheduling the next file to be retrieved into the disk cache. Not only are the response times of jobs improved, but the metrics for caching policies, such as the hit ratio and the average cost per retrieval, are also improved irrespective of the cache replacement policy.
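The idea of taking back-logged jobs into account can be illustrated with a toy sketch. It is not the simulation described in the abstract: it assumes that every job requests exactly one file, that all files have unit size, that all jobs are already queued, and that the cache uses LRU replacement. The function names (service_order_fcfs, service_order_backlog, count_retrievals) are hypothetical and chosen only for this illustration.

```python
from collections import Counter, OrderedDict


def service_order_fcfs(requests):
    """Serve queued file requests strictly in arrival order."""
    return list(requests)


def service_order_backlog(requests):
    """Serve the file with the most waiting jobs first (assumed policy).

    Ties are broken in favour of the file whose first request arrived
    earliest. All jobs waiting for the chosen file are served together.
    """
    pending = list(requests)
    order = []
    while pending:
        counts = Counter(pending)
        best = max(counts, key=lambda f: (counts[f], -pending.index(f)))
        order.extend(f for f in pending if f == best)
        pending = [f for f in pending if f != best]
    return order


def count_retrievals(order, cache_size):
    """Count cache misses (file retrievals) for unit-size files under LRU."""
    cache = OrderedDict()
    misses = 0
    for f in order:
        if f in cache:
            cache.move_to_end(f)  # mark as most recently used
        else:
            misses += 1
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict least recently used
            cache[f] = True
    return misses


if __name__ == "__main__":
    queued = ["a", "b", "c", "a", "b", "c"]
    print(count_retrievals(service_order_fcfs(queued), cache_size=2))
    print(count_retrievals(service_order_backlog(queued), cache_size=2))
```

With the six queued requests above and room for two files, arrival-order servicing retrieves a file on every request (6 retrievals), whereas grouping the jobs that wait on the same file needs only one retrieval per distinct file (3 retrievals). The example is deliberately small and says nothing about the magnitude of the improvements reported in the study.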