We consider a system in which many users run queries to examine subsets of a large object set. The object set is partitioned into files on tape. A single subset of objects will be visited by multiple queries in the workload. This locality of access creates the opportunity for caching on disk. We introduce and evaluate a novel optimisation, cache filtering, in which the 'hot' objects are automatically extracted from the files that are staged on disk and then cached separately in new files on disk. Cache filtering can lead to complex situations in the disk cache. We show that these do not prevent effective caching, and we introduce a special cache replacement algorithm to maximise efficiency. Through simulations we evaluate the system over a broad range of likely workloads. Depending on workload and system parameters, the cache filtering optimisation yields speedup factors of up to 6.
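As a rough illustration of the filtering step only, the sketch below extracts the 'hot' objects from one staged file into a new, smaller file. It is not the algorithm evaluated here: the abstract does not define how 'hot' is determined, so the access-count criterion, the `hot_threshold` parameter, and the names `filter_hot_objects`, `staged_file` and `access_counts` are all assumptions made for the example. The cache replacement algorithm is not sketched, since the abstract gives no details of it.

```python
from collections import Counter


def filter_hot_objects(staged_file, access_counts, hot_threshold=2):
    """Return the objects of a staged file that count as 'hot'.

    staged_file:   list of object ids held in a file staged from tape to disk.
    access_counts: Counter mapping object id -> number of queries visiting it.
    hot_threshold: hypothetical cut-off (an assumption, not from the abstract).
    """
    return [obj for obj in staged_file if access_counts[obj] >= hot_threshold]


# Toy workload: three queries, each visiting a subset of object ids.
queries = [{1, 2, 3}, {2, 3, 7}, {3, 8}]
access_counts = Counter(obj for query in queries for obj in query)

# One tape file staged on disk, and the new file holding only its hot objects.
staged_file = [1, 2, 3, 4, 5]
filtered_file = filter_hot_objects(staged_file, access_counts)
```

In this toy case the filtered file holds two of the five staged objects, so keeping it on disk in place of the original file would free cache space for other hot objects. That is the intuition behind the optimisation, under the assumptions stated above.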