DiskSeen: exploiting disk layout and access history to enhance I/O prefetch

Authors:
Xiaoning Ding;Song Jiang;Feng Chen;Kei Davis;Xiaodong Zhang
Affiliations:
CSE Department, Ohio State University, Columbus, OH;ECE Department, Wayne State University, Detroit, MI;CSE Department, Ohio State University, Columbus, OH;CCS-3 Division, Los Alamos National Laboratory, Los Alamos, NM;CSE Department, Ohio State University, Columbus, OH
Venue:
ATC'07: Proceedings of the 2007 USENIX Annual Technical Conference
Year:
2007

Abstract

Current disk prefetch policies in major operating systems track access patterns at the level of the file abstraction. While this is useful for exploiting application-level access patterns, file-level prefetching cannot realize the full performance improvement that prefetching can deliver, for two reasons. First, certain prefetch opportunities can be detected only with knowledge of the data layout on disk, such as the contiguous layout of file metadata or of data from multiple files. Second, nonsequential access to disk data (which requires disk head movement) is much slower than sequential access, so mis-prefetching a 'random' block is correspondingly more costly than mis-prefetching a sequential block. To overcome the inherent limitations of prefetching at the logical file level, we propose to perform prefetching directly at the level of disk layout, and in a portable way. Our technique, called DiskSeen, is intended to supplement, and to work synergistically with, file-level prefetch policies, if present. DiskSeen tracks the locations and access times of disk blocks and, based on analysis of their temporal and spatial relationships, seeks to improve the sequentiality of disk accesses and overall prefetching performance. Our implementation of the DiskSeen scheme in the Linux 2.6 kernel shows that it can significantly improve the effectiveness of prefetching, reducing execution times by 20%-53% for micro-benchmarks and real applications such as grep, CVS, and TPC-H.
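The idea of tracking block locations and access times can be illustrated with a small sketch. The Python below is not the DiskSeen implementation, which lives in the Linux 2.6 kernel and is not detailed in this abstract; it is a toy model under stated assumptions. The class name `BlockAccessTable` and the parameters `run_length`, `max_gap`, and `prefetch_depth` are hypothetical. It records a logical timestamp per block number and proposes prefetching the next blocks on disk once a short run of adjacent blocks has been accessed in ascending order and close together in time, regardless of which files the blocks belong to.

```python
class BlockAccessTable:
    """Toy model: remember when each disk block was last accessed.

    All names and thresholds are illustrative assumptions, not taken
    from the DiskSeen paper.
    """

    def __init__(self, run_length=3, max_gap=16, prefetch_depth=4):
        self.run_length = run_length          # adjacent blocks needed to trust a run
        self.max_gap = max_gap                # max ticks between neighbouring accesses
        self.prefetch_depth = prefetch_depth  # blocks to propose ahead of the run
        self.last_access = {}                 # block number -> logical timestamp
        self.clock = 0                        # logical time, one tick per access

    def record(self, lbn):
        """Record an access to block `lbn`; return block numbers to prefetch."""
        self.clock += 1
        self.last_access[lbn] = self.clock
        return self._suggest(lbn)

    def _suggest(self, lbn):
        # Walk backwards over the blocks that precede `lbn` on disk. Each one
        # must have been accessed earlier than its successor (temporal order)
        # and not too long before it (temporal closeness).
        later = self.last_access[lbn]
        for back in range(1, self.run_length):
            earlier = self.last_access.get(lbn - back)
            if earlier is None or earlier >= later or later - earlier > self.max_gap:
                return []
            later = earlier
        # The run looks sequential on disk, so propose the blocks that follow it.
        return list(range(lbn + 1, lbn + 1 + self.prefetch_depth))


if __name__ == "__main__":
    table = BlockAccessTable()
    # Two interleaved streams: blocks 100.. and 500.. (e.g. two different files).
    for lbn in (100, 500, 101, 501, 102):
        print(lbn, table.record(lbn))
```

In this toy run, the access to block 102 completes a run of three adjacent blocks even though accesses to another region are interleaved with it, so the sketch proposes the blocks that follow. A per-file sequential detector would see each stream separately; a layout-level table sees only block numbers and times, which is the property the abstract emphasizes.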