Parallel file systems have been developed in recent years to ease the I/O bottleneck of high-end computing systems. These advanced file systems offer several data layout strategies to meet the performance goals of specific I/O workloads. However, a layout policy that performs well for one I/O workload may perform poorly for another, and because data access patterns are complex and application dependent, peak I/O performance is rarely achieved. In this study, we propose a cost-intelligent data access strategy, based on the application-specific optimization principle, to improve the I/O performance of parallel file systems. We first present examples that illustrate how performance differs under different data layouts. We then develop a cost model that estimates the completion time of data accesses under various data layouts, so that the layout can be chosen to better match the application. Static layout optimization can be used for applications with dominant data access patterns, and dynamic layout selection with hybrid replications can be used for applications with complex I/O patterns. Theoretical analysis and experimental testing have been conducted to verify the proposed cost-intelligent layout approach. Analytical and experimental results show that the proposed cost model is effective and that the application-specific data layout approach can improve performance by up to 74% for data-intensive applications.
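The idea of choosing a layout by estimated completion time can be illustrated with a small sketch. This is not the cost model developed in the study. It is a deliberately simplified stand-in in which the layout names (`striped`, `grouped`, `per_client`), the parameters (`connect_s`, `startup_s`, `bytes_per_s`), and the cost formula are all assumptions made for illustration. It only shows the general shape of the approach: estimate a completion time per candidate layout, then pick the minimum.

```python
import math
from dataclasses import dataclass

# Hypothetical layout names, used only for this illustration.
LAYOUTS = ("striped", "grouped", "per_client")


@dataclass(frozen=True)
class Workload:
    clients: int         # concurrent client processes
    request_bytes: int   # bytes requested by each client


@dataclass(frozen=True)
class System:
    servers: int         # number of I/O servers
    connect_s: float     # assumed time to set up one client-server connection
    startup_s: float     # assumed per-request startup (e.g. seek) time on a server
    bytes_per_s: float   # assumed per-server transfer rate


def servers_per_request(layout: str, system: System) -> int:
    """How many servers one client's request is spread over (assumed)."""
    if layout == "striped":
        return system.servers
    if layout == "grouped":
        return max(1, system.servers // 2)
    if layout == "per_client":
        return 1
    raise ValueError(f"unknown layout: {layout}")


def estimate_time(layout: str, work: Workload, system: System) -> float:
    """Toy completion-time estimate: connection setup plus server service time."""
    k = servers_per_request(layout, system)
    # Each client issues k sub-requests; assume they spread evenly over servers.
    subrequests_per_server = math.ceil(work.clients * k / system.servers)
    bytes_per_subrequest = work.request_bytes / k
    connect = k * system.connect_s
    service = subrequests_per_server * (
        system.startup_s + bytes_per_subrequest / system.bytes_per_s
    )
    return connect + service


def choose_layout(work: Workload, system: System, layouts=LAYOUTS) -> str:
    """Pick the candidate layout with the smallest estimated completion time."""
    return min(layouts, key=lambda name: estimate_time(name, work, system))


if __name__ == "__main__":
    system = System(servers=4, connect_s=0.001, startup_s=0.005, bytes_per_s=100e6)
    for work in (Workload(1, 400_000_000), Workload(16, 65_536)):
        print(work, "->", choose_layout(work, system))
```

Under these assumed parameters, a single client reading a large request favors the striped layout, because the transfer is parallelized across servers. Many clients issuing small requests favor the per-client layout, because per-request startup costs dominate. This mirrors the observation that no single layout suits every workload.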