Data-intensive scientific applications running on high-end computing systems depend on parallel file systems for high-speed data input/output. In most parallel file systems, a file is partitioned into multiple subfiles so that it can be accessed concurrently. An important parameter of this partitioning is the stripe size. Most existing schemes for determining the stripe size of a file work well for certain applications but cannot handle highly concurrent data accesses, which are typical of most parallel scientific applications. To address this problem, this paper first presents an analytic model for assessing the performance of highly concurrent data accesses and then describes how to apply the model to select the stripe size of a file. Experimental results show that the accuracy of the analytic model is around 87.89% and that the stripe size selected with it can improve the aggregate I/O bandwidth of FLASH I/O by up to 5.8 times compared with well-known methods. The paper also discusses how to incorporate the method into real-world parallel file systems.
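To make the role of the stripe size concrete, the following minimal sketch shows how a byte range maps onto I/O servers under plain round-robin striping. It is an illustration only, not the analytic model from the paper: the function name `servers_touched` and its parameters are hypothetical, and real parallel file systems may use other layouts.

```python
def servers_touched(offset, length, stripe_size, num_servers):
    """Return the sorted server indices that a byte range touches.

    Assumes simple round-robin striping (an assumption for illustration):
    stripe k of the file is stored on server k % num_servers.
    """
    if length <= 0:
        return []
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    # Once a request spans at least num_servers stripes, every server is hit.
    if last_stripe - first_stripe + 1 >= num_servers:
        return list(range(num_servers))
    return sorted({k % num_servers for k in range(first_stripe, last_stripe + 1)})


if __name__ == "__main__":
    one_mib = 1024 * 1024
    # The same 1 MiB request with a small and a large stripe size.
    print(servers_touched(0, one_mib, 64 * 1024, 4))
    print(servers_touched(0, one_mib, 4 * one_mib, 4))
```

With a small stripe size, a single request spreads across all servers, which gives that request more parallelism but also means many concurrent requests contend for every server. With a large stripe size, each request stays on one server. This trade-off is why the choice of stripe size matters under highly concurrent access.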