Competing workloads on a shared storage system cause I/O resource contention and unpredictable application performance. This problem is already evident in today's HPC storage systems and is likely to become acute at exascale. Alleviating the I/O bottleneck requires system software tools that are aware of application I/O requirements, moving towards I/O-aware job scheduling. However, this in turn requires rich techniques to capture application I/O characteristics, which remain elusive in production systems. Traditionally, I/O characteristics have been obtained using client-side tracing tools, with drawbacks such as non-trivial instrumentation and development costs, large trace traffic, and inconsistent adoption. We present a novel approach, I/O Signature Identifier (IOSI), to characterize the I/O behavior of data-intensive applications. IOSI extracts signatures from noisy, zero-overhead server-side I/O throughput logs that are already collected on today's supercomputers, without interfering with the compilation or execution of applications. We evaluated IOSI using the Spider storage system at Oak Ridge National Laboratory, the S3D turbulence application (running on 18,000 Titan nodes), and benchmark-based pseudo-applications. Our experiments confirmed that IOSI effectively extracts an application's I/O signature despite significant server-side noise. Compared to client-side tracing tools, IOSI is transparent, interface-agnostic, and incurs no overhead. Compared to alternative data alignment techniques (e.g., dynamic time warping), it offers higher signature accuracy and shorter processing time.
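For readers unfamiliar with the baseline named above, the following is a minimal sketch of classic dynamic time warping (DTW) applied to two throughput series. It is not IOSI's method and is not taken from the paper; the function name `dtw_distance` and the sample values are hypothetical, chosen only to show how DTW aligns two runs whose I/O bursts are shifted in time.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    a, b: sequences of numeric throughput samples (hypothetical units).
    Returns the minimum cumulative absolute difference over all
    monotonic alignments of the two sequences.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance a only, advance b only, advance both
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical samples: the same I/O burst observed one interval later.
run_1 = [0, 5, 0]
run_2 = [0, 0, 5, 0]
shifted_burst_distance = dtw_distance(run_1, run_2)
```

Because DTW may stretch either series, the shifted burst in `run_2` aligns with the burst in `run_1` at no cost. The quadratic table is one reason processing time matters when such alignment is applied to long server-side logs.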