Performance evaluations of large-scale systems require representative workloads whose characteristics are certifiably similar or dissimilar. To quantify this similarity, we describe a novel measure comprising two efficient methods that are suitable for large-scale workloads. The first method uses the discrete wavelet transform to assess the periodic time and frequency characteristics of the workload. The second method evaluates dependencies among descriptive attributes via association rule learning. Both methods are evaluated to find the limits of their similarity spaces. Additionally, the wavelet method is evaluated against existing similarity methods and tested for noise robustness and random bias. An empirical study using workloads from seven operational large-scale systems evaluates the measure's accuracy. The results show that our measure is highly resistant to noise, well-suited for large-scale workloads, covers 87% of the possible similarity space, and improves accuracy by 24.5% and standard deviation by 10.8% when compared to existing work.
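To make the wavelet idea concrete, the following is a minimal sketch of one way to compare two workload time series through a discrete wavelet transform. It is not the measure described above: the choice of the Haar wavelet, the per-level energy profile, the Euclidean distance, and the function names `haar_dwt`, `energy_profile`, and `profile_distance` are all assumptions made for illustration. Both series are assumed to have the same power-of-two length.

```python
import math


def haar_dwt(signal):
    """Full orthonormal Haar decomposition (illustrative, not the paper's method).

    Returns (approx, details): the final approximation as a one-element list,
    and the detail coefficients per level, finest level first.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    root2 = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Differences capture variation at this scale; sums carry on to the next.
        details.append([(a - b) / root2 for a, b in pairs])
        approx = [(a + b) / root2 for a, b in pairs]
    return approx, details


def energy_profile(signal):
    """Share of detail energy found at each scale (assumed summary statistic)."""
    _, details = haar_dwt(signal)
    energies = [sum(c * c for c in level) for level in details]
    total = sum(energies)
    if total == 0:
        return [0.0] * len(energies)
    return [e / total for e in energies]


def profile_distance(x, y):
    """Euclidean distance between energy profiles; 0 means identical profiles."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    px, py = energy_profile(x), energy_profile(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(px, py)))


# Hypothetical job-arrival counts per time slot, with periods of 2 and 4 slots.
fast = [1, 0, 1, 0, 1, 0, 1, 0]
slow = [1, 1, 0, 0, 1, 1, 0, 0]
print(profile_distance(fast, fast))
print(profile_distance(fast, slow))
```

In this toy setup, two series with the same periodicity concentrate their energy at the same scale and yield a small distance, while series with different periods place their energy at different scales. A library such as PyWavelets could replace the hand-written transform; it is avoided here only to keep the sketch dependency-free.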