Identification, Modelling and Prediction of Non-periodic Bursts in Workloads
CCGRID '10 Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
A similarity measure for time, frequency, and dependencies in large-scale workloads
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
We present a probabilistic tracing method that captures both user and system behaviour for large-scale distributed applications. Our method extends the notion of data stream monitoring to work within what we define as concealed environments. We detail the conceptual design and implementation of our method. Additionally, we evaluate the scalability of the tracing method in a real petabyte-scale distributed data management system. Finally, we demonstrate the usefulness of the collected trace data in three scenarios. First, we use collected trace data to examine the arrival of user events and find self-similar processes. Second, we examine the behaviour and performance of mass storage systems in a grid under concurrent requests. Third, we develop a model for prediction of user event arrivals based on historical data. Our results suggest that a probabilistic tracing method is scalable, straightforward to integrate with existing applications, and provides useful insight into the behaviour of very large-scale applications.
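The abstract does not spell out how the probabilistic tracing works, so the following is only an illustrative sketch of one common reading: each event is recorded independently with a fixed probability p, and per-type counts are scaled by 1/p to estimate the true volume. The function names, the (kind, payload) event shape, and the sampling scheme itself are assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import Counter


def sample_events(events, p, seed=None):
    """Keep each event independently with probability p (Bernoulli sampling).

    `events` is an iterable of (kind, payload) tuples; this shape is a
    hypothetical choice for the sketch, not taken from the paper.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    rng = random.Random(seed)  # local generator so runs are reproducible
    return [event for event in events if rng.random() < p]


def estimate_counts(sampled, p):
    """Estimate true per-kind event counts by scaling sampled counts by 1/p."""
    counts = Counter(kind for kind, _ in sampled)
    return {kind: n / p for kind, n in counts.items()}


if __name__ == "__main__":
    # Synthetic workload: 1000 reads and 500 writes.
    events = [("read", i) for i in range(1000)] + [("write", i) for i in range(500)]
    traced = sample_events(events, p=0.1, seed=42)
    print(len(traced), "of", len(events), "events traced")
    print(estimate_counts(traced, p=0.1))
```

Under this assumption the tracing cost grows with p times the event rate rather than the full rate, which is one plausible reason such a scheme could scale to very large systems. The trade-off is that estimates for rare event types become noisy as p shrinks.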