An analytical model for multi-tier internet services and its applications
SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
DataSeries: an efficient, flexible data format for structured serial data
ACM SIGOPS Operating Systems Review
Profiling and modeling resource usage of virtualized applications
Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware
Capture, conversion, and analysis of an intense NFS workload
FAST '09 Proceedings of the 7th conference on File and storage technologies
Modern IT environments collect and analyze increasingly large volumes of data for a growing number of purposes (e.g., automated management, security, and regulatory compliance). At the same time, such environments are challenged by the need to minimize their environmental footprints. A general solution to this problem is to utilize IT resources more efficiently. This paper describes our work to systematically evaluate the inefficiencies in the information collection and analysis of several widely used IT applications, to implement a more efficient solution, and to quantify the improvements. In particular, we convert the logging of HTTP transactions by the Apache Web server and of network events by the Bro intrusion detection system from text files to DataSeries. We thoroughly evaluate and compare the costs of recording, storing, and analyzing the information in the different formats. We converted the text logs to DataSeries online, with no discernible overhead on the logging applications. We achieved up to a 7x decrease in logfile sizes relative to the default text logs, and speedups of 3x-8.4x in analyzing the logfiles.
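To make the idea of replacing text logs with typed, structured records concrete, the following is a minimal sketch in Python. It does not use DataSeries or its API; it only illustrates the general technique with the standard library. The record layout (`RECORD`), the function names `parse_line` and `pack_record`, and the choice to keep only four numeric fields are assumptions made for illustration, and the log line is assumed to follow the Apache Common Log Format with an IPv4 client address.

```python
import re
import socket
import struct
from datetime import datetime

# Apache Common Log Format: host ident user [time] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

# Hypothetical fixed-width layout (network byte order, no padding):
# IPv4 address (4 bytes), Unix timestamp (4), HTTP status (2), response size (4).
RECORD = struct.Struct("!4sIHI")


def parse_line(line):
    """Parse one Common Log Format line into a dict of typed fields."""
    m = CLF_PATTERN.match(line)
    if m is None:
        raise ValueError("not a Common Log Format line: %r" % line)
    ts = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
    return {
        "host": m["host"],
        "timestamp": int(ts.timestamp()),
        "method": m["method"],
        "path": m["path"],
        "status": int(m["status"]),
        # Apache writes "-" when no response body was sent.
        "size": 0 if m["size"] == "-" else int(m["size"]),
    }


def pack_record(rec):
    """Pack the numeric fields into a fixed-width binary record.

    The request method and path are dropped here to keep the sketch short;
    a real format would store them too, for example as variable-length strings.
    """
    return RECORD.pack(
        socket.inet_aton(rec["host"]),
        rec["timestamp"],
        rec["status"],
        rec["size"],
    )


if __name__ == "__main__":
    line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
            '"GET /apache_pb.gif HTTP/1.0" 200 2326')
    packed = pack_record(parse_line(line))
    print(len(line), "bytes as text;", len(packed), "bytes packed")
```

Because this sketch discards the request string, its size difference is not comparable to the reductions reported above. It only shows why typed fields are smaller than text and can be read back without parsing it again.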