Using Grid services to parallelize IBM's Generic Log Adapter

Authors:
Fatos Xhafa;Claudi Paniagua;Leonard Barolli;Santi Caballé
Affiliations:
Department of Computer Science and Information Systems, Birkbeck, University of London, 23-29 Emerald Street, London WC1N 3QS, United Kingdom;IBM GTS, Virtualization and Grid Computing, Barcelona, Spain;Department of Information and Communication Engineering, Fukuoka Institute of Technology (FIT), 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka 811-0295, Japan;Department of Information Sciences, UOC, Av. Tibidabo 39-43, 08035 Barcelona, Spain
Venue:
Journal of Systems and Software
Year:
2011

Abstract

Since their definition in the Open Grid Services Architecture, Grid services have been used in many Grid-enabled applications to leverage the computational power offered by Grid systems. An important research issue in this regard is how to increase the efficiency of Grid services for the massive processing and scientific computing workloads that arise in data-intensive computations, for example the processing of large log data files for "problem determination" in today's IT computing environments. In this paper we present an approach that uses Grid services to efficiently parallelize IBM's Generic Log Adapter (GLA). GLA is a generic parsing engine shipped with IBM's Autonomic Computing Toolkit that was conceived to convert proprietary log data into a standard event-based log data format in real time. However, in order to provide generic support for parsing the majority of today's unstructured log data formats, the GLA makes heavy use of regular expressions, which incur performance limitations. Until now, all the approaches proposed to increase GLA's performance have revolved around fine-tuning the set of regular expressions used to configure the GLA for a particular log data format, or around writing specific parsing code. In this work we propose a new approach that consists of transparently parallelizing the GLA by taking advantage of its internal architecture and the fact that structuring log data is a task that lends itself very well to parallelization. We present a Master-Worker strategy that uses Grid services to parallelize GLA efficiently and in a way that is completely transparent to the user.
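
The Master-Worker idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and uses neither Grid services nor the GLA itself: a local thread pool stands in for the remote workers, and the log format, the regular expression `LINE_RE`, and the function names `parse_chunk`, `split_into_chunks`, and `parse_log` are all hypothetical. The sketch also assumes that every log record occupies exactly one line, so the master can split the input on line boundaries.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical log format (an assumption, not a GLA configuration):
# "<date> <time> <SEVERITY> <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+ \S+) (?P<severity>[A-Z]+) (?P<message>.*)$"
)


def parse_chunk(lines):
    """Worker: convert each raw line into a structured event dict."""
    events = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            events.append(match.groupdict())
        else:
            # Keep unparseable lines instead of silently dropping them.
            events.append(
                {"timestamp": None, "severity": "UNKNOWN", "message": line}
            )
    return events


def split_into_chunks(lines, chunk_size):
    """Master, step 1: cut the input into independent work units."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def parse_log(lines, workers=4, chunk_size=1000):
    """Master, steps 2 and 3: dispatch chunks, then merge results in order."""
    chunks = split_into_chunks(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so the merged
        # event stream keeps the order of the original log.
        results = pool.map(parse_chunk, chunks)
        return [event for chunk_events in results for event in chunk_events]
```

Because the caller only sees `parse_log`, the parallelism stays hidden behind the same interface a sequential parser would offer, which is the sense of "transparent" this sketch tries to capture. Threads in CPython will not speed up regex matching much; in a real deployment the workers would be separate processes or remote services.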