The Weka4WS framework for distributed data mining in service-oriented Grids

  • Authors:
  • Domenico Talia; Paolo Trunfio; Oreste Verta

  • Affiliations:
  • DEIS, University of Calabria, Via P. Bucci 41C, 87036 Rende (CS), Italy; DEIS, University of Calabria, Via P. Bucci 41C, 87036 Rende (CS), Italy; DEIS, University of Calabria, Via P. Bucci 41C, 87036 Rende (CS), Italy

  • Venue:
  • Concurrency and Computation: Practice & Experience
  • Year:
  • 2008

Quantified Score

Hi-index 0.02

Abstract

The service-oriented architecture paradigm can be exploited for the implementation of data and knowledge-based applications in distributed environments. The Web services resource framework (WSRF) has recently emerged as the standard for the implementation of Grid services and applications. WSRF can be exploited for developing high-level services for distributed data mining applications. This paper describes Weka4WS, a framework that extends the widely used open source Weka toolkit to support distributed data mining on WSRF-enabled Grids. Weka4WS adopts the WSRF technology for running remote data mining algorithms and managing distributed computations. The Weka4WS user interface supports the execution of both local and remote data mining tasks. On every computing node, a WSRF-compliant Web service is used to expose all the data mining algorithms provided by the Weka library. The paper describes the design and implementation of Weka4WS using the WSRF libraries and services provided by Globus Toolkit 4. A performance analysis of Weka4WS for executing distributed data mining tasks in different network scenarios is presented. Copyright © 2008 John Wiley & Sons, Ltd.
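
The split between local and remote execution described in the abstract can be illustrated with a minimal sketch. It is not the Weka4WS implementation and makes no WSRF or Globus Toolkit calls: the names `MiningNode`, `Client`, `ALGORITHMS`, and `grid-node-1` are hypothetical, remote nodes are simulated in-process, and the only algorithm is a majority-class baseline in the spirit of Weka's ZeroR classifier. The point is only that a client can submit the same task to a local executor or to a named node through one interface.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Dataset = List[dict]


def zero_r(dataset: Dataset, target: str) -> str:
    """Majority-class baseline: return the most frequent target value."""
    counts = Counter(row[target] for row in dataset)
    return counts.most_common(1)[0][0]


# Hypothetical registry of algorithms that every node exposes by name.
ALGORITHMS: Dict[str, Callable[[Dataset, str], str]] = {"zero_r": zero_r}


@dataclass
class MiningNode:
    """Stand-in for the service on one computing node (simulated in-process)."""
    name: str

    def execute(self, algorithm: str, dataset: Dataset, target: str) -> dict:
        model = ALGORITHMS[algorithm](dataset, target)
        return {"node": self.name, "model": model}


class Client:
    """Stand-in for the user-facing side: runs a task locally or on a named node."""

    def __init__(self, remote_nodes: List[MiningNode]) -> None:
        self.local = MiningNode("local")
        self.remote = {node.name: node for node in remote_nodes}

    def run(self, algorithm: str, dataset: Dataset, target: str,
            node: Optional[str] = None) -> dict:
        # No node name means local execution; otherwise dispatch to that node.
        chosen = self.local if node is None else self.remote[node]
        return chosen.execute(algorithm, dataset, target)


data = [
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "overcast", "play": "yes"},
]
client = Client([MiningNode("grid-node-1")])
local_result = client.run("zero_r", data, "play")
remote_result = client.run("zero_r", data, "play", node="grid-node-1")
```

In a real deployment the `execute` call on a remote node would be a network invocation with its own data transfer and failure modes, which is where the network scenarios mentioned in the abstract would matter; the sketch deliberately leaves that out.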