Summary Creation for Information Discovery in Distributed Systems

  • Authors:
  • Agustin C. Caminero; Eduardo Huedo; Omer Rana; Ignacio M. Llorente; Blanca Caminero; Carmen Carrion

  • Venue:
  • PDP '11 Proceedings of the 2011 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing
  • Year:
  • 2011

Abstract

In current distributed systems, such as Grids, Clouds, or P2P systems, the amount of information to be handled influences how the system is managed. In P2P systems containing large quantities of data, or in Grid systems containing a large number of (often heterogeneous) resources, information about data or resources must be spread through the system efficiently so that they can be found. This paper presents an information discovery technique based on data summarization via clustering. The resulting summaries can be used to classify information, giving users greater insight into documents or computing resources than raw data does. Meta-schedulers or brokers would also benefit from the proposed technique because they would have to deal with less data from resources, which aids the scalability of the system. An evaluation of the approach is then provided to identify the impact of choosing particular parameters to be used as part of the summary.
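
As a rough illustration of summarization via clustering, the sketch below groups resource attribute vectors and keeps only one centroid and one count per group, so a broker would handle k summaries instead of n raw records. This is a minimal sketch under stated assumptions, not the method evaluated in the paper: the abstract does not name a clustering algorithm, so plain k-means is used as a stand-in, and the names `ClusterSummary` and `summarize`, as well as the choice of attributes (for example CPU cores and memory), are hypothetical.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class ClusterSummary:
    centroid: tuple  # mean attribute vector of the resources in the cluster
    size: int        # number of resources this summary stands for


def summarize(resources, k, iterations=20):
    """Return one summary per non-empty cluster of attribute vectors.

    Assumption: plain k-means stands in for the clustering step;
    the source abstract does not specify the algorithm.
    """
    if not 1 <= k <= len(resources):
        raise ValueError("k must be between 1 and the number of resources")
    # Deterministic initialisation: the first k records seed the centroids.
    centroids = [tuple(r) for r in resources[:k]]
    for _ in range(iterations):
        # Assign every resource to its nearest centroid.
        groups = [[] for _ in range(k)]
        for r in resources:
            nearest = min(range(k), key=lambda i: dist(r, centroids[i]))
            groups[nearest].append(r)
        # Recompute each centroid as the mean of its group;
        # an empty group keeps its previous centroid.
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return [ClusterSummary(c, len(g)) for c, g in zip(centroids, groups) if g]
```

Calling `summarize(records, k=2)` on a list of `(cpu_cores, memory_gb)` tuples would return at most two `ClusterSummary` objects. Which attributes go into the vectors, and how many clusters to keep, correspond to the kind of summary parameters whose impact the abstract says is evaluated.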