A model for fast web mining prototyping

  • Authors:
  • Álvaro Pereira; Ricardo Baeza-Yates; Nivio Ziviani; Jesús Bisbal

  • Affiliations:
  • Federal Univ. of Minas Gerais, Belo Horizonte, Brazil; Yahoo! Research & Barcelona Media, Barcelona, Spain; Federal Univ. of Minas Gerais, Belo Horizonte, Brazil; Universitat Pompeu Fabra, Barcelona, Spain

  • Venue:
  • Proceedings of the Second ACM International Conference on Web Search and Data Mining
  • Year:
  • 2009

Abstract

Web mining is a computation-intensive task, even after the mining tool itself has been developed. Most mining software is developed ad hoc and is usually neither scalable nor reused for other mining tasks. The objective of this paper is to present a model for fast Web mining prototyping, referred to as WIM (Web Information Mining). The underlying conceptual model of WIM provides its users with a level of abstraction appropriate for prototyping and experimentation throughout the Web data mining task. Abstracting from the idiosyncrasies of raw Web data representations facilitates the inherently iterative mining process. We present the WIM conceptual model, its associated algebra, and the WIM tool software architecture, which implements the WIM model. We also illustrate how the model can be applied to real Web data mining tasks. Experimentation with WIM in real use cases has shown that it significantly facilitates Web mining prototyping.