Advanced Data Preprocessing for Intersites Web Usage Mining

  • Authors:
  • Doru Tanasa; Brigitte Trousse

  • Venue:
  • IEEE Intelligent Systems
  • Year:
  • 2004

Abstract

In recent years, Web usage mining (WUM) has emerged as a new field of data mining and has gained increasing attention from both the business and research communities. A particularly important area is data preprocessing for Intersites WUM. The proposed methodology for this process has two main objectives. The first is to use classical preprocessing (data fusion, data cleaning, and data structuration) to significantly reduce the size of the Web servers' log files while keeping the relevant data. The second is to use advanced data preprocessing, which adds a step called data summarization, to increase the quality of the data obtained after classical preprocessing. To validate the methodology's efficiency, an experiment joined and analyzed log files from four related servers.
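
The three classical preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout `(server, client_ip, timestamp, url)`, the list of ignored file suffixes, the 30-minute session timeout, and the function names `fuse`, `clean`, and `structure` are all assumptions made for illustration. The data summarization step is not sketched because the abstract does not describe it.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Hypothetical record layout: (server, client_ip, timestamp, url).
# Real logs would first be parsed from each server's access log.

# Assumed cleaning rule: requests for embedded resources are not page views.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")

# Assumed inactivity threshold; the abstract does not state one.
SESSION_TIMEOUT = timedelta(minutes=30)


def fuse(*logs):
    """Data fusion: merge per-server logs into one list ordered by time."""
    return sorted((rec for log in logs for rec in log), key=lambda r: r[2])


def clean(records):
    """Data cleaning: drop requests for embedded resources."""
    return [r for r in records if not r[3].lower().endswith(IGNORED_SUFFIXES)]


def structure(records):
    """Data structuration: split each client's requests into sessions."""
    sessions = []
    by_client = sorted(records, key=lambda r: (r[1], r[2]))
    for _, requests in groupby(by_client, key=lambda r: r[1]):
        current = []
        for rec in requests:
            # Start a new session after a long gap from the previous request.
            if current and rec[2] - current[-1][2] > SESSION_TIMEOUT:
                sessions.append(current)
                current = []
            current.append(rec)
        sessions.append(current)
    return sessions
```

Calling `structure(clean(fuse(log_a, log_b)))` on two per-server lists yields sessions that can span both servers, which is the point of fusing the logs before sessions are identified.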