Intelligent Web mining

  • Authors:
  • Ernestina Menasalvas;Oscar Marbán;Socorro Millán;Jose M. Peña

  • Affiliations:
  • DLSIIS Facultad de Informática, U.P.M., Madrid. Spain;Departamento de Informática, Universidad Carlos III, Madrid. Spain;Universidad del Valle, Cali. Colombia;DATSI Facultad de Informática, U.P.M., Madrid. Spain

  • Venue:
  • Intelligent exploration of the web
  • Year:
  • 2003

Abstract

Explosive growth in the size and usage of the World Wide Web has made it necessary for Web site administrators to track and analyze the navigation patterns of Web site visitors. However, data mining techniques are not easily applicable to Web data, owing to problems related both to the technology underlying the Web and to the lack of standards in the design and implementation of Web pages. Information collected by Web servers and kept in the server log is the main source of data for analyzing user navigation patterns.

Once logs have been preprocessed and sessions have been obtained, several kinds of access pattern mining can be performed, depending on the needs of the analyst. It is important to mention that most efforts have relied on relatively simple techniques, which can be inadequate for real user profile data, since noise in the data has to be tackled first. Thus, there is a need for robust methods that integrate different intelligent techniques and that are free of any assumptions about the noise contamination rate.

In this paper, the problem of mining behavior patterns on the Web is studied in detail, and different approaches to solving the problem are analyzed. An algorithm is given to calculate frequent access patterns. This algorithm is based on a model structure, called the WPC-Tree, that stores in each node relevant information about pages, making it possible to apply data mining techniques to obtain useful patterns.
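As a rough illustration of the pipeline the abstract describes (preprocessed log entries, then sessions, then frequent access patterns), the following minimal Python sketch may help. It is not the paper's WPC-Tree algorithm, whose details are not given here. It assumes log entries have already been reduced to hypothetical `(visitor, timestamp, page)` tuples, uses an assumed 30-minute inactivity timeout to split sessions, and simply counts contiguous page sequences that occur in at least `min_support` sessions. All function names and the sample data are illustrative.

```python
from collections import Counter, defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed inactivity threshold


def sessionize(hits, timeout=SESSION_TIMEOUT):
    """Group (visitor, timestamp, page) hits into per-visitor sessions.

    A new session starts whenever the gap between two consecutive hits
    of the same visitor exceeds `timeout` seconds.
    """
    by_visitor = defaultdict(list)
    for visitor, ts, page in hits:
        by_visitor[visitor].append((ts, page))

    sessions = []
    for visitor in sorted(by_visitor):
        current, last_ts = [], None
        for ts, page in sorted(by_visitor[visitor]):
            if last_ts is not None and ts - last_ts > timeout:
                sessions.append(current)
                current = []
            current.append(page)
            last_ts = ts
        if current:
            sessions.append(current)
    return sessions


def frequent_paths(sessions, length=2, min_support=2):
    """Return contiguous page sequences of the given length that appear
    in at least `min_support` sessions (counted once per session)."""
    counts = Counter()
    for pages in sessions:
        seen = {tuple(pages[i:i + length])
                for i in range(len(pages) - length + 1)}
        counts.update(seen)
    return {path: n for path, n in counts.items() if n >= min_support}


# Illustrative data only: visitor "a" returns after a long gap.
hits = [
    ("a", 0, "/home"), ("a", 60, "/products"), ("a", 120, "/cart"),
    ("b", 10, "/home"), ("b", 70, "/products"),
    ("a", 4000, "/home"), ("a", 4050, "/products"),
]
sessions = sessionize(hits)
patterns = frequent_paths(sessions)
print(sessions)
print(patterns)
```

A fixed timeout and exact sequence counting are among the "relatively simple techniques" the abstract refers to; they make no attempt to handle noisy data, which is the gap the paper sets out to address.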