The use of web structure and content to identify subjectively interesting web usage patterns

  • Authors: Robert Cooley
  • Affiliations: KXEN, Inc., San Francisco, CA
  • Venue: ACM Transactions on Internet Technology (TOIT)
  • Year: 2003

Abstract

The discipline of Web Usage Mining has grown rapidly in the past few years, despite the crash of the e-commerce boom of the late 1990s. Web Usage Mining is the application of data mining techniques to Web clickstream data in order to extract usage patterns. Yet, with all of the resources put into the problem, claims of success have been limited and are often tied to specific Web site properties that are not found in general. One reason for the limited success has been a component of Web Usage Mining that is often overlooked: the need to understand the content and structure of a Web site. The processing and quantification of a Web site's content and structure for all but completely static and single-frame Web sites is arguably one of the most difficult tasks to automate in the Web Usage Mining process. This article shows that, not only is the Web Usage Mining process enhanced by content and structure, it cannot be completed without them. The results of experiments run on data from a large e-commerce site are presented to show that proper preprocessing cannot be completed without the use of Web site content and structure, and that the effectiveness of pattern analysis is greatly enhanced.