On learning to predict web traffic

  • Authors: Selwyn Piramuthu
  • Affiliations: Decision and Information Sciences, University of Florida, 351 STZ, Gainesville, FL
  • Venue: Decision Support Systems - Special issue: Web data mining
  • Year: 2003

Abstract

The ease of collecting data about customers through the Internet has facilitated the process of developing large repositories of data. These data can and do contain patterns that are useful for the decision maker. Knowledge discovery and data mining methods have been widely used to extract these patterns. It is acknowledged that about 80% of the resources in a majority of data mining applications are spent on cleaning and preprocessing the data. However, there have been relatively few studies on preprocessing data used as input in these data mining systems. In this study, we present a feature selection method based on the Hausdorff distance measure, and evaluate its effectiveness in preprocessing input data for inducing decision trees. Message traffic data from a Web site are used to illustrate performance of the proposed method.
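The abstract does not spell out how the Hausdorff distance is turned into a feature selection criterion, so the following is only a minimal sketch under stated assumptions, not the paper's procedure. It assumes a two-class problem with numeric features, rescales each feature to [0, 1], and scores a feature by the symmetric Hausdorff distance between the values it takes in the two classes (a larger distance is read as better class separation). The function names `directed_hausdorff`, `hausdorff`, and `rank_features` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def directed_hausdorff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(abs(x - y) for y in b) for x in a)


def hausdorff(a: Sequence[float], b: Sequence[float]) -> float:
    """Symmetric Hausdorff distance between two finite sets of reals."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def rank_features(rows: List[Sequence[float]], labels: Sequence[int]) -> List[int]:
    """Return feature indices ordered from highest to lowest score.

    Assumption (not from the paper): labels are 0/1, both classes are
    present, and a feature's score is the Hausdorff distance between
    its min-max scaled values in class 1 and in class 0.
    """
    n_features = len(rows[0])
    scores = {}
    for j in range(n_features):
        column = [row[j] for row in rows]
        low, high = min(column), max(column)
        span = (high - low) or 1.0  # avoid dividing by zero for constant features
        scaled = [(v - low) / span for v in column]
        pos = [v for v, y in zip(scaled, labels) if y == 1]
        neg = [v for v, y in zip(scaled, labels) if y == 0]
        scores[j] = hausdorff(pos, neg)
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[0.0, 5.0], [0.1, 1.0], [0.9, 5.0], [1.0, 1.0]]
labels = [0, 0, 1, 1]
ranking = rank_features(rows, labels)
```

In a pipeline of the kind the abstract describes, the top-ranked features would then be passed to a decision tree learner. How many features to keep is a separate choice that this sketch leaves open.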