Data mining of web access logs from an academic web site

  • Authors: Vic Ciesielski; Anand Lalani

  • Affiliations: Department of Computer Science and Information Technology, RMIT University (both authors)

  • Venue: Design and application of hybrid intelligent systems
  • Year: 2003

Abstract

We used a general-purpose data mining tool to determine whether any 'golden nuggets' could be found in the web access logs of a large academic web site. Our goal was to apply general-purpose data mining algorithms to characterise visitors to the site and to distinguish between groups of them. We used two web access logs, one from 2001 and one from 2003. We extracted four different feature sets from the logs and applied algorithms for classification (1R, J48/C4.5), clustering (EM), association finding (Apriori) and feature selection (correlation-based subset evaluation with best-first search). We discovered several nuggets. The most significant is a major difference between visitors from within Australia and visitors from outside Australia: the latter generally arrive via search engines and are interested in information about postgraduate courses.
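
The pipeline described in the abstract (extract features from log entries, then learn a rule that separates visitor groups) can be illustrated with a minimal sketch. It is not the authors' tool or feature set: the Combined Log Format, the three features, the search-engine list, the ".au" hostname test for Australian visitors, and the four sample log lines are all assumptions made up for illustration. The classifier is a plain reimplementation of the 1R idea (one rule on the single feature with the fewest training errors).

```python
import re
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Assumed input: Apache Combined Log Format
# host ident user [time] "request" status bytes "referrer" "agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)

# Hypothetical list of search-engine domains, for illustration only.
SEARCH_ENGINES = ("google.", "yahoo.", "altavista.")


def extract_features(line):
    """Turn one log line into a small feature dict, or None if unparseable."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    ref_host = urlparse(m["referrer"]).netloc.lower()
    return {
        "via_search": any(s in ref_host for s in SEARCH_ENGINES),
        "postgrad_page": "postgrad" in m["path"].lower(),
        # Class label (assumption): a resolved hostname ending in ".au"
        # is treated as a visitor from within Australia.
        "australian": m["host"].lower().endswith(".au"),
    }


def one_r(rows, label):
    """1R: for each feature, map every value to its majority class and
    keep the feature whose rule makes the fewest training errors."""
    best = None
    for feat in rows[0]:
        if feat == label:
            continue
        counts = defaultdict(Counter)
        for r in rows:
            counts[r[feat]][r[label]] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        errors = sum(r[label] != rule[r[feat]] for r in rows)
        if best is None or errors < best[2]:
            best = (feat, rule, errors)
    return best  # (feature name, value -> class rule, training errors)


# Made-up sample entries; not taken from the logs used in the paper.
SAMPLE = [
    'proxy.example.com.au - - [01/Mar/2003:10:00:00 +1000] "GET /index.html HTTP/1.0" 200 1043 "-" "Mozilla/4.0"',
    'host1.example.edu.au - - [01/Mar/2003:10:01:00 +1000] "GET /undergrad/courses.html HTTP/1.0" 200 2048 "http://www.example.edu.au/" "Mozilla/4.0"',
    'dialup.example.net - - [01/Mar/2003:10:02:00 +1000] "GET /postgrad/msc.html HTTP/1.0" 200 5120 "http://www.google.com/search?q=msc" "Mozilla/4.0"',
    'cache.example.org - - [01/Mar/2003:10:03:00 +1000] "GET /postgrad/phd.html HTTP/1.0" 200 4096 "http://search.yahoo.com/search?p=phd" "Mozilla/4.0"',
]

rows = [f for f in map(extract_features, SAMPLE) if f is not None]
print(one_r(rows, "australian"))
```

On real logs the hostname test would miss visitors whose addresses were never resolved or whose domains carry no country code, so the label would need a more reliable source such as an IP geolocation lookup.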