Incremental web-site boundary detection using random walks

  • Authors:
  • Ayesh Alshukri;Frans Coenen;Michele Zito

  • Affiliations:
  • Department of Computer Science, University of Liverpool, Liverpool, UK;Department of Computer Science, University of Liverpool, Liverpool, UK;Department of Computer Science, University of Liverpool, Liverpool, UK

  • Venue:
  • MLDM'11 Proceedings of the 7th international conference on Machine learning and data mining in pattern recognition
  • Year:
  • 2011

Abstract

The paper describes variations of the classical k-means clustering algorithm that can be used effectively to address the so-called Web-site Boundary Detection (WBD) problem. The suggested advantages of these techniques are that they can quickly identify most of the pages belonging to a web-site and, in the long run, return a solution with accuracy comparable to (if not better than) that of other clustering methods. We analyze our techniques on artificial clones of the web generated using a well-known preferential attachment method.
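
To make the evaluation setting more concrete, the following is a minimal sketch of how an artificial web-like graph might be grown by preferential attachment. The abstract does not name the specific generator, so this assumes a Barabási-Albert-style process on an undirected graph; the function name `preferential_attachment_graph` and the parameters `n`, `m`, and `seed` are hypothetical and not taken from the paper.

```python
import random


def preferential_attachment_graph(n, m, seed=0):
    """Grow an undirected graph with n nodes (assumed Barabasi-Albert-style).

    Each new node links to m distinct existing nodes, chosen with
    probability proportional to their current degree.
    """
    if m < 1 or n <= m:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    adjacency = {node: set() for node in range(n)}

    # The first new node (index m) links to the m initial nodes 0..m-1.
    targets = list(range(m))
    # Every node appears in this list once per incident edge, so a uniform
    # draw from it is a degree-proportional draw over nodes.
    endpoints = []

    for source in range(m, n):
        for target in targets:
            adjacency[source].add(target)
            adjacency[target].add(source)
        endpoints.extend(targets)
        endpoints.extend([source] * m)

        # Pick m distinct targets for the next node.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)

    return adjacency


if __name__ == "__main__":
    graph = preferential_attachment_graph(n=200, m=2, seed=1)
    degrees = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
    print("highest degrees:", degrees[:5])
```

In a graph grown this way a few nodes tend to accumulate many links, which is the hub-like structure such synthetic web clones are meant to imitate. Real web graphs are directed, so a closer imitation would track in-links and out-links separately.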