Mining maximum frequent access patterns in web logs based on unique labeled tree

  • Authors:
  • Ling Zhang; Jian-ping Yin; Yu-bin Zhan

  • Affiliations:
  • School of Computer, National University of Defense Technology, Changsha, Hunan, China; School of Computer, National University of Defense Technology, Changsha, Hunan, China; School of Computer, National University of Defense Technology, Changsha, Hunan, China

  • Venue:
  • WISE'06 Proceedings of the 7th international conference on Web Information Systems
  • Year:
  • 2006

Abstract

Discovering users' frequent access patterns is one of the research hotspots in web log mining. A novel Apriori-based algorithm named s-Tree is proposed for mining maximum frequent access patterns. The main contributions of the s-Tree algorithm are as follows. First, a unique labeled tree is used to represent a user session, which enables us to mine the maximum forward reference transactions and the users' preferred access paths. Second, an improved method of calculating support, based on the impact factor of content pages, is introduced; it helps discover more important and interesting patterns than conventional methods. Third, two special strategies are adopted to reduce the overhead of joining frequent patterns. Finally, experiments show that the s-Tree algorithm is scalable and more efficient than previous graph-based structure pattern mining algorithms such as AGM and FSG.
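
To make the first contribution more concrete, the following is a minimal Python sketch of how a click sequence from one session might be folded into a tree in which every page label occurs exactly once, and how maximal forward references could then be read off as root-to-leaf paths. It is not the s-Tree algorithm itself, whose details the abstract does not give. The function names `session_to_tree` and `maximal_forward_references` are hypothetical, and the sketch assumes that any revisit of an already seen page counts as a backward move.

```python
def session_to_tree(clicks):
    """Fold a click sequence into a tree with unique page labels.

    Assumption: a page seen for the first time becomes a child of the
    page visited immediately before it; a revisit only moves the cursor.
    Returns (root, children), where children maps page -> list of pages.
    """
    if not clicks:
        raise ValueError("empty session")
    root = clicks[0]
    children = {root: []}
    current = root
    for page in clicks[1:]:
        if page not in children:
            # First visit: attach the page under the current page.
            children[page] = []
            children[current].append(page)
        # Forward or backward, the cursor moves to the clicked page.
        current = page
    return root, children


def maximal_forward_references(root, children):
    """Return every root-to-leaf path of the session tree, left to right."""
    paths = []
    stack = [(root, [root])]
    while stack:
        node, path = stack.pop()
        if not children[node]:
            paths.append(path)  # a leaf ends a maximal forward reference
        else:
            # Push in reverse so the leftmost child is expanded first.
            for child in reversed(children[node]):
                stack.append((child, path + [child]))
    return paths


if __name__ == "__main__":
    session = ["A", "B", "C", "B", "D", "A", "E"]
    root, children = session_to_tree(session)
    for path in maximal_forward_references(root, children):
        print(" -> ".join(path))
```

For the sample session A, B, C, B, D, A, E, the tree has A as root with children B and E, and B has children C and D, so the paths are A-B-C, A-B-D, and A-E. A support measure weighted by the impact factor of content pages would need page-level information that this sketch does not model.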