Mining maximum frequent access patterns in web logs based on unique labeled tree

Authors:
Ling Zhang; Jian-ping Yin; Yu-bin Zhan
Affiliation:
School of Computer, National University of Defense Technology, Changsha, Hunan, China (all three authors)
Venue:
WISE'06 Proceedings of the 7th international conference on Web Information Systems
Year:
2006

References:
Mining frequent patterns without candidate generation. SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Web mining research: a survey. ACM SIGKDD Explorations Newsletter
Frequent Subgraph Discovery. ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data. PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
Fast Algorithms for Mining Association Rules in Large Databases. VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Mining Access Patterns Efficiently from Web Logs. PAKDD '00 Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications
Web usage mining: discovery and applications of usage patterns from Web data. ACM SIGKDD Explorations Newsletter
Chopper: efficient algorithm for tree mining. Journal of Computer Science and Technology
Mining Closed and Maximal Frequent Subtrees from Databases of Labeled Rooted Trees. IEEE Transactions on Knowledge and Data Engineering
Mining Web Log Sequential Patterns with Position Coded Pre-Order Linked WAP-Tree. Data Mining and Knowledge Discovery

Abstract

Discovering users' frequent access patterns is one of the active research topics in web log mining. This paper proposes s-Tree, a novel Apriori-based algorithm for mining maximum frequent access patterns. The main contributions of the s-Tree algorithm are as follows. First, a unique labeled tree is used to represent each user session, which makes it possible to mine both the maximum forward reference transactions and the users' preferred access paths. Second, an improved method of calculating support, based on the impact factor of content pages, is introduced; it helps discover patterns that are more important and interesting than those found by conventional support counting. Third, two special strategies are adopted to reduce the overhead of joining frequent patterns. Finally, experiments show that the s-Tree algorithm is scalable and more efficient than previous graph-based structure pattern mining algorithms such as AGM and FSG.
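
The abstract does not spell out how the unique labeled tree is built, so the following is only a minimal sketch under stated assumptions, not the authors' s-Tree implementation. It assumes a session is an ordered list of page labels, that a request for a page already in the tree is a backward reference (the cursor moves to the existing node and no new node is created), and that the maximum forward references are the root-to-leaf paths of the resulting tree. The names `Node`, `build_session_tree`, and `maximal_forward_references` are hypothetical.

```python
from typing import Dict, List, Optional


class Node:
    """One page of a session; each label occurs at most once in the tree."""

    def __init__(self, label: str, parent: Optional["Node"] = None) -> None:
        self.label = label
        self.parent = parent
        self.children: List["Node"] = []


def build_session_tree(session: List[str]) -> Optional[Node]:
    """Build a tree in which every page label appears exactly once.

    Assumption: revisiting a known page is a backward reference, so the
    cursor jumps to the existing node instead of creating a duplicate.
    """
    if not session:
        return None
    root = Node(session[0])
    nodes: Dict[str, Node] = {root.label: root}
    current = root
    for page in session[1:]:
        if page in nodes:
            current = nodes[page]          # backward reference
        else:
            child = Node(page, current)    # forward reference
            current.children.append(child)
            nodes[page] = child
            current = child
    return root


def maximal_forward_references(root: Optional[Node]) -> List[List[str]]:
    """Return all root-to-leaf paths, in left-to-right order."""
    if root is None:
        return []
    paths: List[List[str]] = []
    stack = [(root, [root.label])]
    while stack:
        node, path = stack.pop()
        if not node.children:
            paths.append(path)
        # Push children reversed so the leftmost child is visited first.
        for child in reversed(node.children):
            stack.append((child, path + [child.label]))
    return paths


if __name__ == "__main__":
    session = list("ABCDCBEGHGWAOUOV")
    for path in maximal_forward_references(build_session_tree(session)):
        print("".join(path))
    # prints ABCD, ABEGH, ABEGW, AOU, AOV (one per line)
```

Keeping a dictionary from label to node is what enforces label uniqueness in this sketch; the impact-factor-weighted support and the join strategies mentioned in the abstract are not modeled here because the abstract gives no details on them.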