Analyzing clickstreams using subsessions
Proceedings of the 3rd ACM international workshop on Data warehousing and OLAP
A fine grained heuristic to capture web navigation patterns
ACM SIGKDD Explorations Newsletter
Introduction to Algorithms
Statistical Language Learning
A Hybrid Approach to Web Usage Mining
DaWaK 2000 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
Web usage mining: discovery and applications of usage patterns from Web data
ACM SIGKDD Explorations Newsletter
Web Mining: Information and Pattern Discovery on the World Wide Web
ICTAI '97 Proceedings of the 9th International Conference on Tools with Artificial Intelligence
FS-Miner: efficient and incremental mining of frequent sequence patterns in web logs
Proceedings of the 6th annual ACM international workshop on Web information and data management
Evaluating Variable-Length Markov Chain Models for Analysis of User Web Navigation Sessions
IEEE Transactions on Knowledge and Data Engineering
A framework of combining Markov model with association rules for predicting web page accesses
AusDM '06 Proceedings of the fifth Australasian conference on Data mining and analystics - Volume 61
Recsplorer: recommendation algorithms based on precedence mining
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
An integrated model for next page access prediction
International Journal of Knowledge and Web Intelligence
A novel prediction model based on hierarchical characteristic of web site
Expert Systems with Applications: An International Journal
Improved usage model for web application reliability testing
ICTSS'11 Proceedings of the 23rd IFIP WG 6.1 international conference on Testing software and systems
Web usage mining concerns the discovery of common browsing patterns, i.e., pages requested in sequence, from web logs. To cope with the enormous amounts of data, several aggregated structures based on statistical models of web surfing have appeared, e.g., the Hypertext Probabilistic Grammar (HPG) model [2]. These techniques typically rely on the Markov assumption with history depth n, i.e., it is assumed that the next requested page depends only on the last n pages visited. This assumption is not always valid, so false browsing patterns may be discovered. However, to our knowledge there has been no systematic study of the validity of the Markov assumption with respect to web usage mining and the resulting quality of the mined browsing patterns.

In this paper we systematically investigate the quality of browsing patterns mined from structures based on the Markov assumption. Formal measures of quality, based on the closeness of the mined patterns to the true traversal patterns, are defined, and an extensive experimental evaluation is performed on two substantial real-world data sets. The results indicate that a large number of rules must be considered to achieve high quality, that long rules are generally more distorted than shorter rules, and that the model yields knowledge of higher quality when applied to more random usage patterns. Thus we conclude that Markov-based structures for web usage mining are best suited for tasks demanding less accuracy, such as pre-fetching, personalization, and targeted ads.
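To illustrate how the Markov assumption with history depth n can produce a false browsing pattern, the following is a minimal sketch in Python. It is not the HPG model and not the paper's quality measure: it uses plain transition counting, and all function names and the toy sessions are hypothetical, chosen only for illustration. The sketch compares the probability a depth-n model assigns to a path with the fraction of sessions that actually follow it.

```python
from collections import Counter, defaultdict


def build_markov_model(sessions, n=1):
    """Count transitions from each length-n history to the next page."""
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(n, len(session)):
            history = tuple(session[i - n:i])
            counts[history][session[i]] += 1
    return counts


def next_page_probability(counts, history, page):
    """Estimate P(page | history) from the transition counts."""
    total = sum(counts[history].values())
    return counts[history][page] / total if total else 0.0


def estimated_path_probability(counts, path, n=1):
    """Probability the depth-n model assigns to `path`, given its first n pages."""
    prob = 1.0
    for i in range(n, len(path)):
        prob *= next_page_probability(counts, tuple(path[i - n:i]), path[i])
    return prob


def true_path_probability(sessions, path, n=1):
    """Fraction of occurrences of the first n pages that continue along `path`."""
    def occurrences(seq):
        seq = tuple(seq)
        k = len(seq)
        return sum(
            1
            for s in sessions
            for i in range(len(s) - k + 1)
            if tuple(s[i:i + k]) == seq
        )

    prefix = occurrences(path[:n])
    return occurrences(path) / prefix if prefix else 0.0


# Toy sessions (invented): users arriving at B from A always go on to C,
# while users arriving at B from D always go on to E.
sessions = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["D", "B", "E"],
    ["D", "B", "E"],
]
path = ["A", "B", "E"]  # never occurs in the sessions

depth1_estimate = estimated_path_probability(build_markov_model(sessions, 1), path, 1)
depth2_estimate = estimated_path_probability(build_markov_model(sessions, 2), path, 2)
actual = true_path_probability(sessions, path, 1)
```

With history depth 1 the model only remembers that the user is on B, so it assigns the path A, B, E a non-zero probability although no session contains it. With history depth 2 the model distinguishes arrivals from A and from D, and the false pattern disappears in this toy case, at the cost of a larger structure.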