In this paper, we propose Smart-Miner, a novel framework for web usage mining that uses link information to produce accurate user sessions and frequent navigation patterns. Unlike the simple session concepts of time-based and navigation-based approaches, where a session is a sequence of web pages requested from the server or viewed in the browser, a Smart-Miner session is a set of paths traversed in the web graph that corresponds to a user's navigation among web pages. We model session construction as a new graph problem and use a new algorithm, Smart-SRA, to solve it efficiently. For the pattern discovery phase, we have developed an efficient version of the Apriori-All technique that uses the structure of the web graph to improve performance. In experiments on both real and simulated data, Smart-Miner produces web usage patterns that are at least 30% more accurate than those of other approaches, including previous session construction methods. We have also studied the effect of having referrer information in the web server logs and show that different versions of Smart-SRA produce similar results. As a further contribution, we have implemented a distributed version of the Smart-Miner framework using the Map/Reduce paradigm. We conclude that our scalable framework can efficiently process terabytes of web server logs belonging to multiple web sites.
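To make the idea of a session as a set of paths in the web graph more concrete, the following is a minimal illustrative sketch. It is not Smart-SRA, whose details are not given here; the function name `link_paths`, the greedy attachment rule, and the toy link structure are all assumptions made for illustration, and timing constraints are ignored entirely.

```python
from typing import Dict, List, Set


def link_paths(requests: List[str], links: Dict[str, Set[str]]) -> List[List[str]]:
    """Group an ordered list of page requests into hyperlink-connected paths.

    Hypothetical heuristic (an assumption, not the paper's algorithm):
    each page is attached to the most recently visited page that links to it.
    If that page is in the middle of a path, a new path branches from it.
    """
    paths: List[List[str]] = []
    for page in requests:
        placed = False
        # Prefer the most recently created path.
        for path in reversed(paths):
            # Scan the path backwards for the latest page linking to `page`.
            for i in range(len(path) - 1, -1, -1):
                if page in links.get(path[i], set()):
                    if i == len(path) - 1:
                        path.append(page)  # extend the path at its end
                    else:
                        paths.append(path[: i + 1] + [page])  # branch off
                    placed = True
                    break
            if placed:
                break
        if not placed:
            paths.append([page])  # no referring page found: start a new path
    return paths


# Toy web graph: A links to B and C, B links to D.
toy_links = {"A": {"B", "C"}, "B": {"D"}}
print(link_paths(["A", "B", "C", "D"], toy_links))
# prints [['A', 'B', 'D'], ['A', 'C']]
```

A purely sequential view would report the single session A, B, C, D, even though no link leads from B to C; the path-based view instead separates the two navigations A, B, D and A, C.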