We present a novel session identification method based on statistical language modeling. Unlike standard timeout methods, which use fixed time thresholds for session identification, we use an information-theoretic approach that yields more robust results for identifying session boundaries. We evaluate our new approach by learning interesting association rules from the segmented session files. We then compare the performance of our approach to three standard session identification methods (the standard timeout method, the reference length method, and the maximal forward reference method) and find that our statistical language modeling approach generally yields superior results. However, as with every method, the performance of our technique varies with changing parameter settings. Therefore, we also analyze the influence of the two key factors in our language-modeling-based approach: the choice of smoothing technique and the language model order. We find that all standard smoothing techniques, save one, perform well, and that performance is robust to language model order.
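To make the contrast between a fixed time threshold and a language-model criterion concrete, the following is a minimal sketch and not the authors' implementation. It assumes a log given as (timestamp in seconds, page) pairs, a 30-minute default timeout, a bigram model with add-one smoothing, and a boundary rule that starts a new session whenever the surprisal of the next request exceeds a threshold. The names `timeout_sessions`, `BigramModel`, and `lm_sessions`, the smoothing choice, and the threshold rule are all illustrative assumptions; the abstract does not specify the exact boundary criterion.

```python
import math
from collections import Counter


def timeout_sessions(requests, timeout=1800):
    """Baseline: start a new session when the gap between consecutive
    requests exceeds `timeout` seconds (1800 s is an assumed default)."""
    sessions = []
    last_time = None
    for ts, page in requests:
        if last_time is None or ts - last_time > timeout:
            sessions.append([])
        sessions[-1].append(page)
        last_time = ts
    return sessions


class BigramModel:
    """Bigram model over page identifiers with add-one smoothing
    (an assumed, deliberately simple smoothing technique)."""

    def __init__(self, training_sequences):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        self.vocab = set()
        for seq in training_sequences:
            self.vocab.update(seq)
            for prev, cur in zip(seq, seq[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def surprisal(self, prev, cur):
        """Return -log2 P(cur | prev) in bits."""
        v = len(self.vocab) + 1  # one extra slot for unseen pages
        p = (self.bigram_counts[(prev, cur)] + 1) / (self.context_counts[prev] + v)
        return -math.log2(p)


def lm_sessions(pages, model, threshold):
    """Start a new session when the next request is more surprising
    than `threshold` bits (hypothetical boundary rule)."""
    sessions = [[pages[0]]] if pages else []
    for prev, cur in zip(pages, pages[1:]):
        if model.surprisal(prev, cur) > threshold:
            sessions.append([])
        sessions[-1].append(cur)
    return sessions
```

In this sketch the timeout baseline looks only at timestamps, whereas the language-model variant looks only at the request sequence, so it can separate two unrelated activities that happen close together in time. A higher-order model or a different smoothing technique, the two factors the abstract analyzes, would replace `BigramModel.surprisal` without changing the segmentation loop.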