We present a statistical method that uses n-gram language models to identify session boundaries in a large collection of Livelink log data. The identified sessions are then used for association rule learning. Unlike the traditional ad hoc timeout method, which uses fixed time thresholds for session identification, our method uses an information-theoretic approach that provides a natural technique for dynamic session identification. The effectiveness of our approach is evaluated with respect to four different interestingness measures. We obtain a significant improvement in each interestingness measure, ranging from 26.6% to 39% on average over the best results obtained with standard timeout methods.
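To make the idea of language-model-based boundary detection concrete, the following is a minimal sketch and not the method evaluated in the paper. The abstract does not specify the n-gram order, the smoothing scheme, or the exact information-theoretic criterion, so the sketch assumes a bigram model with add-one smoothing and places a boundary wherever the surprisal (negative log probability) of a request given its predecessor exceeds a threshold. The function names, the request labels, and the threshold value are all hypothetical.

```python
import math
from collections import Counter


def train_bigram(sequences):
    """Count contexts and bigrams over training sequences of request labels."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for seq in sequences:
        vocab.update(seq)
        for prev, cur in zip(seq, seq[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def surprisal(prev, cur, model):
    """Return -log2 P(cur | prev) under add-one smoothing (an assumption)."""
    context_counts, bigram_counts, vocab = model
    slots = len(vocab) + 1  # one extra slot reserved for unseen requests
    prob = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + slots)
    return -math.log2(prob)


def split_sessions(events, model, threshold):
    """Start a new session whenever a transition is more surprising than threshold."""
    if not events:
        return []
    sessions = [[events[0]]]
    for prev, cur in zip(events, events[1:]):
        if surprisal(prev, cur, model) > threshold:
            sessions.append([cur])  # unlikely transition: treat as a boundary
        else:
            sessions[-1].append(cur)
    return sessions


# Hypothetical training data: five identical, already-segmented sessions.
training = [["login", "browse", "view", "logout"]] * 5
model = train_bigram(training)

# An unsegmented request stream; the threshold of 2.0 bits is illustrative only.
stream = ["login", "browse", "view", "logout", "login", "browse"]
sessions = split_sessions(stream, model, threshold=2.0)
```

In this sketch the boundary depends on how predictable each request is given what preceded it, not on the time elapsed between requests, which is the contrast with fixed timeout thresholds that the abstract draws.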