A database session is a sequence of requests presented to the database system by a user or an application to achieve a certain task. Session identification is an important step in discovering useful patterns from database trace logs. The discovered patterns can be used to improve the performance of database systems by prefetching predicted queries, rewriting the current query, or conducting effective cache replacement.

In this paper, we present an application of a new session identification method based on statistical language modeling to database trace logs. The application reveals several problems with the language-modeling-based method: how to select values for the parameters of the language model, how to evaluate the accuracy of the session identification result, and how to learn a language model without well-labeled training data. All of these issues are important to the successful application of the method to session identification. We propose solutions to these open issues. In particular, we propose new methods for determining an entropy threshold and the order of the language model, and we present new performance measures to better evaluate the accuracy of the identified sessions. Furthermore, we introduce three types of learning methods, namely learning from labeled data, learning from semi-labeled data, and learning from unlabeled data, to learn language models from different types of training data. Finally, we report experimental results that show the effectiveness of the language-model-based method for identifying sessions from the trace logs of an OLTP database application and the TPC-C Benchmark.
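To make the idea of an entropy threshold concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes a bigram model (order 2) with add-one smoothing trained on labeled sessions, and it starts a new session whenever the surprisal of the next request, given the previous one, exceeds a fixed threshold. The function names, the smoothing scheme, the start marker `<s>`, and the threshold value are all illustrative assumptions.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical session-start marker


def train_bigram(sessions):
    """Count contexts and bigrams over labeled training sessions."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for session in sessions:
        tokens = [START] + list(session)
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return contexts, bigrams, vocab


def surprisal(model, prev, cur):
    """Return -log2 P(cur | prev) under add-one smoothing (an assumption)."""
    contexts, bigrams, vocab = model
    v = len(vocab) + 1  # one extra slot reserved for unseen requests
    p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + v)
    return -math.log2(p)


def split_sessions(model, requests, threshold):
    """Cut the request stream wherever the next request is too surprising."""
    sessions, current = [], []
    prev = START
    for req in requests:
        if current and surprisal(model, prev, req) > threshold:
            sessions.append(current)  # close the current session
            current = []
        current.append(req)
        prev = req
    if current:
        sessions.append(current)
    return sessions


# Toy trace: each letter stands for one request template.
model = train_bigram([["A", "B", "C"]] * 3)
print(split_sessions(model, list("ABCABC"), threshold=2.0))
```

In this toy trace, transitions seen in training (such as A followed by B) have low surprisal, while the unseen transition from C back to A exceeds the threshold and marks a session boundary. Choosing the threshold and the model order is exactly the parameter-selection problem the abstract describes, so the values here should be read as placeholders.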