Search engine logs provide highly detailed insight into users' interactions. Hence, they are both extremely useful and sensitive. The datasets publicly available to scholars are, unfortunately, too few, too dated and too small. They are few because search engine companies are reluctant to release such data; they are dated because they were collected in the late 1990s or early 2000s; and they are small because they comprise data for at most one day and just a few hundred thousand users. Even worse, the large query log disclosed by AOL in 2006 caused more harm than good because of a serious privacy flaw. In this paper the author provides an overview of the possible applications of query logs, the privacy concerns researchers must face when working on such datasets, and several ways in which query logs can be easily sanitized. One such measure consists of segmenting the logs into short topical sessions. Therefore, the author offers a comprehensive survey of session detection methods, as well as a thorough description of a new evaluation framework with performance results for each of the different methods. Additionally, a new session detection method is proposed that is simple yet outperforms the others. It is a heuristic technique that relies on a geometric interpretation of both the time gap between consecutive queries and the similarity between them in order to flag a topic shift.
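As a rough illustration of the geometric idea described above, the following Python sketch treats temporal closeness and query similarity as two coordinates in the unit square and flags a topic shift when the resulting point falls inside the unit circle. The abstract does not specify the details, so everything concrete here is an assumption made for illustration: the 24-hour horizon used to normalize the time gap, the character-trigram Jaccard similarity, the unit-circle threshold, and the hypothetical function names `similarity`, `is_topic_shift` and `segment`.

```python
import math


def char_ngrams(text, n=3):
    """Return the set of character n-grams of a whitespace-normalized query."""
    normalized = " ".join(text.lower().split())
    padded = f" {normalized} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b, n=3):
    """Jaccard similarity over character n-grams (an assumed measure)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def is_topic_shift(prev_query, query, gap_seconds, horizon_seconds=24 * 3600):
    """Flag a shift when the (closeness, similarity) point lies inside the unit circle.

    closeness is 1 for simultaneous queries and decays linearly to 0 at the
    assumed horizon; both the horizon and the threshold are illustrative.
    """
    closeness = max(0.0, 1.0 - gap_seconds / horizon_seconds)
    sim = similarity(prev_query, query)
    return math.hypot(closeness, sim) < 1.0


def segment(log):
    """Split a time-ordered list of (timestamp_seconds, query) pairs into sessions."""
    sessions = []
    for timestamp, query in log:
        if sessions:
            prev_timestamp, prev_query = sessions[-1][-1]
            if not is_topic_shift(prev_query, query, timestamp - prev_timestamp):
                sessions[-1].append((timestamp, query))
                continue
        sessions.append([(timestamp, query)])
    return sessions


log = [
    (0, "cheap flights madrid"),
    (60, "cheap flights to madrid"),
    (7200, "python list sort"),
]
sessions = segment(log)
```

Under these assumptions, two queries stay in the same session either when they are issued almost simultaneously or when they are lexically very similar, and intermediate cases are decided by the combined distance from the origin. Segmenting a user's log this way yields short topical sessions of the kind the abstract mentions as a sanitization measure.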