Mining search and browse logs for web search: A Survey

Authors:
Daxin Jiang;Jian Pei;Hang Li
Affiliations:
Microsoft Corporation;Simon Fraser University;Huawei Technologies
Venue:
ACM Transactions on Intelligent Systems and Technology (TIST) - Survey papers, special sections on the semantic adaptive social web, intelligent systems for health informatics, regular papers
Year:
2013

Abstract

Huge amounts of search log data have been accumulated at Web search engines. Currently, a popular Web search engine may receive billions of queries and collect terabytes of records about user search behavior daily. Besides search log data, huge amounts of browse log data have also been collected through client-side browser plugins. Such massive amounts of search and browse log data provide great opportunities for mining the wisdom of crowds and improving Web search. At the same time, designing effective and efficient methods to clean, process, and model log data also presents great challenges. In this survey, we focus on mining search and browse log data for Web search. We start with an introduction to search and browse log data and an overview of frequently used data summarizations in log mining. We then elaborate on how log mining applications enhance the five major components of a search engine, namely, query understanding, document understanding, document ranking, user understanding, and monitoring and feedback. For each aspect, we survey the major tasks, fundamental principles, and state-of-the-art methods.