BBM: Bayesian browsing model from petabyte-scale data

Authors:
Chao Liu;Fan Guo;Christos Faloutsos
Affiliations:
Microsoft Research, Redmond, WA, USA;Carnegie Mellon University, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2009


Abstract

Given a quarter of a petabyte of click-log data, how can we estimate the relevance of each URL for a given query? In this paper, we propose the Bayesian Browsing Model (BBM), a new modeling technique with the following advantages: (a) it does exact inference; (b) it is single-pass and parallelizable; (c) it is effective. We present two sets of experiments to test model effectiveness and efficiency. On the first set, with over 50 million search instances of 1.1 million distinct queries, BBM outperforms the state-of-the-art competitor by 29.2% in log-likelihood while being 57 times faster. On the second click-log set, spanning a quarter of a petabyte of data, we showcase the scalability of BBM: we implemented it on a commercial MapReduce cluster, and it took only 3 hours to compute the relevance for 1.15 billion distinct query-URL pairs.
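
The "single-pass and parallelizable" property can be illustrated with a minimal sketch. It is not the BBM inference itself, which the abstract does not spell out. It only assumes that the statistics needed per query-URL pair can be accumulated as sums, so that partial results from separate log partitions can be merged in any order, as in a MapReduce job. The record layout `(query, url, clicked)` and the function names `map_partition`, `merge`, and `aggregate` are hypothetical.

```python
from functools import reduce

def map_partition(records):
    """One pass over a log partition: (impressions, clicks) per (query, url)."""
    counts = {}
    for query, url, clicked in records:  # hypothetical record layout
        imp, clk = counts.get((query, url), (0, 0))
        counts[(query, url)] = (imp + 1, clk + int(clicked))
    return counts

def merge(a, b):
    """Combine two partial results; addition makes this associative and commutative."""
    out = dict(a)
    for key, (imp, clk) in b.items():
        imp0, clk0 = out.get(key, (0, 0))
        out[key] = (imp0 + imp, clk0 + clk)
    return out

def aggregate(partitions):
    """Map every partition independently, then reduce the partial counts."""
    return reduce(merge, (map_partition(p) for p in partitions), {})

# Toy log, invented for illustration only.
log = [
    ("kdd 2009", "kdd.org", True),
    ("kdd 2009", "kdd.org", False),
    ("kdd 2009", "example.com", False),
    ("kdd 2009", "kdd.org", True),
]

# Splitting the log differently does not change the result.
assert aggregate([log]) == aggregate([log[:1], log[1:3], log[3:]])
```

Because each record is read once and the merge step is order-independent, the work can be spread across as many workers as there are partitions. A real click model would accumulate richer per-pair statistics than these two counters.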