How many pages are there on the Web? 5B? 20B? More? Less? Big bets on clusters in the clouds could be wiped out if a small cache of a few million URLs could capture much of the value. Language modeling techniques are applied to MSN's search logs to estimate entropy. The perplexity is surprisingly small: millions, not billions. Entropy is a powerful tool for sizing challenges and opportunities. How hard is search? How hard are query suggestion mechanisms like auto-complete? How much does personalization help? All of these difficult questions can be answered by estimating entropy from search logs. What is the potential opportunity for personalization? In this paper, we propose a new way to personalize search: personalization with backoff. If we have relevant data for a particular user, we should use it. But if we don't, we back off to larger and larger classes of similar users. As a proof of concept, we use the first few bytes of the IP address to define classes. The coefficients of each backoff class are estimated with an EM algorithm. Ideally, classes would be defined by market segments, demographics, and surrogate variables such as time and geography.
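To make the two ideas above concrete, here is a minimal sketch in Python. It is not the paper's implementation: it assumes the log is a list of hypothetical `(ip, url)` pairs, uses a plug-in (maximum-likelihood) entropy estimate, and treats "personalization with backoff" as a linear interpolation over IP-prefix classes whose weights are fitted by EM on held-out data. All function names (`entropy_bits`, `ip_prefixes`, `train_counts`, `class_probs`, `em_weights`) are illustrative.

```python
import math
from collections import Counter, defaultdict


def entropy_bits(items):
    """Plug-in (maximum-likelihood) entropy estimate in bits."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def ip_prefixes(ip):
    """Backoff classes, most to least specific: 4, 3, 2, 1, 0 leading bytes.

    The empty prefix is the class of all users (assumed IPv4 dotted form).
    """
    parts = ip.split(".")
    return [".".join(parts[:k]) for k in range(4, -1, -1)]


def train_counts(log):
    """Count URLs per class: counts[level][prefix] is a Counter of URLs."""
    counts = [defaultdict(Counter) for _ in range(5)]
    for ip, url in log:
        for level, prefix in enumerate(ip_prefixes(ip)):
            counts[level][prefix][url] += 1
    return counts


def class_probs(counts, ip, url):
    """Maximum-likelihood P(url | class) per level; 0.0 if the class is unseen."""
    probs = []
    for level, prefix in enumerate(ip_prefixes(ip)):
        c = counts[level].get(prefix)
        total = sum(c.values()) if c else 0
        probs.append(c[url] / total if total else 0.0)
    return probs


def em_weights(counts, heldout, iters=50):
    """Fit one interpolation weight per backoff level with EM on held-out pairs."""
    n = 5
    lam = [1.0 / n] * n
    rows = [class_probs(counts, ip, url) for ip, url in heldout]
    rows = [r for r in rows if sum(r) > 0]  # drop events no class can explain
    if not rows:
        return lam
    for _ in range(iters):
        expected = [0.0] * n
        for r in rows:
            mix = sum(l * p for l, p in zip(lam, r))
            for i in range(n):
                # E-step: responsibility of level i for this held-out event
                expected[i] += lam[i] * r[i] / mix
        # M-step: new weight is the average responsibility
        lam = [e / len(rows) for e in expected]
    return lam


if __name__ == "__main__":
    train = [("1.2.3.4", "u1"), ("1.2.3.5", "u2"), ("9.9.9.9", "u3")]
    heldout = [("1.2.3.4", "u1"), ("1.2.3.6", "u2")]
    counts = train_counts(train)
    print("H(URL) in bits:", entropy_bits(url for _, url in train))
    print("backoff weights:", em_weights(counts, heldout))
```

Perplexity follows from the entropy estimate as `2 ** entropy_bits(...)`. Fitting the weights on held-out data rather than on the training log matters here: on the training log the most specific class always looks best, so EM would put nearly all weight on the full IP address.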