Results for Web search queries are ranked using heuristics that typically analyze global link topology, user behavior, and content relevance. We point to a particular inefficiency of such methods: information redundancy. For queries whose objective is to learn about a subject, modern search engines return relatively unsatisfactory results because they assess how well each page covers the query individually, rather than how well a set of pages covers it as a whole. We address this problem using essential pages. If $\mathbb{S}_Q$ denotes the total knowledge that exists on the Web about a given query $Q$, we want to build a search engine that returns a set of essential pages $E_Q$ that maximizes the information covered over $\mathbb{S}_Q$. We present a preliminary prototype that optimizes the selection of essential pages, draw some informal comparisons with existing search engines, and evaluate the prototype in a blind-test user study.
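To make the set-level objective concrete, the following is a minimal sketch and not the prototype described above. It assumes, purely for illustration, that $\mathbb{S}_Q$ can be approximated by the union of discrete "knowledge units" (for example, salient terms) found on the candidate pages, and that $E_Q$ is chosen with a standard greedy maximum-coverage heuristic. The names `select_essential_pages` and `coverage`, and the toy data, are hypothetical.

```python
from typing import Dict, List, Set


def select_essential_pages(pages: Dict[str, Set[str]], k: int) -> List[str]:
    """Greedily pick up to k pages whose combined units cover the most of S_Q.

    Assumption: each page is represented by a set of knowledge units, and
    S_Q is approximated by the union of all such sets.
    """
    covered: Set[str] = set()
    selected: List[str] = []
    for _ in range(k):
        best_page = None
        best_gain = 0
        # Iterate in sorted order so ties are broken deterministically.
        for url in sorted(pages):
            if url in selected:
                continue
            # Marginal gain: units this page adds beyond what is already covered.
            gain = len(pages[url] - covered)
            if gain > best_gain:
                best_page, best_gain = url, gain
        if best_page is None:
            break  # no remaining page adds new information
        selected.append(best_page)
        covered |= pages[best_page]
    return selected


def coverage(pages: Dict[str, Set[str]], selected: List[str]) -> float:
    """Fraction of the approximated S_Q covered by the selected pages."""
    universe = set().union(*pages.values())
    if not universe:
        return 0.0
    covered = set().union(*(pages[url] for url in selected))
    return len(covered) / len(universe)


# Toy candidates for one query (hypothetical data).
candidates = {
    "a": {"x", "y", "z"},
    "b": {"x", "y"},
    "c": {"w"},
}
essential = select_essential_pages(candidates, k=2)
```

On this toy data, scoring each page individually would favor "a" and "b", which overlap heavily, whereas the set-level selection picks "a" and then "c", because "b" adds nothing once "a" is chosen. This is the redundancy effect the abstract points to.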