Web search increasingly exploits named entities such as persons, places, businesses, addresses, and dates, and entity ranking is of current interest at INEX and TREC. Numerical quantities are an important class of entities, especially in queries about prices and features of products, services, and travel. We introduce Quantity Consensus Queries (QCQs), where each answer is a tight quantity interval distilled from evidence of relevance in thousands of snippets. Entity search and factoid question answering have benefited from aggregating evidence across multiple promising snippets, but those techniques do not readily apply to quantities. We propose two new algorithms that learn to aggregate information from multiple snippets. We show that typical signals used in entity ranking, such as the rarity of query words and their lexical proximity to candidate quantities, are very noisy. Our algorithms learn to score and rank quantity intervals directly, combining snippet quantity and snippet text information. We report experiments on hundreds of QCQs with ground truth taken from TREC QA, Wikipedia infoboxes, and other sources, yielding tens of thousands of candidate snippets and quantities. Our algorithms achieve about 20% better MAP and NDCG than the best-known collective rankers, and are 35% better than scoring snippets independently of one another.
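To make the aggregation idea concrete, the following is a minimal, hypothetical Python sketch of scoring quantity intervals by pooling evidence from many snippets. The Snippet fields, the fixed-width interval construction, and the simple sum-of-proximity score are illustrative assumptions for exposition only, not the paper's learned ranking model, which combines snippet-text and quantity features with trained weights.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative sketch only: group candidate quantities extracted from snippets
# into intervals and score each interval by aggregating per-snippet evidence.
# Interval widths, the "proximity" feature, and the scoring rule are assumed
# here for illustration; they are not the paper's algorithms.

@dataclass
class Snippet:
    text: str
    quantity: float   # candidate quantity extracted from the snippet
    proximity: float  # e.g. closeness of query words to the quantity (0..1)

def candidate_intervals(quantities: List[float],
                        width_frac: float = 0.05) -> List[Tuple[float, float]]:
    """Build candidate intervals centered on each observed quantity.
    width_frac is a hypothetical relative half-width; a real system would
    learn or search over interval boundaries rather than fix them."""
    return [(q * (1 - width_frac), q * (1 + width_frac))
            for q in sorted(set(quantities))]

def interval_score(interval: Tuple[float, float], snippets: List[Snippet]) -> float:
    """Aggregate evidence from every snippet whose quantity falls inside the
    interval (a plain sum of proximity scores stands in for a learned
    combination of snippet-text and quantity features)."""
    lo, hi = interval
    return sum(s.proximity for s in snippets if lo <= s.quantity <= hi)

def rank_intervals(snippets: List[Snippet]) -> List[Tuple[Tuple[float, float], float]]:
    """Score every candidate interval and return them best-first."""
    intervals = candidate_intervals([s.quantity for s in snippets])
    scored = [(iv, interval_score(iv, snippets)) for iv in intervals]
    return sorted(scored, key=lambda x: x[1], reverse=True)

if __name__ == "__main__":
    snippets = [
        Snippet("battery life of about 10 hours", 10.0, 0.9),
        Snippet("lasts roughly 9.5 hours on a charge", 9.5, 0.8),
        Snippet("weighs 1.3 kg", 1.3, 0.2),  # off-topic quantity, weak evidence
    ]
    for interval, score in rank_intervals(snippets):
        print(f"[{interval[0]:.2f}, {interval[1]:.2f}] -> {score:.2f}")
```

In this toy example the interval around 10 hours wins because two mutually consistent snippets reinforce it, whereas scoring snippets independently would treat each candidate quantity in isolation; that contrast is the intuition behind collective interval ranking, though the actual feature set and learning procedure are described in the paper.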