Learning to rank for quantity consensus queries

Authors:
Somnath Banerjee;Soumen Chakrabarti;Ganesh Ramakrishnan
Affiliations:
HP Labs India, Bangalore, India;IIT Bombay, Mumbai, India;IIT Bombay, Mumbai, India
Venue:
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Year:
2009

Abstract

Web search is increasingly exploiting named entities like persons, places, businesses, addresses and dates. Entity ranking is also of current interest at INEX and TREC. Numerical quantities are an important class of entities, especially in queries about prices and features related to products, services and travel. We introduce Quantity Consensus Queries (QCQs), where each answer is a tight quantity interval distilled from evidence of relevance in thousands of snippets. Entity search and factoid question answering have benefited from aggregating evidence from multiple promising snippets, but these techniques do not readily apply to quantities. Here we propose two new algorithms that learn to aggregate information from multiple snippets. We show that typical signals used in entity ranking, like rarity of query words and their lexical proximity to candidate quantities, are very noisy. Our algorithms learn to score and rank quantity intervals directly, combining snippet quantity and snippet text information. We report on experiments using hundreds of QCQs with ground truth taken from TREC QA, Wikipedia Infoboxes, and other sources, leading to tens of thousands of candidate snippets and quantities. Our algorithms yield about 20% better MAP and NDCG compared to the best-known collective rankers, and are 35% better than scoring snippets independently of each other.
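
The idea of answering with a tight quantity interval backed by many snippets can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes each snippet has already been reduced to a positive quantity and a precomputed text relevance score, and it simply sums the scores of snippets falling in each candidate interval rather than learning a scoring function. The names `rank_intervals`, `max_rel_width` and `top_k` are hypothetical.

```python
from typing import List, Tuple

# (quantity mentioned in the snippet, relevance score of the snippet text)
# Both values are assumed to be given; the scores here are illustrative only.
Snippet = Tuple[float, float]


def rank_intervals(
    snippets: List[Snippet],
    max_rel_width: float = 0.1,
    top_k: int = 3,
) -> List[Tuple[float, float, float]]:
    """Return up to top_k (score, low, high) intervals, best first.

    Candidate intervals have snippet quantities as endpoints and a width of
    at most max_rel_width * high, so only tight intervals are considered.
    Assumes positive quantities and 0 <= max_rel_width < 1.
    """
    points = sorted(snippets)  # sort by quantity
    candidates = []
    for i in range(len(points)):
        total = 0.0
        for j in range(i, len(points)):
            low, high = points[i][0], points[j][0]
            if high - low > max_rel_width * abs(high):
                break  # wider intervals starting at i are also too wide
            total += points[j][1]  # aggregate evidence inside [low, high]
            candidates.append((total, low, high))
    # Higher aggregate score first; break ties in favor of narrower intervals.
    candidates.sort(key=lambda c: (-c[0], c[2] - c[1]))
    return candidates[:top_k]


if __name__ == "__main__":
    toy = [(1.0, 0.2), (100.0, 0.5), (101.0, 0.6), (103.0, 0.4), (500.0, 0.9)]
    for score, low, high in rank_intervals(toy):
        print(f"[{low}, {high}] score={score:.2f}")
```

In the toy data, the snippet mentioning 500 has the highest individual score, yet the interval from 100 to 103 ranks first because three moderately scored snippets agree on it. That is the contrast the abstract draws between collective ranking and scoring snippets independently.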