Sentence retrieval is the task of retrieving a relevant sentence in response to a query, a question, or a reference sentence. Tasks such as question answering, summarization, novelty detection, and information provenance use a sentence-retrieval module as a preprocessing step, so the performance of these systems depends on the quality of that module. Other tasks, such as information extraction and machine translation, operate on sentences, either using them as training data or as the unit of input or output (or both), and may benefit from sentence retrieval to build a training corpus or as a post-processing step. In this thesis we begin by demonstrating that, because sentences are much smaller than documents, typical document retrieval systems perform significantly worse when retrieving sentences. We propose several solutions to the problem of sentence retrieval, and investigate these solutions in the application areas of sentence retrieval for question answering, novelty detection, and information provenance. The context of a sentence affects its meaning, and we demonstrate that smoothing from the local context of the sentence improves retrieval when the collection to be retrieved from contains many documents of unknown relevance. We show that statistical translation models are appropriate for tasks where the sentence to be retrieved has many terms in common with the query, but still benefits from the addition of related terms and synonyms. We show that queries of very few terms benefit from the translation approach, which incorporates related terms into the query. We show that the family of language modeling approaches, which includes statistical translation models, is not effective for discriminating between sentences that use the same vocabulary to express the same information and sentences that use the same vocabulary to express new information.
Finally, we demonstrate a conditional model for sentence retrieval for question answering, and show that it outperforms both the translation approaches and the baseline language-modeling approach.
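The idea of smoothing a sentence model from its local context can be illustrated with a small sketch. The abstract does not state the formulation used in the thesis, so what follows is only an assumed three-way linear interpolation of the sentence, its enclosing document, and the whole collection under a query-likelihood score. The function name `score_sentence`, the weights, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def score_sentence(query, sentence, document, collection,
                   lam_s=0.5, lam_d=0.3, lam_c=0.2):
    """Log query likelihood of a sentence (illustrative sketch only).

    Each query term's probability is a linear mix of its relative
    frequency in the sentence, in the enclosing document (the local
    context), and in the collection. The weights are assumed values.
    """
    s, d, c = Counter(sentence), Counter(document), Counter(collection)
    ns, nd, nc = max(len(sentence), 1), max(len(document), 1), max(len(collection), 1)
    total = 0.0
    for term in query:
        p = lam_s * s[term] / ns + lam_d * d[term] / nd + lam_c * c[term] / nc
        if p == 0.0:
            # Term never occurs anywhere in the collection.
            return float("-inf")
        total += math.log(p)
    return total


# Toy data: two documents of two sentences each (hypothetical).
doc_a = [["france", "is", "a", "country"], ["paris", "is", "the", "capital"]]
doc_b = [["texas", "is", "a", "state"], ["austin", "is", "the", "capital"]]
flat_a = [t for sent in doc_a for t in sent]
flat_b = [t for sent in doc_b for t in sent]
collection = flat_a + flat_b

query = ["capital", "france"]
score_a = score_sentence(query, doc_a[1], flat_a, collection)
score_b = score_sentence(query, doc_b[1], flat_b, collection)
```

In this toy setup neither candidate sentence contains "france", but the sentence whose surrounding document mentions it receives the higher score, which is the kind of effect smoothing from local context is meant to capture.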