This paper presents our experiments with a low-frequency approach to information retrieval (IR) for question answering (QA) over a small, closed-domain corpus and a variety of question types. With a corpus of 255 questions categorized as simple, average, or challenging, we compared the performance of our question answering system (QASCU) when used with two different IR systems, Lucene and BioKI. Lucene applies a standard tf.idf weighting scheme to documents, while BioKI applies a weighted keyword occurrence optimization scheme to paragraphs, which does not bias against low-frequency terms. Although Lucene yields better IR results at the document level than BioKI, running QASCU on BioKI output achieves higher precision. This indicates that, for closed-domain QA with an IR component, the basic F-measure performance of the IR component at the document level is not necessarily indicative of the overall performance of the QA system. We contend that these findings are also relevant to retrieval from video, text, and sound collections, which usually feature low redundancy in the text snippets used for retrieval.
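The contrast between the two weighting styles can be illustrated with a minimal sketch. It assumes a basic tf.idf sum on one side and a presence-only weighted keyword score on the other. The abstract does not specify BioKI's actual scheme or Lucene's exact scoring formula, so the function names, the keyword weights, and the sample text units below are hypothetical and serve only to show how repetition-driven and presence-driven scores can rank the same units differently.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def tfidf_scores(query, units):
    """Score each text unit by summing tf * idf over the query terms.

    This is a textbook tf.idf variant, not Lucene's actual formula.
    """
    tokenized = [tokenize(u) for u in units]
    n = len(tokenized)
    # Document frequency: number of units containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if df[term]:
                score += tf[term] * math.log(n / df[term])
        scores.append(score)
    return scores


def keyword_occurrence_scores(query_weights, units):
    """Score each unit by the summed weights of the keywords it contains.

    Only presence counts, so a keyword seen once contributes as much as
    one repeated many times. Hypothetical stand-in, not BioKI's scheme.
    """
    scores = []
    for u in units:
        present = set(tokenize(u))
        scores.append(sum(w for k, w in query_weights.items() if k in present))
    return scores


# Hypothetical text units standing in for documents or paragraphs.
units = [
    "router error code",
    "router router router status",
    "error log",
]
tfidf = tfidf_scores("router error", units)
presence = keyword_occurrence_scores({"router": 1.0, "error": 1.0}, units)
```

In this toy setting the tf.idf sum ranks the second unit highest because it repeats one query term, whereas the presence-only score prefers the first unit, which contains both keywords once each. This is one plausible reading of why a scheme that does not reward repetition might suit low-redundancy text.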