Learning to select the correct answer in multi-stream question answering
Information Processing and Management: an International Journal
Hi-index | 0.02 |
In this paper we present the experiments performed at Tokyo Institute of Technology for the CLEF2006 Multiple Language Question Answering (QA@CLEF) track. Our approach to QA centres on a non-linguistic, data-driven, statistical classification model that uses the redundancy of the web to find correct answers. For the cross-language aspect we employed publicly available web-based text translation tools to translate the question from the source into the corresponding target language, then used the corresponding mono-lingual QA system to find the answers. The hypothesised correct answers were then projected back on to the appropriate closed-domain corpus. Correct and supported answer performance on the mono-lingual tasks was around 14% for both Spanish and French. Performance on the cross-language tasks ranged from 5% for Spanish-English, to 12% for French-Spanish. Our method of projecting answers onto documents was shown not to work well: in the worst case on the French-English task we lost 84% of our otherwise correct answers. Ignoring the need for correct support information the exact answer accuracy increased to 29% and 21% correct on the Spanish and French mono-lingual tasks, respectively.