Recently, machine learning techniques such as support vector machines have been employed for question classification. However, these techniques depend heavily on the availability of large amounts of training data, and may struggle with the variety of new questions posed by real users on the Web. To mitigate the lack of sufficient training data, in this paper we present a simple learning method that exploits Web search results to collect more training data automatically from a few seed terms (question answers). In addition, we propose a novel semantically related feature model (SRFM), which uses question focuses and their semantically related features, learned from the larger amount of collected training data, to help determine the question type. Our experimental results show that the proposed learning method achieves better classification performance than the bigram language modeling (LM) approach on questions with untrained question focuses.