Question terminology and representation for question type classification

Authors:
Noriko Tomuro
Affiliations:
DePaul University, Chicago, IL
Venue:
COLING-02 COMPUTERM 2002: Second International Workshop on Computational Terminology, Volume 14
Year:
2002

Abstract

Question terminology is a set of terms that appear in keywords, idioms, and fixed expressions commonly observed in questions. This paper investigates ways to automatically extract question terminology from a corpus of questions and to represent it for the purpose of classifying questions by type. Our key interest is whether semantic features can enhance the representation of the strongly lexical nature of question sentences. We compare two feature sets: one with lexical features only, and another with a mixture of lexical and semantic features. For evaluation, we measure the classification accuracy achieved by two machine learning algorithms, C5.0 and PEBLS, using a procedure called domain cross-validation, which effectively measures the domain transferability of features.
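
The abstract does not spell out how domain cross-validation works. One plausible reading, assumed here, is a leave-one-domain-out scheme: train on questions from all domains but one, test on the held-out domain, and repeat for each domain. The Python sketch below illustrates that reading only. The toy questions, the domain names, the `lexical_features` helper, and the `MajorityWordClassifier` (a simple stand-in learner, not C5.0 or PEBLS) are all hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def lexical_features(question):
    """Lowercased word tokens; a hypothetical stand-in for a lexical feature set."""
    return set(question.lower().rstrip("?").split())


class MajorityWordClassifier:
    """Toy stand-in learner (NOT C5.0 or PEBLS): each word votes for the
    question types it co-occurred with in the training data."""

    def fit(self, feature_sets, labels):
        self.word_counts = defaultdict(Counter)
        # Fall back to the most frequent training label when no word is known.
        self.default = Counter(labels).most_common(1)[0][0]
        for feats, label in zip(feature_sets, labels):
            for word in feats:
                self.word_counts[word][label] += 1
        return self

    def predict(self, feats):
        votes = Counter()
        for word in feats:
            votes.update(self.word_counts.get(word, {}))
        if not votes:
            return self.default
        # Sort labels first so that ties are broken deterministically.
        return max(sorted(votes), key=votes.get)


def domain_cross_validation(examples):
    """Assumed leave-one-domain-out procedure.

    examples: list of (domain, question, question_type) triples.
    Returns a dict mapping each held-out domain to the accuracy obtained
    when training on all other domains.
    """
    domains = sorted({domain for domain, _, _ in examples})
    accuracy = {}
    for held_out in domains:
        train = [(q, t) for d, q, t in examples if d != held_out]
        test = [(q, t) for d, q, t in examples if d == held_out]
        clf = MajorityWordClassifier().fit(
            [lexical_features(q) for q, _ in train],
            [t for _, t in train],
        )
        correct = sum(clf.predict(lexical_features(q)) == t for q, t in test)
        accuracy[held_out] = correct / len(test)
    return accuracy


# Made-up questions for illustration; not data from the paper.
examples = [
    ("cooking", "How do I bake bread?", "procedure"),
    ("cooking", "Where can I buy yeast?", "location"),
    ("travel", "How do I renew a passport?", "procedure"),
    ("travel", "Where is the nearest embassy?", "location"),
    ("cars", "How do I change a tire?", "procedure"),
    ("cars", "Where can I find a mechanic?", "location"),
]

accuracy = domain_cross_validation(examples)
for domain, score in accuracy.items():
    print(f"held-out domain {domain}: accuracy {score:.2f}")
```

Under this reading, a feature set that relies on domain-specific vocabulary would score poorly on the held-out domain, while features built from question terminology such as "how do I" or "where" should transfer. Comparing a lexical-only feature set with a mixed lexical and semantic one would then amount to swapping out `lexical_features` and rerunning the same loop.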