Automatic categorization of questions for user-interactive question answering

  • Authors:
  • Wanpeng Song;Liu Wenyin;Naijie Gu;Xiaojun Quan;Tianyong Hao

  • Affiliations:
  • Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China and Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloo ...;Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong and Joint Research Lab of Excellence, CityU-USTC Advanced Research Institute, Suzhou, China;Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China and Joint Research Lab of Excellence, CityU-USTC Advanced Research Institute, Suzhou, Chi ...;Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong;Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2011

Abstract

Question categorization, which suggests one of a set of predefined categories for a user's question according to the question's topic or content, is a useful technique in user-interactive question answering systems. In this paper, we propose an automatic method for question categorization in such a system. The method has four steps: feature space construction, topic-wise word identification and weighting, semantic mapping, and similarity calculation. We first construct the feature space from all accumulated questions and calculate a feature vector for each predefined category, which contains some of the accumulated questions. When a new question is posted, its semantic pattern is used to identify and weight the important words of the question. The question is then semantically mapped into the constructed feature space to enrich its representation. Finally, the similarity between the question and each category is calculated from their feature vectors, and the category with the highest similarity is assigned to the question. The experimental results show that the proposed method achieves good categorization precision and outperforms traditional categorization methods on the selected test questions.
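The four steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weighting scheme, the semantic mapping resource, or the similarity measure, so the sketch assumes raw term counts for the feature space, a hypothetical `word_weights` table standing in for pattern-based word weighting, a hypothetical `synonyms` table standing in for semantic mapping, and cosine similarity. All function names and sample questions are illustrative.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase, strip surrounding punctuation.
    return [w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")]


def build_category_vectors(labeled_questions):
    # Step 1: feature space construction. Each category vector is the sum of
    # the term counts of the accumulated questions filed under that category.
    vectors = {}
    for category, question in labeled_questions:
        vectors.setdefault(category, Counter()).update(tokenize(question))
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(question, category_vectors, word_weights=None, synonyms=None):
    word_weights = word_weights or {}   # hypothetical topic-word weights
    synonyms = synonyms or {}           # hypothetical semantic mapping table
    vocabulary = set()
    for vector in category_vectors.values():
        vocabulary.update(vector)
    query = Counter()
    for token in tokenize(question):
        # Step 2: weight important words (default weight 1.0).
        weight = word_weights.get(token, 1.0)
        # Step 3: map out-of-vocabulary words into the feature space.
        if token not in vocabulary:
            token = synonyms.get(token, token)
        query[token] += weight
    # Step 4: similarity to every category; the highest score wins.
    scores = {c: cosine(query, v) for c, v in category_vectors.items()}
    return max(scores, key=scores.get), scores


# Made-up accumulated questions, for illustration only.
labeled = [
    ("computers", "How do I install Python on my laptop?"),
    ("computers", "Why does my laptop overheat?"),
    ("cooking", "How do I bake bread without yeast?"),
    ("cooking", "What oven temperature is best for pizza?"),
]
category_vectors = build_category_vectors(labeled)
best, scores = categorize(
    "How can I fix my notebook?",
    category_vectors,
    synonyms={"notebook": "laptop"},
)
print(best, scores)
```

In this toy run the word "notebook" never occurs in the accumulated questions, so without the mapping step it would contribute nothing to any similarity score; mapping it to "laptop" is what lets it count toward the "computers" category.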