Utilizing Semantic, Syntactic, and Question Category Information for Automated Digital Reference Services

Authors:
Palakorn Achananuparp;Xiaohua Hu;Xiaohua Zhou;Xiaodan Zhang
Affiliations:
College of Information Science and Technology, Drexel University, Philadelphia, PA 19104;College of Information Science and Technology, Drexel University, Philadelphia, PA 19104;College of Information Science and Technology, Drexel University, Philadelphia, PA 19104;College of Information Science and Technology, Drexel University, Philadelphia, PA 19104
Venue:
ICADL 08 Proceedings of the 11th International Conference on Asian Digital Libraries: Universal and Ubiquitous Access to Information
Year:
2008

Abstract

Digital reference services normally rely on human experts to provide quality answers to user requests via online communication tools. As these services gain popularity, more experts are needed to keep up with growing demand. Alternatively, an automated question-answering module can help shorten the question-answering cycle. When the system receives a new user-submitted question, it compares the user's request against the existing questions in the archive. If an appropriate match is found, the system uses the associated answer to respond to the request. Since a question is relatively short and two questions might have very few words in common, the challenge is how to identify similar questions effectively. In this paper, we focus on the problem of identifying questions that convey the same information need. That is, our goal is to find paraphrases of the original questions. To achieve this, we propose a hybrid approach that combines semantic, syntactic, and question category information to judge question similarity. Semantic and syntactic information is measured by taking into account word similarity, word order, and part-of-speech information. Information about the question type is derived from a Support Vector Machine classifier. The experimental results demonstrate that our combined measures are highly effective in distinguishing original questions and their paraphrases, thus improving the effectiveness of the question matching task.
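
The idea of combining several signals into one question-similarity score can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: exact word overlap stands in for the word-similarity measure, a position-based comparison of shared words stands in for the word-order measure, a leading-word rule (`guess_category`, hypothetical) stands in for the Support Vector Machine classifier, and the weights `w_sem`, `w_order`, and `w_cat` are arbitrary. Part-of-speech information is omitted.

```python
import math
import re


def tokenize(text):
    """Lowercase a question and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_overlap_similarity(q1, q2):
    """Jaccard overlap of the two word sets.

    Assumption: a crude stand-in for a real word-similarity measure.
    """
    a, b = set(tokenize(q1)), set(tokenize(q2))
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def word_order_similarity(q1, q2):
    """Compare the positions of shared words in both questions.

    Returns 1.0 when every shared word sits at the same position.
    """
    t1, t2 = tokenize(q1), tokenize(q2)
    shared = sorted(set(t1) & set(t2))
    if not shared:
        return 0.0
    # Position of the first occurrence of each shared word.
    r1 = [t1.index(w) for w in shared]
    r2 = [t2.index(w) for w in shared]
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(r1, r2)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(r1, r2)))
    return 1.0 if total == 0 else 1.0 - diff / total


def guess_category(question):
    """Toy category guess from the leading word.

    Assumption: a hypothetical placeholder for a trained question classifier.
    """
    tokens = tokenize(question)
    first = tokens[0] if tokens else ""
    rules = {"who": "HUMAN", "where": "LOCATION", "when": "TIME", "how": "MANNER"}
    return rules.get(first, "OTHER")


def question_similarity(q1, q2, w_sem=0.6, w_order=0.2, w_cat=0.2):
    """Weighted sum of the three signals; the weights are illustrative only."""
    same_category = 1.0 if guess_category(q1) == guess_category(q2) else 0.0
    return (
        w_sem * word_overlap_similarity(q1, q2)
        + w_order * word_order_similarity(q1, q2)
        + w_cat * same_category
    )


if __name__ == "__main__":
    archived = "How do I renew a library book?"
    for candidate in ("How can I renew my library book?", "Where is the library located?"):
        print(candidate, round(question_similarity(archived, candidate), 3))
```

In this sketch a paraphrase that shares most words, keeps them in the same order, and falls into the same category scores higher than an unrelated question about the same topic. A realistic system would replace each stand-in with a stronger component and tune the weights on labeled data.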