Information Sciences: an International Journal
Community Question Answering (CQA) services, such as Yahoo! Answers and WikiAnswers, have become a central paradigm for satisfying users' information needs. The task of question retrieval in CQA aims to resolve a query directly by finding the most relevant questions (together with their answers) in an archive of past questions. However, since users can ask any question they like, a large proportion of questions in CQA concern not objective (factual) knowledge but subjective (sentiment-based) opinions or social interactions. This inhomogeneous nature of CQA degrades the performance of standard retrieval models. To address this problem, we present a hybrid approach that blends several language modelling techniques for question retrieval: the classic (query-likelihood) language model, the state-of-the-art translation-based language model, and our proposed intent-based language model. The user intent of each candidate question (objective, subjective, or social) is predicted by a probabilistic classifier that uses both textual features and metadata features. Our experiments on two real-world datasets show that our approach significantly outperforms existing ones.
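To convey the idea of blending a lexical language model with intent information, here is a minimal sketch in Python. The function names, interpolation weight, and blending scheme are illustrative assumptions, not the paper's actual formulation (which also interpolates a translation-based language model); the query-likelihood model below uses standard Jelinek-Mercer smoothing, and the intent probabilities stand in for the output of the probabilistic intent classifier.

```python
from collections import Counter

def query_likelihood(query, doc, collection, lam=0.8):
    """Classic query-likelihood LM with Jelinek-Mercer smoothing:
    P(q|d) = prod_t [lam * P(t|d) + (1 - lam) * P(t|C)]."""
    d_counts, c_counts = Counter(doc), Counter(collection)
    score = 1.0
    for t in query:
        p_doc = d_counts[t] / max(len(doc), 1)
        p_coll = c_counts[t] / len(collection)
        score *= lam * p_doc + (1 - lam) * p_coll
    return score

def intent_agreement(query_intents, cand_intents):
    """Probability that the query and candidate question share the same
    intent class (objective / subjective / social), given classifier
    probability distributions over the classes."""
    return sum(p * cand_intents.get(label, 0.0)
               for label, p in query_intents.items())

def hybrid_score(query, cand, collection, query_intents, cand_intents,
                 alpha=0.7):
    """Illustrative linear blend: lexical relevance reweighted by how well
    the candidate's predicted intent matches the query's."""
    return (alpha * query_likelihood(query, cand, collection)
            + (1 - alpha) * intent_agreement(query_intents, cand_intents))
```

With this blend, two candidates with identical text but different predicted intents receive different scores, which is the effect the intent-based component is meant to achieve.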