Ranking Answers by Hierarchical Topic Models

Authors:
Zengchang Qin;Marcus Thint;Zhiheng Huang
Affiliations:
BISC Group, EECS Department, University of California Berkeley, USA;Computational Intelligence Group, Intelligent Systems Lab, BT Group, UK;BISC Group, EECS Department, University of California Berkeley, USA
Venue:
IEA/AIE '09 Proceedings of the 22nd International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems: Next-Generation Applied Intelligence
Year:
2009

Abstract

Topic models are hierarchical probabilistic models for the statistical analysis of document collections. They assume that each document comprises a mixture of latent topics and that each topic can be represented by a distribution over the vocabulary. The dimensionality of a large corpus of unstructured documents can be reduced by modeling it with these exchangeable topics. In previous work, we designed a multi-pipe structure for question answering (QA) systems by nesting keyword search, classical Natural Language Processing (NLP) techniques and prototype detection. In this research, we use those technologies to select a set of sentences as candidate answers. We then use topic models to rank these candidate answers by calculating the semantic distances between the sentences and the given query. In our experiments, we found that using topic models improves the answer ranking, so that better answers can be returned for the given query.
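
The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which topic model inference or which distance measure was used. Everything here is an assumption made for illustration: the hand-written topic-word table `TOPICS` (in practice these distributions would be learned from a corpus), the helper names `topic_mixture`, `js_divergence` and `rank_candidates`, the simple averaging of per-word topic posteriors under a uniform topic prior, and the choice of Jensen-Shannon divergence as the semantic distance.

```python
import math

# Hypothetical topic-word distributions p(word | topic). In practice these
# would be learned from a corpus (for example with LDA), not written by hand.
TOPICS = [
    {"bank": 0.4, "loan": 0.3, "money": 0.2, "river": 0.05, "water": 0.05},
    {"bank": 0.1, "loan": 0.02, "money": 0.03, "river": 0.45, "water": 0.4},
]


def topic_mixture(text, topics=TOPICS, smoothing=1e-6):
    """Estimate a topic mixture for a text by averaging p(topic | word).

    Assumes a uniform prior over topics; words unknown to every topic
    are ignored.
    """
    words = [w for w in text.lower().split() if any(w in t for t in topics)]
    k = len(topics)
    if not words:
        return [1.0 / k] * k  # no known words: fall back to a uniform mixture
    mix = [0.0] * k
    for w in words:
        likelihoods = [t.get(w, 0.0) + smoothing for t in topics]
        total = sum(likelihoods)
        for i, value in enumerate(likelihoods):
            mix[i] += value / total
    return [m / len(words) for m in mix]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 means identical distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def rank_candidates(query, candidates):
    """Return (distance, sentence) pairs, closest candidate first."""
    q = topic_mixture(query)
    scored = [(js_divergence(q, topic_mixture(c)), c) for c in candidates]
    return sorted(scored)


if __name__ == "__main__":
    ranked = rank_candidates(
        "loan money",
        ["the river water is cold", "the bank approved the loan"],
    )
    for distance, sentence in ranked:
        print(f"{distance:.3f}  {sentence}")
```

With this toy table, the query "loan money" is dominated by the first (finance) topic, so the sentence about the bank loan is ranked ahead of the sentence about the river. Comparing texts in topic space rather than by exact keyword overlap is what lets a candidate score well even when it shares few words with the query.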