Topic modeling for information retrieval (IR) has attracted significant attention and demonstrated good performance in a wide variety of tasks over the years. In this article, we first present a comprehensive comparison of various topic modeling approaches, including the so-called document topic models (DTM) and word topic models (WTM), for Chinese spoken document retrieval (SDR). Moreover, to lessen SDR performance degradation when imperfect recognition transcripts are used, we also leverage different levels of indexing features for topic modeling, including words, syllable-level units, and their combinations. All experiments are performed on the TDT Chinese collection.