A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents

Authors:
Berlin Chen;Hsin-Min Wang;Lin-Shan Lee
Affiliations:
National Taiwan Normal University, Taipei, Taiwan;Academia Sinica, Taipei, Taiwan;National Taiwan University, Taipei, Taiwan
Venue:
ACM Transactions on Asian Language Information Processing (TALIP)
Year:
2004

Abstract

In recent years, statistical modeling approaches have steadily gained popularity in the field of information retrieval. This article presents an HMM/N-gram-based retrieval approach for Mandarin spoken documents. The underlying characteristics and the various structures of this approach were extensively investigated and analyzed. The retrieval capabilities were verified by tests with word- and syllable-level indexing features and by comparisons with the conventional vector-space model approach. To further improve the discrimination capabilities of the HMMs, both the expectation-maximization (EM) and minimum classification error (MCE) algorithms were applied in training. Fusion of information via indexing word- and syllable-level features was also investigated. The spoken document retrieval experiments were performed on the Topic Detection and Tracking Corpora (TDT-2 and TDT-3). Very encouraging retrieval performance was obtained.
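
As a rough illustration of the kind of model the abstract describes, the sketch below ranks documents by query likelihood under a two-component unigram mixture, which is one common formulation of HMM-based retrieval. It is an assumption made for illustration and not the authors' exact model: the function names, the toy documents, and the fixed mixture weight `m_doc` are hypothetical, and bigram terms, syllable-level features, and EM or MCE weight training are all left out.

```python
import math
from collections import Counter


def hmm_unigram_score(query, doc, corpus_counts, corpus_len, m_doc=0.5):
    """Log-likelihood of the query under a document/corpus unigram mixture.

    Each query term is treated as emitted either by the document model
    (weight m_doc) or by a background corpus model (weight 1 - m_doc).
    m_doc is a hypothetical fixed weight; in practice it would be estimated.
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_corpus = corpus_counts[term] / corpus_len
        p = m_doc * p_doc + (1.0 - m_doc) * p_corpus
        if p == 0.0:
            # Term unseen in the whole collection: the query has zero likelihood.
            return float("-inf")
        score += math.log(p)
    return score


def rank(query, docs, m_doc=0.5):
    """Return document names sorted by decreasing query log-likelihood."""
    corpus_counts = Counter(t for d in docs.values() for t in d)
    corpus_len = sum(corpus_counts.values())
    scored = [
        (hmm_unigram_score(query, d, corpus_counts, corpus_len, m_doc), name)
        for name, d in docs.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy collection (hypothetical); tokens could equally be syllables instead of words.
docs = {
    "a": ["taipei", "news", "weather"],
    "b": ["stock", "market", "news"],
}
ranking = rank(["taipei", "weather"], docs)
```

The same scoring function works for either word-level or syllable-level tokens, so one simple way to combine the two, again only as an assumption, would be to add the two log-likelihoods with a weight chosen on held-out queries.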