Improved Semantic Retrieval of Spoken Content by Document/Query Expansion with Random Walk Over Acoustic Similarity Graphs

Authors:
Hung-Yi Lee; Lin-Shan Lee
Affiliations:
Dept. of Electr. Eng., Nat. Taiwan Univ., Taipei, Taiwan; Dept. of Electr. Eng., Nat. Taiwan Univ., Taipei, Taiwan
Venue:
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Year:
2014

Abstract

For text, document/query expansion has proven very useful for retrieving objects semantically related to the query. However, when text-based techniques are applied to spoken content, the inevitable recognition errors seriously degrade performance, even when retrieval is performed over lattices. We propose estimating more accurate term distributions (unigram language models) for spoken documents using acoustic similarity graphs. In this approach, a graph is constructed for each term, describing the acoustic similarity among all signal regions hypothesized to be that term. Score propagation based on a random walk over the graph yields more reliable scores for the term hypotheses, which in turn yield more accurate term distributions. This approach was applied within the language modeling retrieval framework, including document expansion based on latent topic analysis and query expansion with a query-regularized mixture model. We extend these approaches from words to subword n-grams, and extend query expansion from document level to utterance level and from term-based to topic-based. Experiments on Mandarin broadcast news showed improved performance under almost all tested conditions.
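
The score propagation idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact update rule, so everything specific here is an assumption made for illustration: the function name `propagate_scores`, the interpolation weight `alpha`, the column normalization of the similarity matrix, the self-loop for isolated nodes, and the toy similarity values. It shows one common form of random walk, in which each hypothesis keeps part of its initial recognition score and receives the rest from acoustically similar hypotheses of the same term.

```python
def propagate_scores(similarity, initial_scores, alpha=0.9, tol=1e-10, max_iter=1000):
    """Random-walk score propagation over one term's similarity graph.

    similarity[i][j]  : assumed non-negative acoustic similarity between
                        hypothesized regions i and j (hypothetical input).
    initial_scores[i] : assumed initial score of hypothesis i, e.g. a
                        lattice posterior (hypothetical input).
    alpha             : assumed weight given to scores arriving from neighbors.
    """
    n = len(initial_scores)
    total = sum(initial_scores)
    init = [s / total for s in initial_scores]  # normalize to sum to 1

    # Column-stochastic transition matrix: transition[i][j] is the
    # probability of stepping from node j to node i. Self-similarity is
    # ignored; a node with no neighbors keeps its own mass (self-loop).
    transition = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col_sum = sum(similarity[i][j] for i in range(n) if i != j)
        for i in range(n):
            if col_sum == 0:
                transition[i][j] = 1.0 if i == j else 0.0
            elif i != j:
                transition[i][j] = similarity[i][j] / col_sum

    # Iterate the interpolated update until the scores stop changing.
    scores = init[:]
    for _ in range(max_iter):
        updated = [
            (1 - alpha) * init[i]
            + alpha * sum(transition[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        if sum(abs(u - s) for u, s in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


# Toy example: hypotheses 0 and 1 sound alike; hypothesis 2 resembles neither.
toy_similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
toy_initial = [0.4, 0.2, 0.4]
print(propagate_scores(toy_similarity, toy_initial))
```

In this toy case the two mutually similar hypotheses reinforce each other, while the acoustically isolated hypothesis loses score despite a high initial value. The resulting scores could then be accumulated into per-document term counts, though how the paper does this is not stated in the abstract.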