Web-based topic language modeling for audio indexing

Authors: Ken-ichi Iso
Affiliations: Yahoo! Japan Research, Yahoo Japan Corporation, Tokyo, Japan
Venue: ICME'09: Proceedings of the 2009 IEEE International Conference on Multimedia and Expo
Year: 2009

Abstract

We describe a scalable architecture for audio indexing in which topic-dependent language models (LMs) are trained on web pages categorized in a portal web directory and stored on distributed servers. Input speech is decoded in parallel on servers, each holding its own topic LM. From the decoders' outputs, the optimal hypothesis for each utterance is chosen by a topic-selection criterion that minimizes an energy function with three terms: likelihood scores for the utterances; keyword co-occurrence statistics, which measure long-distance correlation; and web-based hypothesis verification scores, which penalize misrecognized trigrams using web search results. In experiments, the proposed approach outperformed the topic-independent baseline system by 6.0% absolute (20.0% relative) in character accuracy.
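
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the functional form of the three terms, their weights, or the optimizer, so everything below is an assumption made for illustration. The names `Hypothesis`, `cooccurrence_cost`, `energy`, and `select_hypotheses` are hypothetical, the weights default to 1.0, the web verification score is represented as a precomputed non-negative penalty, and the minimization uses a simple greedy coordinate descent rather than whatever procedure the paper uses. The example scores and keywords are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One decoder output for one utterance (hypothetical structure)."""
    topic: str              # topic LM that produced this hypothesis
    log_likelihood: float   # decoder score, higher is better
    keywords: frozenset     # content words used for co-occurrence
    web_penalty: float      # assumed precomputed penalty for unverified trigrams


def cooccurrence_cost(a, b, cooc):
    """Negative co-occurrence strength between two hypotheses' keywords."""
    return -sum(cooc.get(frozenset((x, y)), 0.0)
                for x in a.keywords for y in b.keywords if x != y)


def energy(selection, cooc, w_lik=1.0, w_cooc=1.0, w_web=1.0):
    """Assumed energy: likelihood term + pairwise co-occurrence + web penalty."""
    unary = sum(-w_lik * h.log_likelihood + w_web * h.web_penalty
                for h in selection)
    pairwise = sum(cooccurrence_cost(a, b, cooc)
                   for i, a in enumerate(selection)
                   for b in selection[i + 1:])
    return unary + w_cooc * pairwise


def select_hypotheses(candidates, cooc, max_sweeps=10, **weights):
    """Greedy coordinate descent: change one utterance's choice at a time."""
    # Start from the highest-likelihood hypothesis for each utterance.
    selection = [max(c, key=lambda h: h.log_likelihood) for c in candidates]
    current = energy(selection, cooc, **weights)
    for _ in range(max_sweeps):
        changed = False
        for i, cands in enumerate(candidates):
            for h in cands:
                trial = selection[:i] + [h] + selection[i + 1:]
                e = energy(trial, cooc, **weights)
                if e < current:  # accept only strict improvements
                    selection, current, changed = trial, e, True
        if not changed:
            break
    return selection


# Invented toy data: two utterances, each decoded with two topic LMs.
candidates = [
    [Hypothesis("sports", -10.0, frozenset({"pitcher"}), 0.0),
     Hypothesis("finance", -12.0, frozenset({"picture"}), 0.5)],
    [Hypothesis("sports", -11.0, frozenset({"inning"}), 0.0),
     Hypothesis("finance", -10.5, frozenset({"earning"}), 2.0)],
]
cooc = {frozenset({"pitcher", "inning"}): 1.5}

best = select_hypotheses(candidates, cooc)
print([h.topic for h in best])  # prints ['sports', 'sports']
```

In this toy case the second utterance's finance hypothesis has the better likelihood on its own, but the co-occurrence and web penalty terms pull the selection toward the sports hypothesis, which is the kind of trade-off the three-term criterion is meant to capture. Greedy descent can stop in a local minimum; it is used here only to keep the sketch short.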