Towards spoken-document retrieval for the internet: lattice indexing for large-scale web-search architectures

Authors:
Zheng-Yu Zhou; Peng Yu; Ciprian Chelba; Frank Seide
Affiliations:
Chinese University of Hong Kong, Shatin, Hong Kong; Microsoft Research Asia, Beijing; Microsoft Research, Redmond, WA; Microsoft Research Asia, Beijing
Venue:
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Year:
2006

Abstract

Large-scale web-search engines are generally designed for linear text. The linear text representation is suboptimal for audio search, where accuracy can be significantly improved if the search includes alternate recognition candidates, commonly represented as word lattices. This paper proposes a method for indexing word lattices that is suitable for large-scale web-search engines, requiring only limited code changes. The proposed method, called Time-based Merging for Indexing (TMI), first converts the word lattice to a posterior-probability representation and then merges word hypotheses with similar time boundaries to reduce the index size. Four alternative approximations are presented, which differ in index size and in the strictness of the phrase-matching constraints. Results are presented for three types of typical web audio content (podcasts, video clips, and online lectures), for both phrase spotting and relevance ranking. Using TMI indexes that are only five times larger than the corresponding linear-text indexes, phrase spotting improved over searching top-1 transcripts by 25-35%, and relevance ranking by 14%, with only a small loss compared to unindexed lattice search.
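
The merging step described in the abstract can be pictured with a minimal Python sketch. This is an illustration only, not the paper's TMI algorithm: the abstract does not specify how "similar time boundaries" is decided or how posteriors are combined. The sketch assumes that hypotheses are already annotated with posterior probabilities, that two hypotheses of the same word are "similar" when their start and end times fall into the same fixed-width bins, and that their posteriors are summed. The names Hyp, merge_by_time, and bin_ms are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Hyp(NamedTuple):
    """One word hypothesis from a lattice (hypothetical structure)."""
    word: str
    start_ms: int      # start time in milliseconds
    end_ms: int        # end time in milliseconds
    posterior: float   # posterior probability of this hypothesis


def merge_by_time(hyps, bin_ms=50):
    """Merge same-word hypotheses whose boundaries share a time bin.

    Assumption: "similar time boundaries" means the start times and the
    end times each fall into the same bin of width bin_ms. Posteriors of
    merged hypotheses are summed and capped at 1.0.
    """
    merged = defaultdict(float)
    for h in hyps:
        key = (h.word, h.start_ms // bin_ms, h.end_ms // bin_ms)
        merged[key] += h.posterior
    # One index entry per (word, start bin, end bin), in a stable order.
    return [
        Hyp(word, s * bin_ms, e * bin_ms, min(p, 1.0))
        for (word, s, e), p in sorted(merged.items())
    ]


hyps = [
    Hyp("speech", 500, 900, 0.4),
    Hyp("speech", 510, 910, 0.3),   # near-duplicate of the first entry
    Hyp("beach", 500, 900, 0.2),    # competing word, same time span
]
merged = merge_by_time(hyps)
for entry in merged:
    print(entry)
```

In this toy input the two "speech" hypotheses collapse into a single entry while the competing "beach" candidate is kept, which is the kind of index-size reduction the abstract refers to. A wider bin_ms merges more aggressively and shrinks the index further, at the cost of looser timing information for phrase matching.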