ACM Transactions on Speech and Language Processing (TSLP)
Large-scale web-search engines are generally designed for linear text. The linear text representation is suboptimal for audio search, where accuracy can be significantly improved if the search includes alternate recognition candidates, commonly represented as word lattices.

This paper proposes a method for indexing word lattices that is suitable for large-scale web-search engines, requiring only limited code changes.

The proposed method, called Time-based Merging for Indexing (TMI), first converts the word lattice to a posterior-probability representation and then merges word hypotheses with similar time boundaries to reduce the index size. Four alternative approximations are presented, which differ in index size and the strictness of the phrase-matching constraints.

Results are presented for three types of typical web audio content (podcasts, video clips, and online lectures) for phrase spotting and relevance ranking. Using TMI indexes that are only five times larger than corresponding linear-text indexes, phrase spotting was improved over searching top-1 transcripts by 25-35%, and relevance ranking by 14%, at only a small loss compared to unindexed lattice search.
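To make the merging idea concrete, here is a minimal Python sketch of grouping word hypotheses with similar time boundaries into shared index positions. It is an illustration under stated assumptions, not the TMI algorithm itself: the abstract does not specify the merging criterion, so the greedy start-time clustering, the `tol` tolerance, the summing of posteriors for duplicate words, and all names (`Hyp`, `merge_by_time`, `build_index`) are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Hyp(NamedTuple):
    """One word hypothesis taken from a posterior-weighted lattice."""
    word: str
    start: float      # start time in seconds
    end: float        # end time in seconds
    posterior: float  # posterior probability of this hypothesis


def merge_by_time(hyps, tol=0.05):
    """Assign hypotheses to integer slots by greedy start-time clustering.

    Assumption: hypotheses whose start times lie within `tol` seconds of the
    first hypothesis in the current cluster share a slot. Duplicate words in
    a slot are merged by summing their posteriors, capped at 1.0.
    """
    merged = {}   # (word, slot) -> merged posterior
    slot = -1
    anchor = None
    for h in sorted(hyps, key=lambda h: h.start):
        if anchor is None or h.start - anchor > tol:
            slot += 1          # open a new slot
            anchor = h.start
        key = (h.word, slot)
        merged[key] = min(1.0, merged.get(key, 0.0) + h.posterior)
    return merged


def build_index(doc_id, hyps, tol=0.05):
    """Build a word -> [(doc_id, slot, posterior)] inverted index."""
    index = defaultdict(list)
    for (word, slot), p in merge_by_time(hyps, tol).items():
        index[word].append((doc_id, slot, p))
    return index


# Hypothetical lattice fragment: two competing words at nearly the same time.
example = [
    Hyp("web", 0.00, 0.30, 0.5),
    Hyp("wet", 0.01, 0.31, 0.3),
    Hyp("web", 0.02, 0.30, 0.2),
    Hyp("search", 0.31, 0.80, 0.9),
]
index = build_index("doc1", example)
```

In this sketch, consecutive slot numbers play the role of word positions in a linear-text index, so a phrase query such as "web search" can be matched by looking for adjacent slots, while competing hypotheses ("web" and "wet") share one position. The two "web" entries collapse into a single posting, which is the kind of size reduction the merging step aims for.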