This paper addresses a content management problem that arises when a collection of spoken documents in audio stream format is in one language and a collection of related text documents is in another. In our case, we have a large digital archive of audio broadcast news in Taiwanese, but no transcriptions are available. We also have a collection of related text-based news stories, but they are written in Chinese characters. Because Taiwanese lacks a standard written form, manual transcription of the spoken documents is prohibitively expensive, and automatic transcription by speech recognition is infeasible because of its poor performance on spontaneous Taiwanese speech. We present an approximate solution that aligns Taiwanese spoken documents with related text documents in Mandarin. The idea is to take advantage of the abundance of Mandarin text documents available in our application to compensate for the limitations of speech recognition systems. Experimental results show that even though our speech recognizer for spontaneous Taiwanese performs poorly, our approach still achieves a high alignment accuracy (82.5%), which is sufficient for content management.