Automated speech and audio analysis for semantic access to multimedia

Authors:
Franciska de Jong;Roeland Ordelman;Marijn Huijbregts
Affiliations:
Dept. of Computer Science, University of Twente, Enschede, The Netherlands; Dept. of Computer Science, University of Twente, Enschede, The Netherlands; Dept. of Computer Science, University of Twente, Enschede, The Netherlands
Venue:
SAMT'06 Proceedings of the First international conference on Semantic and Digital Media Technologies
Year:
2006

Abstract

The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content and, as a consequence, improve the effectiveness of conceptual access tools. This paper gives an overview of the ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques are presented, including the alignment of speech and text resources, large-vocabulary speech recognition, keyword spotting, and speaker classification. The applicability of these techniques is discussed from a media-crossing perspective. Their added value and potential contribution to the content value chain are illustrated by the description of two complementary demonstrators for browsing broadcast news archives.