Annotation of heterogeneous multimedia content using automatic speech recognition

Authors:
Marijn Huijbregts;Roeland Ordelman;Franciska de Jong
Affiliations:
University of Twente, Dept. of Electrical Engineering, Mathematics and Computer Science, Enschede, The Netherlands;University of Twente, Dept. of Electrical Engineering, Mathematics and Computer Science, Enschede, The Netherlands;University of Twente, Dept. of Electrical Engineering, Mathematics and Computer Science, Enschede, The Netherlands
Venue:
SAMT'07 Proceedings of the 2nd International Conference on Semantic and Digital Media Technologies: Semantic Multimedia
Year:
2007

Abstract

This paper reports on the setup and evaluation of robust speech recognition system components geared towards transcript generation for heterogeneous, real-life media collections. The system is used to generate speech transcripts for the NIST/TRECVID-2007 test collection, which is part of a Dutch real-life archive of news-related genres. Performance figures for this type of content are compared with figures for broadcast news test data.