Speech recognition in the Informedia Digital Video Library: uses and limitations

Authors:
A. G. Hauptmann
Venue:
TAI '95 Proceedings of the Seventh International Conference on Tools with Artificial Intelligence
Year:
1995

Abstract

In principle, speech recognition technology can make any spoken data useful for library indexing and retrieval. The paper describes the Informedia Digital Video Library project and discusses how speech recognition is used for transcript creation from video, alignment with hand-generated transcripts, the query interface, and audio paragraph segmentation. The results show that speech recognition accuracy varies dramatically depending on the quality and type of data used. Our information retrieval experiments also show that reasonable recall and precision can be obtained with moderate speech recognition accuracy. Finally, we discuss some active areas of speech research relevant to the digital video library problem.
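
The abstract relates two kinds of measurement: how accurate the recognizer's transcript is, and how well retrieval over that transcript performs. The sketch below illustrates the standard definitions of word error rate, precision, and recall. It is not the paper's evaluation code; the function names and the sample sentences and document IDs are hypothetical, chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document IDs."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical hand-generated transcript versus recognizer output.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"word error rate: {wer:.3f}")

    # Hypothetical retrieval result judged against relevant documents.
    p, r = precision_recall(["d1", "d2", "d3"], ["d1", "d3", "d4", "d5"])
    print(f"precision: {p:.3f}, recall: {r:.3f}")
```

Word accuracy is commonly reported as one minus the word error rate. A query can still retrieve the right documents when some words are misrecognized, as long as enough of the query terms survive in the transcript, which is consistent with the abstract's observation about moderate recognition accuracy.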