Metadata for mixed-media access

Authors:
Francine Chen;Marti Hearst;Julian Kupiec;Jan Pedersen;Lynn Wilcox
Affiliations:
Xerox Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA (all authors)
Venue:
ACM SIGMOD Record
Year:
1994

Abstract

In this paper, we discuss mixed-media access, an information access paradigm for multimedia data in which the media type of a query may differ from that of the data. The types of media considered in this paper are speech, images of text, and full-length text. Some examples of metadata for mixed-media access are locations of keywords in speech and images, identification of speakers, locations of emphasized regions in speech, and locations of topic boundaries in text. Algorithms for automatically generating this metadata are described, including word spotting, speaker segmentation, emphatic speech detection, and subtopic boundary location. We illustrate queries composed of diverse media types in an example of access to recorded meetings, via speaker and keyword location.
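
As a rough illustration of the recorded-meeting example, the sketch below treats speaker identification and keyword location as lists of time spans and answers a typed query ("where does this speaker say this keyword?") by intersecting them. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Span` record, the `find_keyword_by_speaker` function, and all sample values are hypothetical, and the metadata is assumed to have been produced already by speaker segmentation and word spotting.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Span:
    """A time interval in a recording, in seconds (hypothetical record type)."""
    start: float
    end: float
    label: str  # a speaker name or a keyword, depending on the metadata layer


def find_keyword_by_speaker(
    speaker_turns: List[Span],
    keyword_hits: List[Span],
    speaker: str,
    keyword: str,
) -> List[Span]:
    """Return keyword hits that lie entirely inside a turn by the given speaker."""
    turns = [t for t in speaker_turns if t.label == speaker]
    return [
        h
        for h in keyword_hits
        if h.label == keyword
        and any(t.start <= h.start and h.end <= t.end for t in turns)
    ]


# Made-up metadata for one meeting recording (illustrative values only).
speaker_turns = [
    Span(0.0, 12.5, "alice"),
    Span(12.5, 30.0, "bob"),
    Span(30.0, 41.0, "alice"),
]
keyword_hits = [
    Span(3.2, 3.9, "budget"),
    Span(15.0, 15.6, "budget"),
    Span(33.1, 33.8, "budget"),
    Span(35.0, 35.5, "schedule"),
]

# A text query (speaker name and keyword) answered against audio metadata.
matches = find_keyword_by_speaker(speaker_turns, keyword_hits, "alice", "budget")
for m in matches:
    print(f"{m.label}: {m.start:.1f}s to {m.end:.1f}s")
```

Keeping each metadata type as its own list of spans means further layers, such as emphasized regions, could be intersected in the same way without changing the query function's structure.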