In this paper, we discuss mixed-media access, an information access paradigm for multimedia data in which the media type of a query may differ from that of the data. The types of media considered in this paper are speech, images of text, and full-length text. Some examples of metadata for mixed-media access are locations of keywords in speech and images, identification of speakers, locations of emphasized regions in speech, and locations of topic boundaries in text. Algorithms for automatically generating this metadata are described, including word spotting, speaker segmentation, emphatic speech detection, and subtopic boundary location. We illustrate queries composed of diverse media types in an example of access to recorded meetings, via speaker and keyword location.
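
As an illustration of the kind of query the abstract describes (access to a recorded meeting via speaker and keyword location), the following is a minimal sketch, not the authors' implementation. It assumes the metadata has already been produced by some speaker segmentation and word spotting step and is available as time-stamped records. The names `SpeakerSegment`, `KeywordHit`, `find_keyword_by_speaker`, and the `min_score` threshold are all hypothetical, as is the sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerSegment:
    """Hypothetical speaker-segmentation record: who spoke from start to end (seconds)."""
    speaker: str
    start: float
    end: float


@dataclass(frozen=True)
class KeywordHit:
    """Hypothetical word-spotting record: a putative keyword occurrence and its score."""
    keyword: str
    time: float
    score: float


def find_keyword_by_speaker(segments, hits, speaker, keyword, min_score=0.5):
    """Return hits for `keyword` that fall inside segments attributed to `speaker`.

    Hits scoring below `min_score` (an assumed threshold) are discarded.
    """
    spans = [(s.start, s.end) for s in segments if s.speaker == speaker]
    return [
        h for h in hits
        if h.keyword == keyword
        and h.score >= min_score
        and any(start <= h.time < end for start, end in spans)
    ]


# Made-up metadata for a short recording with two speakers.
segments = [
    SpeakerSegment("A", 0.0, 12.5),
    SpeakerSegment("B", 12.5, 30.0),
    SpeakerSegment("A", 30.0, 41.0),
]
hits = [
    KeywordHit("budget", 5.2, 0.9),
    KeywordHit("budget", 18.7, 0.8),
    KeywordHit("budget", 33.1, 0.3),
    KeywordHit("schedule", 35.0, 0.95),
]

# "Where did speaker A say 'budget'?"
result = find_keyword_by_speaker(segments, hits, "A", "budget")
for hit in result:
    print(f"{hit.keyword} at {hit.time:.1f}s (score {hit.score:.2f})")
```

The sketch keeps the two kinds of metadata separate and combines them only at query time, so the same records could in principle serve other queries, such as all keywords within one speaker's turns.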