Speaker identification and verification using Gaussian mixture speaker models
Speech Communication
Automatic content-based retrieval of broadcast news
Proceedings of the third ACM international conference on Multimedia
Communications of the ACM
The LIMSI Broadcast News transcription system
Speech Communication - Special issue on automatic transcription of broadcast news data
Name-It: Naming and Detecting Faces in News Videos
IEEE MultiMedia
Collages as dynamic summaries for news video
Proceedings of the tenth ACM international conference on Multimedia
Multimodal concept-dependent active learning for image retrieval
Proceedings of the 12th annual ACM international conference on Multimedia
Automatic video annotation using ontologies extended with visual information
Proceedings of the 13th annual ACM international conference on Multimedia
Building a visual ontology for video retrieval
Proceedings of the 13th annual ACM international conference on Multimedia
Live sports event detection based on broadcast video and web-casting text
MULTIMEDIA '06 Proceedings of the 14th annual ACM international conference on Multimedia
VAST MM: multimedia browser for presentation video
Proceedings of the 6th ACM international conference on Image and video retrieval
IBM multimedia search and retrieval system
Proceedings of the 6th ACM international conference on Image and video retrieval
Evaluation of active learning strategies for video indexing
Image Communication
Query on demand video browsing
Proceedings of the 15th international conference on Multimedia
Towards the next plateau: innovative multimedia research beyond trecvid
Proceedings of the 15th international conference on Multimedia
Experiments in interactive video search by addition and subtraction
CIVR '08 Proceedings of the 2008 international conference on Content-based image and video retrieval
The ICSI RT07s Speaker Diarization System
Multimodal Technologies for Perception of Humans
Design of Multimodal Dissimilarity Spaces for Retrieval of Video Documents
IEEE Transactions on Pattern Analysis and Machine Intelligence
ICSC '08 Proceedings of the 2008 IEEE International Conference on Semantic Computing
Live speaker identification in conversations
MM '08 Proceedings of the 16th ACM international conference on Multimedia
Foundations and Trends in Information Retrieval
Visual speaker localization aided by acoustic models
MM '09 Proceedings of the 17th ACM international conference on Multimedia
Joke-o-mat: browsing sitcoms punchline by punchline
MM '09 Proceedings of the 17th ACM international conference on Multimedia
Annotation of heterogeneous multimedia content using automatic speech recognition
SAMT'07 Proceedings of the semantic and digital media technologies 2nd international conference on Semantic Multimedia
CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access
Joke-o-Mat HD: browsing sitcoms with human derived transcripts
Proceedings of the international conference on Multimedia
Narrative theme navigation for sitcoms supported by fan-generated scripts
Proceedings of the 3rd international workshop on Automated information extraction in media production
A fully automated content-based video search engine supporting spatiotemporal queries
IEEE Transactions on Circuits and Systems for Video Technology
This article provides the definitive description of the complete Joke-O-Mat system for navigating sitcoms. The system was presented briefly in Friedland et al. (2009), extended in Janin et al. (2010), and augmented with fan-generated scripts as described in Friedland et al. (2010). The extended system allows a user to browse a sitcom by scene, punchline, and dialog segment, and to filter these narrative themes by actor and by keyword. For example, the user can choose to watch only punchlines by the character "Kramer" that contain the word "armoire". The system infers the narrative themes and provides word-level search by automatically aligning the output of a speaker identification system and a speech recognizer to both closed captions and scripts generated by fans on the Internet. The segmentations produced by this system have proven to be indistinguishable from expert-generated segmentations, and they require significantly less time to produce. The article describes the original and the extended Joke-O-Mat system (http://www.icsi.berkeley.edu/jokeomat/), discusses problems with the use of fan-generated content, and presents results on episodes from the sitcom Seinfeld with regard to segmentation accuracy and overall user satisfaction, as determined by a human-subject study.
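The filtering behavior described above (narrative theme, then actor, then keyword) can be illustrated with a minimal sketch. This is not the Joke-O-Mat implementation: the `Segment` record, its field names, the `filter_segments` function, and the demo data are all hypothetical, and the sketch assumes that segments have already been labeled with a theme, a speaker, and word-level text by some upstream alignment step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    # Hypothetical data model; the article does not specify one.
    theme: str          # assumed labels: "scene", "punchline", "dialog"
    speaker: str        # character name attached to the segment
    start: float        # start time in seconds
    end: float          # end time in seconds
    words: List[str]    # word-level text aligned to the segment


def filter_segments(
    segments: List[Segment],
    theme: Optional[str] = None,
    speaker: Optional[str] = None,
    keyword: Optional[str] = None,
) -> List[Segment]:
    """Keep segments matching every criterion that is not None."""
    result = []
    for seg in segments:
        if theme is not None and seg.theme != theme:
            continue
        if speaker is not None and seg.speaker.lower() != speaker.lower():
            continue
        if keyword is not None:
            # Case-insensitive whole-word match against the aligned words.
            if keyword.lower() not in [w.lower() for w in seg.words]:
                continue
        result.append(seg)
    return result


# Made-up demo data for illustration only (not taken from any episode).
demo = [
    Segment("punchline", "Kramer", 10.0, 12.5, ["the", "armoire", "is", "gone"]),
    Segment("punchline", "Jerry", 13.0, 14.0, ["what", "armoire"]),
    Segment("dialog", "Kramer", 20.0, 25.0, ["I", "need", "an", "armoire"]),
    Segment("punchline", "Kramer", 30.0, 31.0, ["giddy", "up"]),
]

hits = filter_segments(demo, theme="punchline", speaker="Kramer", keyword="armoire")
for seg in hits:
    print(f"{seg.start:.1f}-{seg.end:.1f}s {seg.speaker}: {' '.join(seg.words)}")
```

Treating each criterion as optional means the same function covers browsing by theme alone, by actor alone, or by any combination of the three.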