Audio-Visual People Diarization (AVPD) is an original framework that simultaneously improves audio, video, and audiovisual diarization results. We first review the literature on people diarization for audio and for video content, including our own contributions, and discuss the limitations of existing approaches. We then describe a method that associates audio and video information through co-occurrence matrices, and we report experiments conducted on a corpus of TV news, TV debates, and movies. The results show the effectiveness of the overall diarization system and confirm the gains that audio information can bring to video indexing, and vice versa.
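
To make the idea of associating audio and video clusters through a co-occurrence matrix more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of the paper: the segment format `(start, end, label)`, the helper names `overlap`, `cooccurrence`, and `associate`, the greedy pairing rule, and the toy data are all hypothetical. The matrix simply accumulates, for every (speaker, face) pair, the time during which that speaker is heard while that face is on screen.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def cooccurrence(audio_segments, video_segments):
    """Accumulate shared time for every (speaker, face) pair.

    Both inputs are lists of (start, end, label) tuples (assumed format).
    """
    matrix = defaultdict(float)
    for a_start, a_end, speaker in audio_segments:
        for v_start, v_end, face in video_segments:
            shared = overlap(a_start, a_end, v_start, v_end)
            if shared > 0:
                matrix[(speaker, face)] += shared
    return dict(matrix)


def associate(matrix):
    """Greedy one-to-one pairing, largest co-occurrence first (assumed rule)."""
    pairs = {}
    used_faces = set()
    for (speaker, face), _ in sorted(matrix.items(), key=lambda kv: -kv[1]):
        if speaker not in pairs and face not in used_faces:
            pairs[speaker] = face
            used_faces.add(face)
    return pairs


# Toy diarization outputs (hypothetical data, times in seconds).
audio = [(0.0, 4.0, "spk1"), (4.0, 9.0, "spk2"), (9.0, 12.0, "spk1")]
video = [(0.0, 5.0, "faceA"), (5.0, 9.0, "faceB"), (9.0, 12.0, "faceA")]

matrix = cooccurrence(audio, video)
pairs = associate(matrix)
print(matrix)
print(pairs)
```

The greedy pairing is chosen only for brevity. A global assignment, for example with `scipy.optimize.linear_sum_assignment` on the negated matrix, would be a reasonable alternative when several speakers compete for the same face.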