Topic transition detection using hierarchical hidden Markov and semi-Markov models

Authors:
Dinh Q. Phung;T. V. Duong;S. Venkatesh;Hung H. Bui
Affiliations:
Curtin University of Technology, Perth, Western Australia;Curtin University of Technology, Perth, Western Australia;Curtin University of Technology, Perth, Western Australia;SRI International, Menlo Park, CA
Venue:
Proceedings of the 13th annual ACM international conference on Multimedia
Year:
2005

Abstract

In this paper we introduce a probabilistic framework that exploits hierarchy, structure sharing, and duration information for topic transition detection in videos. The framework combines a shot classification step with a detection phase that uses hierarchical probabilistic models. We consider two models: the extended Hierarchical Hidden Markov Model (HHMM) and the Coxian Switching Hidden semi-Markov Model (S-HSMM). Both allow the natural decomposition of semantics in videos, including shared structures, to be modeled directly, which enables efficient inference and reduces the sample complexity of learning. The S-HSMM additionally incorporates duration information, so the modeling of long-term dependencies in videos is enriched through both hierarchical and duration modeling. Furthermore, the use of the Coxian distribution in the S-HSMM makes it tractable to handle long video sequences. Our experiments on twelve educational and training videos show that both models outperform the baselines (flat HMM and HSMM) and the results reported in earlier work on topic detection. The superior performance of the S-HSMM over the HHMM supports our belief that duration information is an important factor in video content modeling.
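
The abstract does not define the Coxian duration model, so the following is only an illustrative sketch under a common parameterization of the discrete Coxian distribution, not the authors' implementation. It assumes a duration starts in phase i with probability mu[i] and then passes through phases i, i+1, ..., M in order, spending a geometrically distributed number of steps (exit probability lam[j]) in each. The names `sample_geometric`, `sample_coxian_duration`, and `coxian_mean`, and the parameter values in the usage note, are hypothetical. Under this assumption a duration model needs only 2M parameters regardless of how long a segment can last, which is one plausible reading of why the abstract calls long sequences tractable.

```python
import random


def sample_geometric(lam, rng):
    """Steps until the first success (support 1, 2, ...), success probability lam."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must be in (0, 1]")
    steps = 1
    while rng.random() >= lam:  # failure: stay in the phase one more step
        steps += 1
    return steps


def sample_coxian_duration(mu, lam, rng):
    """Draw one duration from an assumed discrete Coxian with M = len(lam) phases.

    mu[i]  : probability that the duration starts in phase i (should sum to 1)
    lam[i] : per-step probability of leaving phase i
    """
    if len(mu) != len(lam):
        raise ValueError("mu and lam must have the same length")
    start = rng.choices(range(len(mu)), weights=mu, k=1)[0]
    # Total duration is the time spent in the start phase and every later phase.
    return sum(sample_geometric(l, rng) for l in lam[start:])


def coxian_mean(mu, lam):
    """Closed-form mean: sum_i mu[i] * sum_{j >= i} 1 / lam[j]."""
    return sum(m * sum(1.0 / l for l in lam[i:]) for i, m in enumerate(mu))
```

For instance, with the made-up parameters `mu = [0.5, 0.5]` and `lam = [0.5, 0.25]`, `coxian_mean(mu, lam)` evaluates to 0.5 * (2 + 4) + 0.5 * 4 = 5.0, and the average of many calls to `sample_coxian_duration(mu, lam, random.Random(0))` should be close to that value.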