Semantic video annotation by mining association patterns from visual and speech features
PAKDD'08: Proceedings of the 12th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
In multimedia processing, many studies have sought to narrow the gap between multimedia content and human perception. Multimedia understanding remains a difficult and challenging task even with machine-learning techniques. To address this challenge, we propose a method that employs data mining techniques together with the content-based paradigm to conceptualize videos. Our method focuses on (1) constructing two prediction models, a speech-association model ModelSass and a visual-statistical model ModelCRM, and (2) fusing these models to annotate unknown videos automatically. Without additional manual effort, the discovered speech-association patterns reveal implicit relationships among sequential images, while visual features compensate for the inadequacy of the speech-association patterns alone. Empirical evaluations show that, on average, our approach yields more promising results than other methods for annotating videos.
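The abstract does not specify how the two models' outputs are combined; a minimal sketch of one plausible late-fusion scheme, assuming each model produces per-concept confidence scores (function name, weight `alpha`, and example concepts are all illustrative, not from the paper):

```python
# Hypothetical late fusion of per-concept scores from a speech-association
# model and a visual-statistical model. The paper's actual fusion rule is
# not given in the abstract; this is an illustrative weighted-sum sketch.

def fuse_scores(speech_scores, visual_scores, alpha=0.5):
    """Combine two score dictionaries {concept: confidence} into one
    ranked concept list; `alpha` weights the speech-association model."""
    concepts = set(speech_scores) | set(visual_scores)
    fused = {
        c: alpha * speech_scores.get(c, 0.0)
           + (1 - alpha) * visual_scores.get(c, 0.0)
        for c in concepts
    }
    # Annotate the video with concepts ranked by fused confidence.
    return sorted(fused, key=fused.get, reverse=True)

speech = {"beach": 0.8, "crowd": 0.3}   # e.g. from association patterns
visual = {"beach": 0.6, "sky": 0.7}     # e.g. from visual features
print(fuse_scores(speech, visual)[:2])  # → ['beach', 'sky']
```

A weighted sum is only one option; rank-based or confidence-calibrated fusion would fit the same interface.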