Mining spatio-temporal patterns and knowledge structures in multimedia collection

  • Authors:
  • Shih-Fu Chang

  • Affiliations:
  • Columbia University

  • Venue:
  • MMDB '03 Proceedings of the 1st ACM international workshop on Multimedia databases
  • Year:
  • 2003

Abstract

Detection and recognition of semantic events has been a major research challenge for multimedia indexing. An emerging direction in this field is the unsupervised discovery (mining) of patterns in spatio-temporal multimedia data. Patterns are recurrent, predictable occurrences of one or more entities that satisfy associative, statistical, or relational conditions. Patterns at the feature level may signify the occurrence of events (e.g., passing pedestrians). At the event level, patterns may represent multi-event transitions, e.g., play-break alternations in sports. Patterns in an annotated image collection may indicate collocations of related semantic concepts and perceptual clusters.

Mining patterns of different types at different levels offers rich benefits, including automatic discovery of salient events in a new domain, automatic alert generation from massive real-time data (such as surveillance data in a new environment), and discovery of novel event relationships.

Many challenging issues emerge. What are adequate representations and statistical models for patterns that may exist at different levels and different time scales? How do we handle patterns that occur relatively sparsely? How do we evaluate the accuracy and quality of mining results, given the unsupervised nature of the task?

In this talk, we will present results of our preliminary attempts at mining patterns in structured video sequences (such as sports and surveillance video) and large annotated image collections. Specifically, we will discuss the potential of statistical models such as the Hierarchical HMM for video mining, and the integrative exploration of electronic knowledge (such as WordNet) and statistical clustering for image knowledge mining.
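To make the idea of event-level patterns such as play-break alternations concrete, the following is a minimal sketch, not the method presented in the talk. It uses a flat two-state HMM rather than a Hierarchical HMM, and it decodes with hand-picked parameters instead of learning them without supervision. The state names, the "high"/"low" motion symbols, and every probability below are assumptions chosen purely for illustration.

```python
import math

STATES = ("play", "break")

# Hypothetical, hand-picked parameters (assumptions, not values from the talk).
START = {"play": 0.5, "break": 0.5}
TRANS = {
    "play": {"play": 0.9, "break": 0.1},
    "break": {"play": 0.1, "break": 0.9},
}
# Observation symbols: an assumed coarse motion level per video shot.
EMIT = {
    "play": {"high": 0.8, "low": 0.2},
    "break": {"high": 0.2, "low": 0.8},
}


def viterbi(observations):
    """Return the most likely state sequence for a list of observation symbols."""
    if not observations:
        return []
    # score[s] is the log-probability of the best path that ends in state s.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor of state s at this time step.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][obs])
            )
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse a state path into (state, run_length) pairs."""
    runs = []
    for s in path:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)
        else:
            runs.append((s, 1))
    return runs
```

Calling `segments(viterbi(symbols))` on a sequence of per-shot symbols yields alternating play and break runs. Because the assumed self-transition probabilities are high, an isolated outlier symbol is absorbed into the surrounding segment rather than producing a spurious transition. In an unsupervised setting the parameters would instead be estimated from unlabeled data, for example with the Baum-Welch algorithm.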