From multimedia data to situation detection

  • Authors:
  • Vivek K. Singh

  • Affiliations:
  • University of California, Irvine, Irvine, CA, USA

  • Venue:
  • MM '11 Proceedings of the 19th ACM international conference on Multimedia
  • Year:
  • 2011

Abstract

We are witnessing a phenomenal increase in multimodal human and device sensing to measure and report parameters such as temperature, vehicle speed, visual experiences, flu cases, and people's happiness. Soon we expect these heterogeneous datasets (e.g. images, videos, weather sensors, check-ins, and tweets) to become available in real time in the Cloud for reasoning and decision making. Important human decisions, however, cannot be made by treating these individual data points piecemeal. Rather, we need computational tools to integrate and abstract these data points into higher-level actionable representations. This underscores the need for computational tools to model and detect situations from large heterogeneous spatio-temporal data sets.

In this thesis, we computationally define the notion of situations and propose a methodology to bridge the semantic gap between the widely available low-level spatio-temporal data and the actionable situation inference needed for decision making. We define a situation as: "An actionable abstraction of observed spatio-temporal descriptors". This definition underscores our viewpoint of computationally defining situations based on statistical descriptors (as opposed to, say, situation calculus or recognition-by-parts), a focus on spatio-temporal data (which is indeed the most common connotation associated with situations), a scoping of the problem to observable data only (via human/device sensors), and a focus on actionable abstractions (as defined explicitly by human domain experts). The problem of modeling and detecting situations from spatio-temporal-thematic (STT) data is relevant in multiple domains such as traffic, weather, healthcare, business analysis, emergency response, and political decision making.

Situation Modeling

STT data spreads across very disparate application domains as well as data types. However, focusing on the commonalities rather than the differences, we realize that there is a core set of operations that is central to defining spatio-temporal situations across different applications. Once a domain expert defines a situation of interest (e.g. a 'flu pandemic' or a 'hurricane advisory') based on data sources, core operations, and user parameters, the same situation model can act as a standing query on real-time data streams and provide 'mass-personalization' to billions of end users. Just like E/R modeling or UML, we merely provide the basic building blocks. It is each domain expert's responsibility to define actionable situations by combining these building blocks. These building blocks are designed to be computable, modular, and explicit, and hence translatable into executable code once the modeling is complete.

Approach

Our approach for integrating and characterizing heterogeneous spatio-temporal data is based on the concept of social pixels. We simply organize spatio-temporal values related to any theme on a two-dimensional data grid. Such a grid provides an intuitive, heat-map-like visualization and also an image-like computational data structure. Hence, multiple spatio-temporal situational descriptors can be implemented as off-the-shelf image and video processing operations (see Fig. 1).

Current status

We have made progress in identifying the generic set of situation detection operations [1]. We have run multiple experiments with STT data sets to answer situational queries like 'what recommendation to give to a user indicating flu-like symptoms' [2] and 'where to open a new iPhone store' [1]. We are currently implementing the core STT analysis engine, which will allow modeling and detection of multiple situation queries across applications. We are also finalizing a methodology to guide domain experts when they model situations in terms of building blocks like data sources, characterization operators, and user parameters.
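As a rough illustration of the social-pixel idea described in the abstract, the following Python sketch bins geotagged observations for one theme onto a two-dimensional grid and then applies two image-style operations (a 3x3 mean filter and a threshold) to flag candidate hotspot cells. It is only a sketch under stated assumptions: the function names, grid bounds, grid resolution, threshold, and sample coordinates are all hypothetical and do not come from the thesis, and NumPy is assumed to be available.

```python
import numpy as np

def to_social_pixels(lats, lons, values, bounds, shape):
    """Aggregate point observations into a 2D grid (one 'frame' of social pixels).

    bounds = (lat_min, lat_max, lon_min, lon_max); shape = (n_lat_bins, n_lon_bins).
    Rows index latitude bins, columns index longitude bins.
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    grid, _, _ = np.histogram2d(
        lats, lons,
        bins=shape,
        range=[[lat_min, lat_max], [lon_min, lon_max]],
        weights=values,  # sum the observed values that fall in each cell
    )
    return grid

def box_smooth(grid):
    """3x3 mean filter with zero padding, a basic image-processing operation."""
    padded = np.pad(grid, 1, mode="constant")
    rows, cols = grid.shape
    total = np.zeros((rows, cols), dtype=float)
    for dr in range(3):
        for dc in range(3):
            total += padded[dr:dr + rows, dc:dc + cols]
    return total / 9.0

def classify(grid, threshold):
    """Boolean mask of cells whose value meets a (hypothetical) expert-chosen threshold."""
    return grid >= threshold

# Hypothetical sample: four reports for one theme, each with weight 1.
lats = np.array([33.65, 33.66, 33.64, 33.15])
lons = np.array([-117.84, -117.85, -117.83, -117.25])
values = np.ones(4)

grid = to_social_pixels(lats, lons, values,
                        bounds=(33.0, 34.0, -118.0, -117.0), shape=(10, 10))
smoothed = box_smooth(grid)           # spreads each cell's value over its neighbors
hotspots = classify(grid, threshold=2)  # cells with at least two reports
```

Because the grid is an ordinary array, other descriptors (temporal differencing between frames, segmentation, convolution with a custom kernel) could be expressed the same way; which operations and thresholds are meaningful is left to the domain expert.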