Natural Language Description of Human Activities from Video Images Based on Concept Hierarchy of Actions

  • Authors:
  • Atsuhiro Kojima; Takeshi Tamura; Kunio Fukunaga

  • Affiliations:
  • Library and Science Information Center, Osaka Prefecture University, 1-1 Gakuen-cho, Sakai, Osaka 599-8531, Japan. ark@center.osakafu-u.ac.jp
  • Library and Science Information Center, Osaka Prefecture University, 1-1 Gakuen-cho, Sakai, Osaka 599-8531, Japan
  • Graduate School of Engineering, Osaka Prefecture University, 1-1 Gakuen-cho, Sakai, Osaka 599-8531, Japan

  • Venue:
  • International Journal of Computer Vision
  • Year:
  • 2002

Abstract

We propose a method for describing human activities from video images based on concept hierarchies of actions. The major difficulty in transforming video images into textual descriptions is bridging the semantic gap between them, also known as the inverse Hollywood problem. In general, the concepts of human events or actions can be classified by semantic primitives. By associating these concepts with semantic features extracted from video images, appropriate syntactic components such as verbs and objects are determined and then translated into natural language sentences. We also demonstrate the performance of the proposed method through several experiments.
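To illustrate the general idea of matching extracted semantic features against a concept hierarchy of actions and then filling a simple case frame, the sketch below shows one possible toy implementation. It is not the authors' system: the hierarchy, the feature keys (speed, approaching, releases_object, agent), and all class and function names are hypothetical, and the sentence generation step is reduced to a single template.

```python
# Minimal sketch (assumed, not the paper's implementation): pick the most
# specific action concept whose condition matches the observed features,
# then fill a simple agent-verb-goal case frame to produce a sentence.

from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ActionConcept:
    """A node in a (toy) concept hierarchy of actions."""
    verb: str                               # surface verb for this concept
    condition: Callable[[Dict], bool]       # test on extracted semantic features
    children: List["ActionConcept"] = field(default_factory=list)


def most_specific_concept(node: ActionConcept, features: Dict) -> Optional[ActionConcept]:
    """Descend the hierarchy and return the deepest concept whose condition holds."""
    if not node.condition(features):
        return None
    for child in node.children:
        match = most_specific_concept(child, features)
        if match is not None:
            return match
    return node


# Toy hierarchy: the generic concept "moves" specializes into two actions.
hierarchy = ActionConcept(
    verb="moves",
    condition=lambda f: f.get("speed", 0) > 0,
    children=[
        ActionConcept(verb="walks toward",
                      condition=lambda f: f.get("approaching") is not None),
        ActionConcept(verb="puts down",
                      condition=lambda f: f.get("releases_object") is not None),
    ],
)

# Hypothetical features extracted from a video sequence.
features = {"speed": 0.8, "approaching": "the table", "agent": "the person"}

concept = most_specific_concept(hierarchy, features)
if concept is not None:
    goal = features.get("approaching") or features.get("releases_object") or ""
    print(f"{features['agent']} {concept.verb} {goal}.".strip())
    # -> "the person walks toward the table."
```

In this sketch the hierarchy encodes specialization (a more specific verb is chosen whenever its condition on the features is satisfied), which mirrors the abstract's point that semantic primitives classify action concepts and that the matched concept determines the verb and objects of the generated sentence.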