Spatiotemporal segmentation and tracking of objects for visualization of videoconference image sequences

  • Authors:
  • I. Kompatsiaris; M. Gerassimos Strintzis

  • Affiliations:
  • Inf. Processing Lab., Aristotelian Univ. of Thessaloniki

  • Venue:
  • IEEE Transactions on Circuits and Systems for Video Technology
  • Year:
  • 2000

Abstract

A procedure is described for the segmentation, content-based coding, and visualization of videoconference image sequences. First, image sequence analysis is used to estimate the shape and motion parameters of the person facing the camera. A spatiotemporal filter, which takes into account the intensity differences between consecutive frames, is applied in order to separate the moving person from the static background. The foreground is then segmented into a number of regions in order to identify the face. For this purpose, we propose a novel K-means with connectivity constraint (KMCC) algorithm as a general segmentation method combining several types of information, including intensity, motion, and compactness. In this algorithm, the use of spatiotemporal regions is introduced: a number of frames are analyzed simultaneously so that the same region is tracked across consecutive frames. Based on this information, a 3-D ellipsoid is adapted to the person's face using an efficient and robust algorithm. The rigid 3-D motion is then estimated using a least median of squares approach. Finally, a virtual reality modeling language (VRML) file is created containing all the above information; this file may be viewed using any VRML 2.0 compliant browser.
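
The abstract's central segmentation step can be pictured with the minimal sketch below: K-means clustering over per-pixel intensity, motion, and (weighted) spatial coordinates, followed by a connectivity pass that splits spatially disconnected fragments of a cluster into separate regions. The feature weighting, the fragment-splitting strategy, and the function name `kmcc_segment` are illustrative assumptions for a single frame pair, not the authors' published KMCC algorithm, which operates on spatiotemporal regions across several frames.

```python
# Sketch of K-means clustering with a connectivity constraint (not the
# authors' implementation). Pixels are clustered on intensity, motion
# magnitude, and spatial position (the spatial terms encourage compact
# regions); after each assignment step, disconnected fragments of a
# cluster are split into new regions so every region is connected.
import numpy as np
from scipy import ndimage

def kmcc_segment(intensity, motion, k=4, n_iter=10, spatial_weight=0.5):
    h, w = intensity.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Feature vector per pixel: [intensity, motion, weighted y, weighted x]
    feats = np.stack([intensity.ravel(),
                      motion.ravel(),
                      spatial_weight * ys.ravel(),
                      spatial_weight * xs.ravel()], axis=1).astype(float)
    # Initialise centres from randomly chosen pixels
    rng = np.random.default_rng(0)
    centres = feats[rng.choice(h * w, size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest centre in feature space
        d = np.linalg.norm(feats[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1).reshape(h, w)
        # Connectivity constraint: split spatially disconnected parts of
        # each cluster into separate regions before recomputing centres
        regions = np.zeros_like(labels)
        next_id = 0
        for c in range(centres.shape[0]):
            comp, n = ndimage.label(labels == c)
            for i in range(1, n + 1):
                regions[comp == i] = next_id
                next_id += 1
        # Update step: one centre per connected region
        centres = np.array([feats[(regions == r).ravel()].mean(axis=0)
                            for r in range(next_id)])
    return regions
```

In this sketch the `motion` input would come from the intensity differences between consecutive frames mentioned in the abstract, and the largest moving, compact region near the top of the foreground would then be taken as the face candidate for the subsequent 3-D ellipsoid fitting.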