FixTag: An algorithm for identifying and tagging fixations to simplify the analysis of data collected by portable eye trackers

  • Authors:
  • Susan M. Munn, Jeff B. Pelz

  • Affiliations:
  • Rochester Institute of Technology, Rochester, NY (both authors)

  • Venue:
  • ACM Transactions on Applied Perception (TAP)
  • Year:
  • 2009

Abstract

Video-based eye trackers produce an output video showing where a subject is looking, the subject's Point-of-Regard (POR), for each frame of a video of the scene. This information can be extremely valuable, but its analysis can be overwhelming. Analysis of data from portable (wearable) eye trackers is especially daunting because the scene video may be constantly changing, which makes automatic analysis more difficult. A common way to begin analysis of POR data is to group the data into fixations. In a previous article, we compared fixations identified automatically by an algorithm (i.e., with start and end marked) to those identified manually by users (manual coders). Here, we extend this automatic identification of fixations to tagging each fixation with a Region-of-Interest (ROI). Our fixation-tagging algorithm, FixTag, requires the relative 3D positions of the ROI vertices and calibration of the scene camera. Fixation tagging is performed by first calculating camera projection matrices for keyframes of the scene video (captured by the eye tracker) via an iterative structure-and-motion recovery algorithm. These matrices are then used to project the 3D ROI vertices into the keyframes. The POR of each fixation is matched to a point in the closest keyframe, which is then checked against the projected 2D ROI vertices for tagging. Our fixation tags were compared to those produced by three manual coders who tagged the automatically identified fixations for two different scenarios. For each scenario, eight ROIs were defined along with the 3D positions of eight calibration points, so 17 tags were available for each fixation: 8 for ROIs, 8 for calibration points, and 1 for "other." In the first scenario, a subject was tracked while looking through products on four store shelves, resulting in 182 automatically identified fixations. Our automatic tagging algorithm produced tags that matched those of at least one manual coder for 181 of the 182 fixations (99.5% agreement). In the second scenario, a subject was tracked while looking at two posters on adjoining walls of a room. Our algorithm matched at least one manual coder's tag for 169 of 172 automatically identified fixations (98.3% agreement).
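The projection-and-tagging step described in the abstract can be illustrated with a minimal sketch. It assumes a 3x4 camera projection matrix has already been recovered for the keyframe closest to a fixation (the iterative structure-and-motion recovery itself is not shown) and that each ROI is given as an ordered set of 3D vertices. The function names (project_points, point_in_polygon, tag_fixation) and the example data are illustrative, not from the paper.

```python
import numpy as np

def project_points(P, X_world):
    """Project Nx3 world points into the image using a 3x4 projection matrix P."""
    X_h = np.hstack([X_world, np.ones((X_world.shape[0], 1))])  # homogeneous world coords
    x_h = (P @ X_h.T).T                                          # Nx3 homogeneous image coords
    return x_h[:, :2] / x_h[:, 2:3]                              # divide by w -> Nx2 pixel coords

def point_in_polygon(p, poly):
    """Ray-casting test: is 2D point p inside the polygon given by its ordered vertices?"""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge crosses the horizontal ray at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def tag_fixation(por_xy, P_keyframe, rois):
    """Tag a fixation's POR (already matched into the closest keyframe) with an ROI.

    rois maps tag name -> Nx3 array of 3D ROI vertices (ordered around the region).
    Returns the first matching tag, or "other" if the POR lies in no projected ROI.
    """
    for tag, vertices_3d in rois.items():
        poly_2d = project_points(P_keyframe, vertices_3d)
        if point_in_polygon(por_xy, poly_2d):
            return tag
    return "other"

# Hypothetical usage: one planar ROI (e.g., a poster) and a placeholder projection matrix.
P = np.hstack([np.eye(3), np.zeros((3, 1))])
rois = {"poster_1": np.array([[-1.0, -1.0, 5.0],
                              [ 1.0, -1.0, 5.0],
                              [ 1.0,  1.0, 5.0],
                              [-1.0,  1.0, 5.0]])}
print(tag_fixation((0.05, 0.02), P, rois))  # -> "poster_1"
```

In this sketch the tag falls back to "other" when the POR lies outside every projected ROI, mirroring the paper's 17-tag scheme (8 ROIs, 8 calibration points, plus "other"); calibration points would be handled analogously as additional tagged regions.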