Voting experts: An unsupervised algorithm for segmenting sequences

  • Authors:
  • Paul Cohen; Niall Adams; Brent Heeringa

  • Affiliations:
  • Information Sciences Institute, University of Southern California, CA, USA; Department of Mathematics, Imperial College London, UK; Department of Computer Science, Williams College, MA, USA

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2007

Abstract

We describe a statistical signature of chunks and an algorithm for finding them. While there is no formal definition of chunks, they may be reliably identified as configurations with low internal entropy (unpredictability) and high entropy at their boundaries. We show that the log frequency of a chunk is a measure of its internal entropy. The Voting-Experts algorithm exploits this signature to find word boundaries in text from four languages and episode boundaries in the activities of a mobile robot.
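
The chunk signature described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy corpus, the fixed n-gram length of 2, and the function names `successor_counts` and `boundary_entropy` are all assumptions made for illustration. It only measures the entropy of the symbol that follows an n-gram, which should be low inside a chunk and higher at a chunk's end.

```python
import math
from collections import Counter, defaultdict


def successor_counts(text, n):
    """Map each n-gram in `text` to a Counter of the symbols that follow it."""
    followers = defaultdict(Counter)
    for i in range(len(text) - n):
        followers[text[i:i + n]][text[i + n]] += 1
    return followers


def boundary_entropy(followers, gram):
    """Shannon entropy, in bits, of the next-symbol distribution after `gram`."""
    counts = followers.get(gram, Counter())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum of p * log2(1/p) over the observed successor symbols.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical toy corpus: words run together with no separators.
corpus = "thecatthedogthecowthedogthecat"
followers = successor_counts(corpus, 2)

# "th" sits inside the chunk "the": the next symbol is always "e".
inside = boundary_entropy(followers, "th")
# "he" ends the chunk "the": the next symbol varies ("c" or "d").
at_end = boundary_entropy(followers, "he")

print(f"entropy after 'th': {inside:.3f} bits")
print(f"entropy after 'he': {at_end:.3f} bits")
```

In this toy corpus the entropy after "th" is zero while the entropy after "he" is positive, matching the low-inside, high-at-boundary pattern. A full segmenter would still need a rule for turning such scores into boundary decisions, which this sketch does not attempt.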