Efficient bitmap-based indexing of time-based interval sequences

  • Authors:
  • Jong-won Roh; Seung-won Hwang; Byoung-Kee Yi

  • Affiliations:
  • Department of Computer Science and Engineering, Pohang University of Science and Technology, Republic of Korea; Department of Computer Science and Engineering, Pohang University of Science and Technology, Republic of Korea; Department of Medical Informatics, Samsung Seoul Hospital, Republic of Korea

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2012

Quantified Score

Hi-index 0.07

Abstract

In this paper, we discuss similarity search over time series data represented as interval sequences. For instance, a time series of phone call records can be represented as a time-based interval sequence, or T-interval sequence, which consists of the start and end times of the calls. To support efficient similarity search over such sequences, we identify the desirable semantics of similarity measures for T-interval sequences, show how existing measures fail to satisfy them, and propose a new measure that satisfies all of them. We then propose approximate encoding methods for T-interval sequences. More specifically, we propose two bitmap-based feature extraction methods: (1) a bin-bitmap encoding method that transforms T-interval sequences into bitmaps of fixed length, and (2) a segmented feature extraction method that takes the longest bitmap sequences of consecutive '1' elements. Finally, we propose two query processing schemes that use these bitmap-based approximate representations. We validate the efficiency and effectiveness of the proposed solutions empirically.
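The abstract does not spell out the encoding, so the following is only a minimal sketch of the general idea behind the two feature extraction methods, not the authors' algorithm. It assumes that the time axis is split into equal-width bins, that a bin is set to 1 when any interval overlaps it, and that intervals are half-open. It also extracts only the single longest run of 1s, whereas the abstract speaks of sequences in the plural. The names `bin_bitmap` and `longest_run` and all parameters are hypothetical.

```python
from typing import List, Tuple

# An interval is (start, end) with start < end, treated as half-open [start, end).
Interval = Tuple[float, float]


def bin_bitmap(intervals: List[Interval], t_min: float, t_max: float,
               n_bins: int) -> List[int]:
    """Encode a T-interval sequence as a fixed-length bitmap.

    Assumed rule (not taken from the paper): bin i is 1 when at least
    one interval overlaps the half-open bin range [lo, hi).
    """
    width = (t_max - t_min) / n_bins
    bits = [0] * n_bins
    for i in range(n_bins):
        lo = t_min + i * width
        hi = lo + width
        # Two half-open ranges overlap when each starts before the other ends.
        if any(start < hi and end > lo for start, end in intervals):
            bits[i] = 1
    return bits


def longest_run(bits: List[int]) -> Tuple[int, int]:
    """Return (start_index, length) of the longest run of consecutive 1s.

    Returns (0, 0) when the bitmap contains no 1s.
    """
    best_start, best_len = 0, 0
    cur_start, cur_len = 0, 0
    for i, bit in enumerate(bits):
        if bit == 1:
            if cur_len == 0:
                cur_start = i  # a new run begins here
            cur_len += 1
            if cur_len > best_len:
                best_start, best_len = cur_start, cur_len
        else:
            cur_len = 0  # the run is broken by a 0
    return best_start, best_len


# Two hypothetical call records over a 60-unit window, encoded into 6 bins.
calls = [(0, 10), (25, 50)]
bitmap = bin_bitmap(calls, t_min=0, t_max=60, n_bins=6)
run = longest_run(bitmap)
```

In this sketch the bitmap length depends only on `n_bins`, so sequences with different numbers of intervals become directly comparable, at the cost of losing precision inside each bin.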