An efficient subsequence matching method based on index interpolation

  • Authors:
  • Hyun-Gil Koh;Woong-Kee Loh;Sang-Wook Kim

  • Affiliations:
  • Department of Information and Communication Engineering, Kangwon National University, Korea;Department of Computer Science, Korea Advanced Institute of Science and Technology, Korea;College of Information and Communications, Hanyang University, Korea

  • Venue:
  • IEA/AIE'2005 Proceedings of the 18th international conference on Innovations in Applied Artificial Intelligence
  • Year:
  • 2005

Abstract

Subsequence matching is one of the most important problems in data mining. Existing subsequence matching algorithms construct a single index using windows of one fixed size. Their performance degrades as the difference between the query sequence length and the window size increases. In this paper, we propose a new subsequence matching method based on index interpolation, a technique that constructs indexes for multiple window sizes and, for each query sequence, chooses the most appropriate index. We first examine how performance changes with the window size through preliminary experiments, and devise a cost function for subsequence matching that reflects the distribution of query sequence lengths from the viewpoint of physical database design. Next, we propose a new subsequence matching method that improves search performance, and present an algorithm that uses the cost function to construct the multiple indexes so as to maximize performance. Finally, we verify the superiority of the proposed method through a series of experiments using real and synthetic data sequences.
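The idea of keeping several indexes and picking one per query can be illustrated with a small sketch. The abstract does not state the paper's cost function or selection rule, so everything below is an assumption made for illustration: the function names are hypothetical, the selection rule is "largest indexed window size that does not exceed the query length", and the cost is a simple proxy (the frequency-weighted gap between query length and chosen window size) standing in for the real cost function.

```python
from bisect import bisect_right
from itertools import combinations


def choose_window(window_sizes, query_len):
    """Return the largest indexed window size <= query_len, or None.

    Assumed selection rule, not the paper's: a smaller gap between query
    length and window size is taken to mean better search performance.
    """
    sizes = sorted(window_sizes)
    pos = bisect_right(sizes, query_len)
    return sizes[pos - 1] if pos else None


def expected_gap(window_sizes, length_freq):
    """Proxy cost: frequency-weighted gap between query length and window.

    length_freq maps a query length to its relative frequency. The result
    is infinite if some query length is shorter than every window size.
    """
    total = 0.0
    for query_len, freq in length_freq.items():
        window = choose_window(window_sizes, query_len)
        if window is None:
            return float("inf")
        total += freq * (query_len - window)
    return total


def pick_window_sizes(candidates, length_freq, k):
    """Exhaustively pick the k candidate window sizes with the lowest proxy cost."""
    return min(
        combinations(sorted(candidates), k),
        key=lambda sizes: expected_gap(sizes, length_freq),
    )


if __name__ == "__main__":
    # Hypothetical query-length distribution and candidate window sizes.
    freq = {64: 0.2, 256: 0.5, 600: 0.3}
    chosen = pick_window_sizes([64, 128, 256, 512], freq, k=2)
    print(chosen, choose_window(chosen, 300))
```

The exhaustive search over combinations is only practical for a handful of candidates; it is used here to keep the sketch short, not because the paper's construction algorithm works this way.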