Trying to outperform a well-known index with a sequential scan

  • Authors: Jan Hentschel; Thomas Meyer; Thomas Rommel

  • Affiliations: Otto-von-Guericke-University, Magdeburg, Germany (all authors)

  • Venue: Proceedings of the Joint EDBT/ICDT 2013 Workshops
  • Year: 2013

Abstract

String similarity search is an important research area. It enables applications to tolerate input errors and to detect similarities between strings. At its core is the string similarity search problem. The time needed to solve this problem depends on the number of strings to search, their length, and the size of their alphabet. The data can be divided into natural-language and non-natural-language data. For natural-language data, this paper analyzes a set of city names from all over the world; for non-natural-language data, it uses reads from the human genome. The paper investigates whether a sequential search algorithm can outperform an index-based search. The evaluation shows that the index-based search performs better on the human genome reads, but not on the geographical names.
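
To make the idea of a sequential search concrete, the following is a minimal sketch and not the authors' implementation. It assumes Levenshtein edit distance as the similarity measure and a threshold k, neither of which is stated in the abstract; the names edit_distance and sequential_search are hypothetical. Every string is compared against the query in turn, without any index.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row (assumed measure)."""
    # Keep the shorter string in the inner loop to save memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def sequential_search(query: str, data: Iterable[str], k: int) -> List[str]:
    """Scan all strings and keep those within edit distance k of the query."""
    matches = []
    for candidate in data:
        # Length filter: strings whose lengths differ by more than k
        # cannot be within distance k, so the comparison is skipped.
        if abs(len(candidate) - len(query)) > k:
            continue
        if edit_distance(query, candidate) <= k:
            matches.append(candidate)
    return matches


cities = ["Magdeburg", "Magdeburk", "Marburg", "Hamburg"]
print(sequential_search("Magdeburg", cities, 1))
```

The cost of such a scan grows with the number of strings and with their length, which matches the factors named in the abstract. The length filter is one simple way to avoid some comparisons; it is an illustrative choice here, not a claim about the techniques evaluated in the paper.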