String similarity search is an important research area. It enables applications to tolerate input errors and to detect similarities between strings. The time needed to solve the underlying string similarity search problem depends on the number of strings to search, their length, and the size of their alphabet. The data can be divided into natural-language and non-natural-language data. As natural-language data, this paper analyzes a set of city names from all over the world; as non-natural-language data, it uses reads from the human genome. The paper investigates whether a sequential search algorithm can outperform an index-based search. The evaluation shows that the index-based search performs better on the human genome reads, but not on the geographical names.
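To make the notion of a sequential search concrete, the following is a minimal sketch and not the paper's implementation. It assumes that similarity is measured by Levenshtein (edit) distance with a threshold k, which the abstract does not state; the function names `edit_distance_within` and `sequential_search` and the sample data are hypothetical.

```python
from typing import Iterable, List


def edit_distance_within(a: str, b: str, k: int) -> bool:
    """Return True if the Levenshtein distance between a and b is at most k.

    Assumption: edit distance is the similarity measure (not stated in the abstract).
    """
    # A length gap larger than k already rules the pair out.
    if abs(len(a) - len(b)) > k:
        return False
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        # Row minima never decrease, so the final distance must exceed k as well.
        if min(current) > k:
            return False
        previous = current
    return previous[-1] <= k


def sequential_search(query: str, data: Iterable[str], k: int) -> List[str]:
    """Scan every string in data and keep those within edit distance k of query."""
    return [s for s in data if edit_distance_within(query, s, k)]


if __name__ == "__main__":
    cities = ["Berlin", "Bern", "Dublin", "Merlin", "Paris"]  # illustrative data only
    print(sequential_search("Berlin", cities, 1))
```

The scan touches every string once, so its cost grows with the number of strings and their length, which are the factors named in the abstract. An index-based search would instead try to avoid comparing the query against most of the data.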