String similarity search is the problem of finding the strings in a given database that are similar to a query string. The problem has many applications across computer engineering, such as spelling correction, spam filtering, and information retrieval. Among the various solutions to this problem, we focus on the distance-space transform, which maps strings into k-dimensional vectors whose components are the distances from preselected reference objects (called pivots). The resulting vectors can then be indexed with well-known multidimensional spatial data structures such as kD-trees and R*-trees. In this paper, we generalize the distance-space transform into a filtering framework. Based on this framework, we also present an alignment-space transform as an extension of the distance-space transform. Through experiments, we demonstrate the search performance of the proposed method with respect to a variety of search range parameters and pivot selection strategies.
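As an illustration of the pivot-based mapping described in the abstract, the following minimal Python sketch maps each string to a vector of distances from the pivots and uses those vectors to filter candidates before verification. It is not the paper's implementation. It assumes Levenshtein edit distance as the string distance, and it replaces the kD-tree or R*-tree with a plain linear scan over the vectors to stay short. The function names (`edit_distance`, `to_vector`, `build_index`, `range_search`) and the sample data are hypothetical. The filter relies only on the triangle inequality: for any pivot p, |d(q, p) - d(s, p)| <= d(q, s), so a string whose vector differs from the query vector by more than the search radius in any component cannot be a match.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def to_vector(s: str, pivots: list[str]) -> tuple[int, ...]:
    """Map a string to its k-dimensional vector of distances from the pivots."""
    return tuple(edit_distance(s, p) for p in pivots)


def build_index(database: list[str], pivots: list[str]):
    """Precompute the vector of every database string (assumption: a plain
    list stands in for a spatial index such as a kD-tree)."""
    return [(s, to_vector(s, pivots)) for s in database]


def range_search(query: str, radius: int, index, pivots: list[str]) -> list[str]:
    """Return all indexed strings within `radius` edits of `query`."""
    qv = to_vector(query, pivots)
    results = []
    for s, sv in index:
        # Filter step: by the triangle inequality, any component that
        # differs by more than `radius` rules the string out.
        if any(abs(q - v) > radius for q, v in zip(qv, sv)):
            continue
        # Verification step: compute the true distance for survivors.
        if edit_distance(query, s) <= radius:
            results.append(s)
    return results


if __name__ == "__main__":
    database = ["kitten", "sitting", "mitten", "bitten", "banana"]
    pivots = ["kitten", "banana"]  # hypothetical pivot choice
    index = build_index(database, pivots)
    print(range_search("fitten", 1, index, pivots))
```

The filter never discards a true match, so the result equals that of a brute-force scan; the pivots only affect how many strings reach the more expensive verification step, which is why pivot selection matters for performance.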