Query autocompletion is an important feature that saves users many keystrokes by sparing them from typing the entire query. In this paper, we study query autocompletion that tolerates errors in the user's input under edit distance constraints. Previous approaches index the data strings in a trie and continuously maintain all prefixes of data strings whose edit distance from the query is within the threshold. The major inherent problem is that the number of such prefixes is huge for the first few characters of the query and is exponential in the alphabet size. This results in slow query response even if the entire query approximately matches only a few prefixes. We propose a novel neighborhood generation-based algorithm, IncNGTrie, which can achieve up to two orders of magnitude speedup over existing methods for the error-tolerant query autocompletion problem. The algorithm maintains only a small set of active nodes, thus saving both the space and the time needed to process a query. We also study efficient duplicate removal, which is a core problem in fetching query answers. In addition, we propose optimization techniques that reduce the index size and discuss several extensions to our method. Extensive experiments on real datasets demonstrate the efficiency of our method against existing methods.
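To make the problem statement concrete, the following is a minimal brute-force sketch of error-tolerant autocompletion under an edit distance threshold. It is not IncNGTrie and not the trie-based method of previous approaches; it only spells out which strings count as answers, namely those with some prefix within the threshold of the query. The names `prefix_edit_distance`, `complete`, and `tau` are chosen here for illustration and do not come from the paper.

```python
from typing import List


def prefix_edit_distance(query: str, data: str) -> int:
    """Smallest edit distance between `query` and any prefix of `data`."""
    # prev[j] holds the edit distance between query[:i] and data[:j].
    prev = list(range(len(data) + 1))
    for i, qc in enumerate(query, start=1):
        curr = [i]  # query[:i] against the empty prefix needs i deletions
        for j, dc in enumerate(data, start=1):
            cost = 0 if qc == dc else 1
            curr.append(min(prev[j] + 1,          # delete qc from the query
                            curr[j - 1] + 1,      # insert dc into the query
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    # The last row covers the whole query against every prefix of `data`.
    return min(prev)


def complete(query: str, strings: List[str], tau: int) -> List[str]:
    """Strings having at least one prefix within edit distance `tau` of `query`."""
    return [s for s in strings if prefix_edit_distance(query, s) <= tau]


if __name__ == "__main__":
    # Illustrative data only; "serch" is a misspelled prefix of "search engine".
    print(complete("serch", ["search engine", "serial", "trie"], tau=1))
```

This sketch scans every data string for every query, so its cost grows with the size of the dataset. Index-based methods such as those discussed in the abstract exist precisely to avoid that scan, and this code says nothing about how they do so.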