Tries offer text searches with costs that are independent of the size of the document being searched, and so are important for large documents requiring secondary storage. Approximate searches, in which the search pattern differs from the document by k substitutions, transpositions, insertions or deletions, have hitherto been carried out only at costs linear in the size of the document. We present a trie-based method whose cost is independent of document size. Our experiments show that this new method significantly outperforms the nearest competitor for k = 0 and k = 1, which are arguably the most important cases. The linear cost (in k) of the other methods begins to catch up, for our small files, only at k = 2. For larger files, complexity arguments indicate that tries will outperform the linear methods for larger values of k. Trie indexes combine suffixes and so are compact in storage. When the text itself does not need to be stored, as in a spelling checker, we even obtain negative overhead: 50% compression. We discuss a variety of applications and extensions, including best match (for spelling checkers), case insensitivity, and limited approximate regular expression matching.
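To make the idea of approximate search over a trie concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the spelling-checker setting (a trie of whole dictionary words rather than a trie over document suffixes) and uses a commonly known technique: one row of an edit-distance table is computed per trie node, and a branch is abandoned as soon as every entry in its row exceeds k. The names `TrieNode`, `build_trie`, and `search` are hypothetical, and the distance used is the restricted (adjacent-transposition) edit distance, which is an assumption about how the four edit operations are counted.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One trie node; `is_word` marks the end of a stored word."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


def build_trie(words: List[str]) -> TrieNode:
    """Insert every word into a fresh trie and return its root."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def search(root: TrieNode, pattern: str, k: int) -> List[Tuple[str, int]]:
    """Return sorted (word, distance) pairs with distance <= k.

    Distance counts substitutions, insertions, deletions and
    transpositions of adjacent characters (assumed cost 1 each).
    """
    results: List[Tuple[str, int]] = []
    n = len(pattern)

    def walk(node: TrieNode, prefix: str, row: List[int],
             prev_row: Optional[List[int]], prev_ch: str) -> None:
        # row[j] = distance between `prefix` and pattern[:j]
        if node.is_word and row[n] <= k:
            results.append((prefix, row[n]))
        for ch, child in node.children.items():
            new_row = [row[0] + 1]
            for j in range(1, n + 1):
                cost = 0 if pattern[j - 1] == ch else 1
                best = min(new_row[j - 1] + 1,   # skip a pattern character
                           row[j] + 1,           # skip a trie character
                           row[j - 1] + cost)    # match or substitute
                if (prev_row is not None and j > 1
                        and pattern[j - 1] == prev_ch
                        and pattern[j - 2] == ch):
                    best = min(best, prev_row[j - 2] + 1)  # transposition
                new_row.append(best)
            # Prune: no extension of this prefix can get back under k.
            if min(new_row) <= k:
                walk(child, prefix + ch, new_row, row, ch)

    walk(root, "", list(range(n + 1)), None, "")
    return sorted(results)


if __name__ == "__main__":
    trie = build_trie(["trie", "tree", "tries", "tire", "text", "tried"])
    for word, dist in search(trie, "trie", 1):
        print(word, dist)
```

In this sketch the work done depends on how much of the trie survives pruning rather than on the number of stored words directly, which is the intuition behind the abstract's claim; the small word list and the pattern `"trie"` are illustrative only.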