Signature files and inverted files are well-known index structures. In this paper we undertake a direct comparison of the two for answering partially specified queries over a large lexicon stored in main memory. When n-grams are used to index lexicon terms, a bit-sliced signature file can be compressed to a smaller size than an inverted file, provided that each n-gram sets only one bit in the term signature. With a signature width of less than half the number of unique n-grams in the lexicon, the signature file method is about as fast as the inverted file method and significantly smaller. Greater flexibility in memory usage and faster index generation make signature files appropriate for searching large lexicons or other collections in environments where memory is at a premium.
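The idea of an n-gram, bit-sliced signature file can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `BitSlicedSignatureFile`, the helpers `ngrams` and `slot`, the CRC32-based hash, the default width of 64, the use of bigrams, and the restriction to `*` as the only wildcard are all assumptions made for illustration, and the slices are left uncompressed. Each n-gram sets exactly one bit in a term's signature, one bit vector (slice) is kept per signature position, and a query ANDs the slices selected by its n-grams before verifying the surviving candidates against the pattern to discard false matches.

```python
import zlib
from fnmatch import fnmatchcase


def ngrams(text, n=2):
    """Return the set of overlapping n-grams in text (empty if too short)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def slot(gram, width):
    """Map an n-gram to a single signature bit (hypothetical hash choice)."""
    return zlib.crc32(gram.encode("utf-8")) % width


class BitSlicedSignatureFile:
    """Illustrative, uncompressed bit-sliced signature file over a lexicon."""

    def __init__(self, lexicon, width=64, n=2):
        self.terms = list(lexicon)
        self.width = width
        self.n = n
        # One slice per signature bit; bit i of a slice refers to term i.
        self.slices = [0] * width
        for term_id, term in enumerate(self.terms):
            for gram in ngrams(term, n):
                self.slices[slot(gram, width)] |= 1 << term_id

    def search(self, pattern):
        """Return terms matching a pattern whose only wildcard is '*'."""
        candidates = (1 << len(self.terms)) - 1  # start with every term
        for fragment in pattern.split("*"):
            for gram in ngrams(fragment, self.n):
                candidates &= self.slices[slot(gram, self.width)]
        # Hash collisions and n-gram order can admit false matches,
        # so each candidate is checked against the pattern itself.
        return [
            term
            for term_id, term in enumerate(self.terms)
            if (candidates >> term_id) & 1 and fnmatchcase(term, pattern)
        ]


index = BitSlicedSignatureFile(["labor", "labour", "label", "harbor", "signature"])
print(index.search("lab*r"))
```

In this sketch a smaller `width` reduces the number of slices at the cost of more false matches to verify, which is the memory-versus-speed trade-off the abstract refers to.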