Fast Single-Pass Construction of a Half-Inverted Index
SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
As shown in a series of recent works, the HYB index is an alternative to the inverted index (INV) that enables very fast prefix searches, which in turn are the basis for fast processing of many other types of advanced queries, including autocompletion, faceted search, error-tolerant search, database-style select and join, and semantic search. In this work we show that HYB can be constructed at least as fast as INV, and often up to twice as fast. This is because HYB, by its nature, requires only a half-inversion of the data and allows efficient in-place index construction instead of the traditional merge-based approach. We also pay particular attention to the cache efficiency of the in-memory posting accumulation, an issue that has not been addressed in previous work, and show that our simple multilevel posting accumulation scheme yields far fewer cache misses than related approaches. Finally, we show that HYB supports fast dynamic index updates more easily than INV.
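To make the idea of a half-inversion concrete, the following is a minimal Python sketch and not the construction algorithm of this work. It assumes the common description of HYB in which the sorted vocabulary is cut into blocks of consecutive word ids and each block keeps its (document id, word id) postings in document order. The names `build_half_inverted` and `prefix_search`, the fixed `block_size`, and the separate vocabulary pass are all illustrative assumptions; in particular, the sketch reads the data twice, so it does not reproduce the single-pass or cache-aware behaviour claimed above.

```python
from bisect import bisect_left
from collections import defaultdict


def build_half_inverted(docs, block_size=2):
    """Toy half-inversion: postings are grouped by word-id *block*, not by word.

    docs is a list of token lists. Assumption: the vocabulary is collected in
    a separate first pass, which is a simplification for illustration.
    """
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    blocks = defaultdict(list)
    # Documents are visited in increasing doc id, so every block list is
    # filled by appending and ends up sorted by doc id without any merging.
    for doc_id, doc in enumerate(docs):
        for wid in sorted({word_id[w] for w in doc}):
            blocks[wid // block_size].append((doc_id, wid))
    return vocab, dict(blocks)


def prefix_search(vocab, blocks, prefix, block_size=2):
    """Return the sorted doc ids that contain a word starting with prefix."""
    # A prefix corresponds to a contiguous range [lo, hi) of word ids.
    lo = bisect_left(vocab, prefix)
    hi = bisect_left(vocab, prefix + "\U0010ffff")
    if lo == hi:
        return []
    hits = set()
    # Only the blocks overlapping the word-id range need to be scanned.
    for b in range(lo // block_size, (hi - 1) // block_size + 1):
        for doc_id, wid in blocks.get(b, []):
            if lo <= wid < hi:
                hits.add(doc_id)
    return sorted(hits)


if __name__ == "__main__":
    docs = [
        ["inverted", "index"],
        ["hybrid", "index", "inverse"],
        ["signature", "file"],
    ]
    vocab, blocks = build_half_inverted(docs)
    print(prefix_search(vocab, blocks, "inv"))
```

Because a prefix maps to a contiguous word-id range, a query touches only a few blocks rather than one list per matching word, which is the property that makes this layout attractive for prefix search. A production index would additionally compress the blocks and choose block boundaries by posting volume rather than by a fixed word count.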