We show how a half-inverted index can be constructed twice as fast as an ordinary inverted index. As shown in a series of recent works, the half-inverted index enables very fast prefix search, which in turn is the basis for very fast processing of many other types of advanced queries. Our construction algorithm is truly single-pass: every posting (word occurrence) is touched (read and written) only once during the whole construction, because the algorithm avoids an expensive merge of the index. The algorithm has been carefully engineered, with special attention paid to cache efficiency and disk cost. We compared our algorithm against the state-of-the-art index construction from Zettair.
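To make the idea concrete, the following is a minimal toy sketch, not the engineered algorithm described above. It assumes that a half-inverted index groups lexicographically adjacent words into blocks and stores one posting list per block rather than one per word, and that the block boundaries are known before construction starts. The names `BOUNDARIES`, `block_of`, `build_half_inverted`, and `prefix_search` are hypothetical, and the sketch keeps everything in memory and stores word strings where a real index would store compressed word ids.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical block boundaries (an assumption of this sketch): block i holds
# every word w with BOUNDARIES[i] <= w < BOUNDARIES[i + 1].
BOUNDARIES = ["", "f", "n", "t"]


def block_of(word):
    """Return the index of the block whose word range contains `word`."""
    return bisect_right(BOUNDARIES, word) - 1


def build_half_inverted(docs):
    """Build the toy index in one pass over the documents.

    Each posting (doc_id, word) is appended exactly once to its block,
    so no merge step is needed afterwards.
    """
    blocks = defaultdict(list)
    # Document ids ascend, so every block stays sorted by document id.
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            blocks[block_of(word)].append((doc_id, word))
    return blocks


def prefix_search(blocks, prefix):
    """Return the sorted ids of documents containing a word that starts with `prefix`."""
    lo = block_of(prefix)
    # Toy upper bound for the largest word with this prefix.
    hi = block_of(prefix + "\uffff")
    hits = set()
    for b in range(lo, hi + 1):
        for doc_id, word in blocks.get(b, []):
            if word.startswith(prefix):
                hits.add(doc_id)
    return sorted(hits)


docs = [
    "fast index construction",
    "inverted index merge",
    "prefix search is fast",
]
blocks = build_half_inverted(docs)
print(prefix_search(blocks, "in"))  # [0, 1]
```

In this sketch a prefix query scans only the blocks whose word range overlaps the prefix, which is the property that makes prefix search cheap under the stated assumptions. How the real algorithm chooses block boundaries and manages cache and disk behaviour is not covered here.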