Many information retrieval systems use the inverted file as their indexing structure. The inverted file, however, requires costly reorganization when new documents are added to an existing collection. Most studies suggest dealing with this problem by reserving free space in the inverted file for incremental updates. In this paper, we propose a run-time statistics-based approach to allocating this spare space. The approach estimates the space requirements of an inverted file using only a small amount of the most recent statistical data on space usage and on the rate of document update requests. To achieve the best indexing speed and space efficiency, the amount of spare space to allocate is determined adaptively by balancing the trade-off between reducing reorganization and maintaining space utilization. Experimental results show that the proposed space-sparing approach substantially reduces reorganization when updating an inverted file, while unused free space is kept under control so that file access speed is not affected.
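The general idea of statistics-driven spare space can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract gives no formulas, so the smoothing rule, the `horizon` and `max_slack` parameters, and the names `PostingList`, `spare_slots`, and `apply_batch` are all assumptions made for illustration. The sketch tracks a smoothed per-term growth rate from recent update batches and, when a list overflows, reserves room for the expected growth, capped to limit unused space.

```python
from dataclasses import dataclass, field


@dataclass
class PostingList:
    """One term's postings plus the slots reserved for it (hypothetical layout)."""
    postings: list = field(default_factory=list)
    capacity: int = 0          # slots currently reserved for this term
    growth_rate: float = 0.0   # smoothed postings added per update batch
    relocations: int = 0       # how many times the list had to be moved


def spare_slots(growth_rate, horizon, used, max_slack):
    """Reserve expected growth over `horizon` batches, capped at a fraction of `used`."""
    expected = int(growth_rate * horizon + 0.5)  # round to the nearest slot
    cap = int(max_slack * used)                  # bound on unused space
    return max(0, min(expected, cap))


def apply_batch(plist, new_postings, alpha=0.5, horizon=4, max_slack=1.0):
    """Append one batch of postings, relocating the list only when it overflows."""
    # Assumed estimator: exponential smoothing, so recent batches weigh most.
    plist.growth_rate = alpha * len(new_postings) + (1 - alpha) * plist.growth_rate
    needed = len(plist.postings) + len(new_postings)
    if needed > plist.capacity:
        # Overflow: model a reorganization by reserving a larger extent.
        slack = spare_slots(plist.growth_rate, horizon, needed, max_slack)
        plist.capacity = needed + slack
        plist.relocations += 1
    plist.postings.extend(new_postings)
    return plist


if __name__ == "__main__":
    adaptive, tight = PostingList(), PostingList()
    for batch in range(10):
        docs = list(range(batch * 5, batch * 5 + 5))
        apply_batch(adaptive, docs)              # spare space enabled
        apply_batch(tight, docs, max_slack=0.0)  # no spare space at all
    print("adaptive relocations:", adaptive.relocations)
    print("tight relocations:", tight.relocations)
```

With `max_slack=0.0` no spare space is reserved, so every batch forces a relocation. Raising `max_slack` or `horizon` trades unused slots for fewer relocations, which is the balance the abstract describes.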