Optimization for dynamic inverted index maintenance
SIGIR '90 Proceedings of the 13th annual international ACM SIGIR conference on Research and development in information retrieval
Query evaluation techniques for large databases
ACM Computing Surveys (CSUR)
Incremental updates of inverted lists for text document retrieval
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
The art of computer programming, volume 3: (2nd ed.) sorting and searching
The art of computer programming, volume 3: (2nd ed.) sorting and searching
Efficient distributed algorithms to build inverted files
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Fast Incremental Indexing for Full-Text Information Retrieval
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Efficient single-pass index construction for text databases
Journal of the American Society for Information Science and Technology
In-place versus re-build versus re-merge: index maintenance strategies for text retrieval systems
ACSC '04 Proceedings of the 27th Australasian conference on Computer science - Volume 26
Connections: using context to enhance file search
Proceedings of the twentieth ACM symposium on Operating systems principles
Fast on-line index construction by geometric partitioning
Proceedings of the 14th ACM international conference on Information and knowledge management
Inverted files for text search engines
ACM Computing Surveys (CSUR)
Implementing sorting in database systems
ACM Computing Surveys (CSUR)
Efficient online index maintenance for contiguous inverted lists
Information Processing and Management: an International Journal
Hybrid index maintenance for growing text collections
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Just in time indexing for up to the second search
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Efficient on-line index maintenance for dynamic text collections by using dynamic balancing tree
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
Hybrid index maintenance for contiguous inverted lists
Information Retrieval
AWOL: an adaptive write optimizations layer
FAST'08 Proceedings of the 6th USENIX Conference on File and Storage Technologies
Efficient online index construction for text databases
ACM Transactions on Database Systems (TODS)
A hybrid approach to index maintenance in dynamic text retrieval systems
ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval
Search in social networks with access control
Proceedings of the 2nd International Workshop on Keyword Search on Structured Data
Workload-aware indexing for keyword search in social networks
Proceedings of the 20th ACM international conference on Information and knowledge management
Fast and incremental indexing in effective and efficient XML element retrieval systems
Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services
Dynamic memory allocation policies for postings in real-time Twitter search
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Hi-index | 0.00 |
In dynamic environments with frequent content updates, we require online full-text search that scales to large data collections and achieves low search latency. Several recent methods that support fast incremental indexing of documents typically keep on disk multiple partial index structures that they continuously update as new documents are added. However, spreading indexing information across multiple locations on disk tends to considerably decrease the search responsiveness of the system. In the present paper, we take a fresh look at the problem of online full-text search with consideration of the architectural features of modern systems. Selective Range Flush is a greedy method that we introduce to manage the index in the system by using fixed-size blocks to organize the data on disk and dynamically keep low the cost of data transfer between memory and disk. As we experimentally demonstrate with the Proteus prototype implementation that we developed, we retrieve indexing information at latency that matches the lowest achieved by existing methods. Additionally, we reduce the total building cost by 30% in comparison to methods with similar retrieval time.