The time cost of searching with an inverted index is proportional to both the number of postings processed and the cost of processing each posting. Dynamic pruning reduces the number of postings examined. Pre-calculating and then quantizing term-document weights reduces the cost of evaluating each posting. We examine the effect of quantization on precision, latency, and index size. We show empirically that there is an ideal size (in bits) for storing the quantized scores. Using more bits than this increases index size and search latency; using fewer reduces precision. We observe a relationship between the collection size and the ideal quantization size, and provide a way to determine the number of bits to use from the collection size.
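To make the quantization step concrete, the following is a minimal sketch of one common approach: uniform (linear) quantization of pre-calculated term-document weights into a fixed number of bits. The abstract does not state which quantization scheme the paper uses, so the linear mapping, the function name `quantize`, and the sample weights are all assumptions made for illustration.

```python
def quantize(scores, bits):
    """Map float scores onto integers in [0, 2**bits - 1].

    Assumption: uniform (linear) bucketing between the smallest and
    largest score. This is an illustrative scheme, not necessarily
    the one used in the paper.
    """
    lo, hi = min(scores), max(scores)
    levels = (1 << bits) - 1  # largest integer representable in `bits` bits
    if hi == lo:
        # All scores are identical, so they share a single bucket.
        return [0 for _ in scores]
    # Scale each score into [0, levels] and truncate to an integer.
    return [int((s - lo) / (hi - lo) * levels) for s in scores]


# Hypothetical pre-calculated term-document weights.
weights = [0.0, 1.0, 2.0, 4.0]
coarse = quantize(weights, 2)  # 2-bit scores
fine = quantize(weights, 8)    # 8-bit scores
```

With 2 bits, the weights 0.0 and 1.0 fall into the same bucket, so the ranking can no longer tell them apart; with 8 bits they stay distinct, but each posting stores a wider value. This is the trade-off between precision and index size described above.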