State-of-the-art search engine ranking methods combine several distinct sources of relevance evidence to produce a high-quality ranking of results for each query. This fusion of evidence is currently done at query-processing time, which directly affects the response time of search systems. Previous research shows that one way to improve search efficiency in textual databases is to precompute term impacts at indexing time. In this article, we propose a novel way to precompute term impacts: a generic framework that combines any set of sources of evidence by using a machine-learning technique. The method retains the advantage of producing high-quality results but avoids the cost of combining evidence at query-processing time. Our method, called Learn to Precompute Evidence Fusion (LePrEF), uses genetic programming to compute, at indexing time, a unified impact value for each term found in each document. Unlike previous approaches to precomputing term impacts, it provides a generic framework that can use any set of relevance evidence on any text collection. The precomputed impact values are indexed and used later to rank documents at query-processing time, which effectively reduces query processing to simple additions of those impacts. We show that this approach, while producing results comparable to state-of-the-art ranking methods, can also lead to a significant decrease in computational costs during query processing. © 2012 Wiley Periodicals, Inc.
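
To illustrate the claim that query processing reduces to simple additions of precomputed impacts, the following is a minimal sketch and not the LePrEF implementation. It assumes a toy in-memory index that maps each term to (document, impact) pairs. The names `IMPACT_INDEX` and `rank` are hypothetical, and the impact values are invented for illustration; in the described approach they would come from the fusion function learned by genetic programming at indexing time.

```python
from collections import defaultdict
import heapq

# Hypothetical toy index: term -> list of (doc_id, precomputed_impact).
# The impact values are made up for illustration; in the approach described
# above they would be computed at indexing time by the learned function.
IMPACT_INDEX = {
    "web": [(1, 3), (2, 5), (4, 1)],
    "search": [(1, 4), (3, 2), (4, 6)],
    "engine": [(2, 1), (4, 2)],
}


def rank(query_terms, index, k=3):
    """Score documents by summing the precomputed impacts of matching terms."""
    scores = defaultdict(int)
    for term in query_terms:
        for doc_id, impact in index.get(term, []):
            # The only per-posting work at query time is one addition.
            scores[doc_id] += impact
    # Highest score first; ties are broken by the smaller doc_id.
    return heapq.nsmallest(k, scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(rank(["web", "search"], IMPACT_INDEX))  # [(1, 7), (4, 7), (2, 5)]
```

In this sketch no evidence is combined at query time: each posting contributes a single stored integer, so the cost of a query grows with the number of postings read rather than with the number of evidence sources fused.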