Two methods are given for improving term weighting schemes by using relevance information from a set of queries. The first method estimates the parameter values of two independence models in information retrieval: the binary independence model and the non-binary independence model. The estimated parameters are then used to calculate optimal weights for terms in a different set of queries. The performance of this estimation is compared with that of the inverse document frequency method, the cosine measure, and the statistical similarity measure. The second method learns optimal weights for the non-binary independence model adaptively by means of a learning formula. Experiments with both methods are performed on three document collections (CISI, MEDLARS, and CRN4NUL), and the results are reported. Both methods improve on the existing weighting schemes. The second method performs slightly better than the first and is simpler to implement.
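To make the comparison between relevance-based weights and inverse document frequency concrete, here is a minimal sketch. It assumes the textbook form of the binary independence model term weight (a smoothed log odds ratio); the abstract does not state the estimator or the learning formula actually used, so the function names, the parameter names, and the smoothing constant `k` are illustrative assumptions rather than the authors' method.

```python
import math


def idf_weight(n_docs, doc_freq):
    """Inverse document frequency: log(N / n). Uses no relevance information."""
    return math.log(n_docs / doc_freq)


def bim_weight(n_docs, doc_freq, n_rel, rel_doc_freq, k=0.5):
    """Textbook binary independence model weight (assumed form, not from the source).

    n_docs       -- documents in the collection (N)
    doc_freq     -- documents containing the term (n)
    n_rel        -- known relevant documents (R)
    rel_doc_freq -- relevant documents containing the term (r)
    k            -- hypothetical smoothing constant that avoids zero probabilities
    """
    # p: estimated probability that the term occurs in a relevant document.
    p = (rel_doc_freq + k) / (n_rel + 2 * k)
    # q: estimated probability that the term occurs in a non-relevant document.
    q = (doc_freq - rel_doc_freq + k) / (n_docs - n_rel + 2 * k)
    # Log odds ratio: positive when the term favours relevant documents.
    return math.log(p * (1 - q) / (q * (1 - p)))


if __name__ == "__main__":
    # Made-up counts for illustration only; these are not experimental data.
    print(idf_weight(1000, 50))
    print(bim_weight(1000, 50, 10, 8))
```

Under these assumptions, a term that appears in relevant and non-relevant documents at the same rate gets a weight of zero, while a term concentrated in relevant documents gets a weight above its inverse document frequency value.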