One of the core components of information retrieval (IR) is the document term-weighting scheme. In this paper, we propose a novel learning-based term-weighting approach to improve the retrieval performance of the vector space model on homogeneous collections. We first introduce a simple learning system for weighting the index terms of documents. We then derive a formal computational approach from results in matrix computation and statistical inference. Our experiments on eight collections show that our approach outperforms classic tf-idf weighting by about 20%-45%.
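For reference, the classic tf-idf baseline against which the paper's results are measured can be sketched as below. This is a minimal illustration of standard tf-idf weighting in a vector space model, not the learned scheme proposed in the paper; the function name and the specific tf/idf variants (raw term frequency, unsmoothed logarithmic idf) are assumptions for the sketch.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute tf-idf weight vectors for a list of tokenized documents.

    Classic scheme: tf = raw term count, idf = log(N / df),
    where N is the collection size and df the document frequency.
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    idf = {t: math.log(n / df[t]) for t in df}
    # Each document becomes a sparse vector of term -> tf * idf weights.
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors

docs = [["information", "retrieval", "model"],
        ["retrieval", "performance"],
        ["language", "model"]]
vecs = tfidf_vectors(docs)
```

Queries and documents are then compared in the same vector space, typically by cosine similarity over these weight vectors; a learned scheme replaces the fixed tf-idf weights with weights fitted to the collection.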