This paper describes a method that uses Genetic Programming to automatically determine term weighting schemes for the vector space model. Based on a set of queries and their human-determined relevant documents, weighting schemes are evolved which achieve a high average precision. In Information Retrieval (IR) systems, useful information for term weighting schemes is available from the query, individual documents and the collection as a whole.

We evolve term weighting schemes in both local (within-document) and global (collection-wide) domains which interact with each other correctly to achieve a high average precision. These weighting schemes are tested on well-known test collections and are compared to the traditional tf-idf weighting scheme and to the BM25 weighting scheme using standard IR performance metrics.

Furthermore, we show that the global weighting schemes evolved on small collections also increase average precision on larger TREC data. These global weighting schemes are shown to adhere to Luhn's resolving power, as both high and low frequency terms are assigned low weights. However, the local weightings evolved on small collections do not perform as well on large collections. We conclude that, in order to evolve improved local (within-document) weighting schemes, it is necessary to evolve these on large collections.
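
To make the evaluation loop described above concrete, the following is a minimal sketch of how a candidate term weighting scheme could be scored by average precision against queries with known relevant documents. It is an illustration only, not the paper's implementation: the function names (`tf_idf_weight`, `rank`, `average_precision`, `fitness`), the toy documents and queries, and the choice of a simple sum of weights over matching query terms as the document score are all assumptions made for this example. The baseline shown is one common tf-idf variant (raw term frequency times log inverse document frequency).

```python
import math
from collections import Counter


def tf_idf_weight(tf, df, n_docs):
    """Baseline scheme (one common tf-idf variant): local tf times global log idf."""
    return tf * math.log(n_docs / df)


def rank(query, docs, weight):
    """Rank document ids by the summed weight of the query terms they contain."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for text in docs.values() for term in set(text.split()))
    scores = {}
    for doc_id, text in docs.items():
        tf = Counter(text.split())
        scores[doc_id] = sum(
            weight(tf[t], df[t], n_docs) for t in query.split() if tf[t] > 0
        )
    # Descending score; ties are broken by document id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant document appears."""
    hits, total = 0, 0.0
    for position, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def fitness(weight, queries, docs):
    """Average precision of a weighting scheme, averaged over all queries."""
    return sum(
        average_precision(rank(q, docs, weight), rel) for q, rel in queries
    ) / len(queries)


# Hypothetical toy collection and relevance judgements, for illustration only.
docs = {
    "d1": "genetic programming evolves programs",
    "d2": "term weighting for retrieval retrieval retrieval",
    "d3": "genetic algorithms for retrieval",
}
queries = [("genetic retrieval", {"d3"}), ("term weighting", {"d2"})]

print(fitness(tf_idf_weight, queries, docs))
```

In a Genetic Programming setting, each individual in the population would play the role of the `weight` callable here, and a score such as `fitness` would drive selection. In this toy collection, the repeated term in `d2` lets raw term frequency outrank the relevant document `d3` for the first query, which is the kind of local (within-document) weighting behaviour an evolved scheme might dampen.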