Evolving General Term-Weighting Schemes for Information Retrieval: Tests on Larger Collections
Artificial Intelligence Review
Current approaches to information retrieval rely on the creativity of individuals to develop new algorithms. This investigation examines the use of genetic algorithms (GA) and genetic programming (GP) to learn IR algorithms.

Document structure weighting is a technique whereby different parts of a document (title, abstract, etc.) contribute unevenly to the overall document weight during ranking. Near-optimal weights can be learned with a GA. Doing so yields a statistically significant 5% relative improvement in mean average precision (MAP) for vector space inner-product and Croft's probabilistic ranking, but no improvement for BM25. Two applications of this approach are suggested: offline learning and relevance feedback.

In a second set of experiments, a new ranking function was learned using GP. This function yields a statistically significant 11% relative improvement on unseen queries tested on the training documents. Portability tests on collections not used in training show that the new function outperforms the vector space and probabilistic models and slightly exceeds BM25. Learning weights for this new function is proposed.

Finally, the application of genetic learning to stemming and thesaurus construction is discussed: stemming rules such as those of the Porter algorithm are candidates for GP learning, whereas synonym sets are candidates for GA learning.
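The GA-based document structure weighting described above can be sketched as follows. This is a minimal illustration, not the paper's actual setup: the three-field split (title, abstract, body), the toy per-field match scores, and all GA parameters (population size, mutation rate, truncation selection) are assumptions made for the example. Fitness is MAP over a handful of synthetic queries, and elitism plus a uniform-weight seed guarantee the result is never worse than the unweighted baseline.

```python
import random

random.seed(0)

# Toy training data: per query, a list of (per-field match scores
# [title, abstract, body], relevance label). All numbers are illustrative.
TRAIN = [
    [([4, 1, 0], 1), ([0, 3, 6], 0), ([2, 0, 1], 1), ([0, 1, 4], 0)],
    [([3, 0, 1], 1), ([0, 4, 5], 0), ([1, 1, 0], 0)],
    [([5, 2, 0], 1), ([0, 0, 7], 0), ([2, 1, 1], 1)],
]

def avg_precision(ranked):
    """Average precision of a ranked list of (scores, relevance) pairs."""
    hits, total, rels = 0, 0.0, sum(rel for _, rel in ranked)
    for i, (_, rel) in enumerate(ranked, 1):
        if rel:
            hits += 1
            total += hits / i
    return total / rels if rels else 0.0

def fitness(weights):
    """MAP of the ranking induced by the candidate field weights."""
    aps = []
    for docs in TRAIN:
        ranked = sorted(
            docs,
            key=lambda d: -sum(w * s for w, s in zip(weights, d[0])),
        )
        aps.append(avg_precision(ranked))
    return sum(aps) / len(aps)

def evolve(pop_size=20, generations=30):
    # Seed with uniform weights plus random vectors; keeping the top two
    # individuals each generation (elitism) means the best fitness found
    # can never fall below the uniform-weight baseline.
    pop = [[1.0, 1.0, 1.0]] + [
        [random.random() for _ in range(3)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        nxt = pop[:2]  # elitism
        while len(nxt) < pop_size:
            a, b = random.sample(pop[:10], 2)              # truncation selection
            child = [random.choice(g) for g in zip(a, b)]  # uniform crossover
            if random.random() < 0.3:                      # Gaussian mutation
                i = random.randrange(3)
                child[i] = min(1.0, max(0.0, child[i] + random.gauss(0, 0.2)))
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve()
```

In the offline-learning application suggested above, `best` would be fixed after training and reused at query time; in the relevance-feedback application, the same loop would rerun per user with the judged documents as `TRAIN`.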