Adaptive term weighting through stochastic optimization

Authors:
Michael Granitzer
Affiliations:
Graz University of Technology, Graz, Austria
Venue:
CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing
Year:
2010


Abstract

Term weighting strongly influences the performance of text mining and information retrieval approaches. Usually, term weights are determined through statistical estimates based on static weighting schemes. Such static approaches lack the capability to generalize to different domains and different data sets. In this paper, we introduce an on-line learning method for adapting term weights in a supervised manner. Using stochastic optimization, we determine a linear transformation of the term space that approximates expected similarity values among documents. We evaluate our approach on 18 standard text data sets and show that using adaptive term weighting as a preprocessing step improves the performance of a k-NN classifier by between 1% and 12%. Further, we provide empirical evidence that our approach is efficient enough to cope with larger problems.
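
The abstract does not state the loss function, the form of the linear transformation, or the update rule, so the following is only an illustrative sketch of the general idea and not the paper's algorithm. It assumes a diagonal transformation (one non-negative weight per term), a squared-error loss, and hypothetical similarity targets of 1.0 for documents with the same label and 0.0 otherwise. The names weighted_similarity and sgd_term_weights, the toy data, and all parameter values are invented for illustration.

```python
import random


def weighted_similarity(weights, x, y):
    """Dot product of x and y after scaling each term dimension by its weight."""
    return sum(w * a * b for w, a, b in zip(weights, x, y))


def sgd_term_weights(docs, labels, epochs=200, lr=0.1, seed=0):
    """Learn one non-negative weight per term with stochastic gradient descent.

    Assumed target: similarity 1.0 for same-label pairs, 0.0 otherwise.
    Assumed loss: 0.5 * (weighted_similarity - target) ** 2 per document pair.
    """
    rng = random.Random(seed)
    n_terms = len(docs[0])
    weights = [1.0] * n_terms  # start from uniform term weights
    pairs = [(i, j) for i in range(len(docs)) for j in range(i + 1, len(docs))]
    for _ in range(epochs):
        rng.shuffle(pairs)  # visit document pairs in random order
        for i, j in pairs:
            x, y = docs[i], docs[j]
            target = 1.0 if labels[i] == labels[j] else 0.0
            error = weighted_similarity(weights, x, y) - target
            # Gradient of the loss with respect to weight k is error * x[k] * y[k].
            for k in range(n_terms):
                weights[k] = max(0.0, weights[k] - lr * error * x[k] * y[k])
    return weights


# Toy term vectors: terms 0 and 1 separate the classes, term 2 occurs everywhere.
docs = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
labels = ["a", "a", "b", "b"]

weights = sgd_term_weights(docs, labels)
print([round(w, 3) for w in weights])
```

On this toy data the weight of the term shared by every document shrinks toward zero, while the two class-specific terms keep weights near one. The reweighted vectors could then be passed to a nearest-neighbour classifier, which is the preprocessing role the abstract describes.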