Feature weighting for co-occurrence-based classification of words

Authors:
Viktor Pekar;Michael Krkoska;Steffen Staab
Affiliations:
U. of Wolverhampton, Wolverhampton, UK;Mentasys GmbH, Karlsruhe, Germany;U. of Karlsruhe, Karlsruhe, Germany
Venue:
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Year:
2004

Citing 9
Cited 5

The automatic identification of stop words

Journal of Information Science
C4.5: programs for machine learning

C4.5: programs for machine learning
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval

ECML '98 Proceedings of the 10th European Conference on Machine Learning
Feature Subset Selection in Text-Learning

ECML '98 Proceedings of the 10th European Conference on Machine Learning
Extending a Lexical Ontology by a Combination of Distributional Semantics Signatures

EKAW '02 Proceedings of the 13th International Conference on Knowledge Engineering and Knowledge Management. Ontologies and the Semantic Web
A Comparative Study on Feature Selection in Text Categorization

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Similarity-based methods for word sense disambiguation

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Automatic retrieval and clustering of similar words

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Boosting automatic lexical acquisition with morphological information

ULA '02 Proceedings of the ACL-02 workshop on Unsupervised lexical acquisition - Volume 9

Extending the single words-based document model: a comparison of bigrams and 2-itemsets

Proceedings of the 2006 ACM symposium on Document engineering
Clustering Narrow-Domain Short Texts by Using the Kullback-Leibler Distance

CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing
Bootstrapping distributional feature vector quality

Computational Linguistics
Cross-lingual Romanian to english question answering at CLEF 2006

CLEF'06 Proceedings of the 7th international conference on Cross-Language Evaluation Forum: evaluation of multilingual and multi-modal information retrieval
Weighted Association Rule Mining for Video Semantic Detection

International Journal of Multimedia Data Engineering & Management

Quantified Score

Hi-index	0.00

Visualization

Abstract

The paper comparatively studies methods of feature weighting in application to the task of cooccurrence-based classification of words according to their meaning. We explore parameter optimization of several weighting methods frequently used for similar problems such as text classification. We find that successful application of all the methods crucially depends on a number of parameters; only a carefully chosen weighting procedure allows to obtain consistent improvement on a classifier learned from non-weighted data.