Learning word senses with feature selection and order identification capabilities

Authors:
Zheng-Yu Niu;Dong-Hong Ji;Chew-Lim Tan
Affiliations:
Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;National University of Singapore, Singapore
Venue:
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Year:
2004

Citing 14
Cited 2

Floating search methods in feature selection

Pattern Recognition Letters
Unsupervised Feature Selection Using Feature Similarity

IEEE Transactions on Pattern Analysis and Machine Intelligence
Feature Selection as a Preprocessing Step for Hierarchical Clustering

ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Feature Subset Selection and Order Identification for Unsupervised Learning

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Discovering word senses from text

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Feature Weighting in k-Means Clustering

Machine Learning
Feature Selection for Clustering - A Filter Solution

ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Introduction to the special issue on word sense disambiguation: the state of the art

Computational Linguistics - Special issue on word sense disambiguation
Automatic word sense discrimination

Computational Linguistics - Special issue on word sense disambiguation
Using corpus statistics and WordNet relations for sense identification

Computational Linguistics - Special issue on word sense disambiguation
Word sense disambiguation in untagged text based on term weight learning

EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
Discovering corpus-specific word senses

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2
Unsupervised methods for developing taxonomies by combining syntactic and statistical information

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Resampling Method for Unsupervised Estimation of Cluster Validity

Neural Computation

Chinese information retrieval based on terms and relevant terms

ACM Transactions on Asian Language Information Processing (TALIP)
A semi-supervised feature clustering algorithm with application to word sense disambiguation

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper presents an unsupervised word sense learning algorithm, which induces senses of target word by grouping its occurrences into a "natural" number of clusters based on the similarity of their contexts. For removing noisy words in feature set, feature selection is conducted by optimizing a cluster validation criterion subject to some constraint in an unsupervised manner. Gaussian mixture model and Minimum Description Length criterion are used to estimate cluster structure and cluster number. Experimental results show that our algorithm can find important feature subset, estimate model order (cluster number) and achieve better performance than another algorithm which requires cluster number to be provided.