Using three way data for word sense discrimination

Authors:
Tim Van de Cruys
Affiliations:
University of Groningen
Venue:
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Year:
2008

Citing 7
Cited 5

Word association norms, mutual information, and lexicography

Computational Linguistics
Foundations of statistical natural language processing

Foundations of statistical natural language processing
Discovering word senses from text

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Automatic word sense discrimination

Computational Linguistics - Special issue on word sense disambiguation
Automatic retrieval and clustering of similar words

COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Verbs semantics and lexical selection

ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Using information content to evaluate semantic similarity in a taxonomy

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1

A quantitative evaluation of global word sense induction

CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
Latent semantic word sense induction and disambiguation

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Latent vector weighting for word meaning in context

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
MaxMax: a graph-based soft clustering algorithm applied to word sense induction

CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Towards mobile language evolution exploitation

Multimedia Tools and Applications

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper, an extension of a dimensionality reduction algorithm called non-negative matrix factorization is presented that combines both 'bag of words' data and syntactic data, in order to find semantic dimensions according to which both words and syntactic relations can be classified. The use of three way data allows one to determine which dimension(s) are responsible for a certain sense of a word, and adapt the corresponding feature vector accordingly, 'subtracting' one sense to discover another one. The intuition in this is that the syntactic features of the syntax-based approach can be disambiguated by the semantic dimensions found by the bag of words approach. The novel approach is embedded into clustering algorithms, to make it fully automatic. The approach is carried out for Dutch, and evaluated against EuroWordNet.