Word association norms, mutual information, and lexicography
Computational Linguistics
Foundations of statistical natural language processing
Foundations of statistical natural language processing
Discovering word senses from text
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Automatic word sense discrimination
Computational Linguistics - Special issue on word sense disambiguation
Automatic retrieval and clustering of similar words
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Verbs semantics and lexical selection
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Using information content to evaluate semantic similarity in a taxonomy
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1
A quantitative evaluation of global word sense induction
CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
Latent semantic word sense induction and disambiguation
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Latent vector weighting for word meaning in context
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
MaxMax: a graph-based soft clustering algorithm applied to word sense induction
CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Towards mobile language evolution exploitation
Multimedia Tools and Applications
Hi-index | 0.00 |
In this paper, an extension of a dimensionality reduction algorithm called non-negative matrix factorization is presented that combines both 'bag of words' data and syntactic data, in order to find semantic dimensions according to which both words and syntactic relations can be classified. The use of three way data allows one to determine which dimension(s) are responsible for a certain sense of a word, and adapt the corresponding feature vector accordingly, 'subtracting' one sense to discover another one. The intuition in this is that the syntactic features of the syntax-based approach can be disambiguated by the semantic dimensions found by the bag of words approach. The novel approach is embedded into clustering algorithms, to make it fully automatic. The approach is carried out for Dutch, and evaluated against EuroWordNet.