Unsupervised word representations are very useful in NLP tasks both as inputs to learning algorithms and as extra word features in NLP systems. However, most of these models are built with only local context and one representation per word. This is problematic because words are often polysemous and global context can also provide useful information for learning word meanings. We present a new neural network architecture which 1) learns word embeddings that better capture the semantics of words by incorporating both local and global document context, and 2) accounts for homonymy and polysemy by learning multiple embeddings per word. We introduce a new dataset with human judgments on pairs of words in sentential context, and evaluate our model on it, showing that our model outperforms competitive baselines and other neural language models.
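One way to realize "multiple embeddings per word" is to represent each occurrence of a word by the average embedding of its context, then cluster those context vectors so that each cluster centroid becomes one prototype of the word. The sketch below illustrates that idea only; it is not the paper's architecture, and the toy embeddings, sentences, and function names are hypothetical.

```python
# Minimal sketch (assumed setup, not the paper's pipeline): cluster the
# contexts a word appears in to derive multiple prototypes for it.
import numpy as np

rng = np.random.default_rng(0)

def context_vectors(occurrences, embed, window=2):
    """Average the embeddings of the words surrounding each occurrence
    of a target word; one vector per occurrence."""
    vecs = []
    for sent, i in occurrences:
        ctx = [w for j, w in enumerate(sent) if j != i and abs(j - i) <= window]
        vecs.append(np.mean([embed[w] for w in ctx], axis=0))
    return np.array(vecs)

def kmeans(X, k, iters=20):
    """Plain k-means; each centroid acts as one prototype of the word."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(axis=0)
    return centers, labels

# Toy embeddings (hypothetical): "river"/"water" vs. "money"/"loan"
# occupy opposite corners, so the two senses of "bank" separate.
embed = {w: np.array(v) for w, v in {
    "the": [0.5, 0.5], "river": [1.0, 0.0], "water": [0.9, 0.1],
    "money": [0.0, 1.0], "loan": [0.1, 0.9]}.items()}
occurrences = [(["the", "river", "bank", "water"], 2),
               (["the", "money", "bank", "loan"], 2)]

X = context_vectors(occurrences, embed)
centers, labels = kmeans(X, k=2)
# The two occurrences of "bank" fall into different clusters,
# i.e. the word ends up with two distinct prototypes.
```

In the toy run, the river-flavored and money-flavored contexts of "bank" land in separate clusters, which is the intuition behind keeping more than one embedding per polysemous word.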