Word representations: a simple and general method for semi-supervised learning

  • Authors:
  • Joseph Turian; Lev Ratinov; Yoshua Bengio

  • Affiliations:
  • Université de Montréal, Montréal, Québec, Canada; University of Illinois at Urbana-Champaign, Urbana, IL; Université de Montréal, Montréal, Québec, Canada

  • Venue:
  • ACL '10: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
  • Year:
  • 2010

Abstract

If we take an existing supervised NLP system, a simple and general way to improve accuracy is to use unsupervised word representations as extra word features. We evaluate Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking. We use near-state-of-the-art supervised baselines, and find that each of the three word representations improves the accuracy of these baselines. We find further improvements by combining different word representations. You can download our word features and our code, for off-the-shelf use in existing NLP systems, at http://metaoptimize.com/projects/wordreprs/
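The recipe the abstract describes is easy to sketch in code: alongside a tagger's usual discrete features, append one real-valued feature per embedding dimension for each token. The sketch below is a minimal Python illustration of that idea, not the authors' released code; the toy EMBEDDINGS table and the function name token_features are hypothetical stand-ins (real Brown-cluster, C&W, and HLBL features are available at the URL above).

import numpy as np

# Toy embedding table standing in for pretrained word representations
# (e.g. the C&W or HLBL embeddings released at the project URL).
EMBEDDINGS = {
    "montreal": np.array([0.21, -0.43, 0.05]),
    "is":       np.array([0.02,  0.11, -0.07]),
    "in":       np.array([0.01,  0.09, -0.02]),
}
DIM = 3

def token_features(tokens, i):
    """Baseline discrete features for token i, augmented with the token's
    embedding (zeros when out-of-vocabulary) as extra real-valued features."""
    word = tokens[i]
    feats = {
        "word=" + word.lower(): 1.0,
        "is_capitalized": float(word[0].isupper()),
        "prev=" + (tokens[i - 1].lower() if i > 0 else "<s>"): 1.0,
    }
    vec = EMBEDDINGS.get(word.lower(), np.zeros(DIM))
    for j, v in enumerate(vec):
        feats["emb_%d" % j] = float(v)  # one extra feature per dimension
    return feats

# The resulting feature dicts can feed any linear sequence model,
# e.g. a CRF or averaged perceptron, exactly as the discrete features would.
sent = ["Montreal", "is", "in", "Quebec"]
for i in range(len(sent)):
    print(token_features(sent, i))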