On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B
Foundations of statistical natural language processing
Foundations of statistical natural language processing
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Building a large-scale annotated Chinese corpus
COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
Unsupervised induction of stochastic context-free grammars using distributional clustering
ConLL '01 Proceedings of the 2001 workshop on Computational Natural Language Learning - Volume 7
Tagging text with a probabilistic model
ICASSP '91 Proceedings of the Acoustics, Speech, and Signal Processing, 1991. ICASSP-91., 1991 International Conference
Contrastive estimation: training log-linear models on unlabeled data
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Unsupervised learning of field segmentation models for information extraction
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Learning from labeled features using generalized expectation criteria
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Learning from measurements in exponential families
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Semi-supervised learning for natural language processing
HLT-Tutorials '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Tutorial Abstracts
Weakly supervised supertagging with grammar-informed initialization
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Evaluating unsupervised part-of-speech tagging for grammar induction
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Learning-based named entity recognition for morphologically-rich, resource-scarce languages
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Weakly supervised part-of-speech tagging for morphologically-rich, resource-scarce languages
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Modeling annotators: a generative approach to learning from annotator rationales
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
A comparison of Bayesian estimators for unsupervised Hidden Markov Model POS taggers
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Unsupervised multilingual learning for POS tagging
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Unsupervised constraint driven learning for transliteration discovery
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Interactive annotation learning with indirect feature voting
SRWS '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Student Research Workshop and Doctoral Consortium
Generalized isotonic conditional random fields
Machine Learning
Active learning by labeling features
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
Generalized expectation criteria for bootstrapping extractors using record-text alignment
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
A simple unsupervised learner for POS disambiguation rules given only a minimal lexicon
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
On the use of virtual evidence in conditional random fields
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3
Multilingual part-of-speech tagging: two unsupervised approaches
Journal of Artificial Intelligence Research
Generalized Expectation Criteria for Semi-Supervised Learning with Weakly Labeled Data
The Journal of Machine Learning Research
Wordica: Emergence of linguistic representations for words by independent component analysis
Natural Language Engineering
Coreference resolution in a modular, entity-centered model
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Painless unsupervised learning with features
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Starting from scratch in semantic role labeling
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Improved unsupervised POS induction through prototype discovery
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
SVD and clustering for unsupervised POS tagging
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Posterior Regularization for Structured Latent Variable Models
The Journal of Machine Learning Research
Unsupervised Part-of-Speech Tagging in the Large
Research on Language and Computation
Improved unsupervised POS induction using intrinsic clustering quality and a Zipfian constraint
CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Crouching Dirichlet, hidden Markov model: unsupervised POS tagging with context local tag generation
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Two decades of unsupervised POS induction: how far have we come?
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Latent-descriptor clustering for unsupervised POS induction
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Simple type-level unsupervised POS tagging
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Constructing reference sets from unstructured, ungrammatical text
Journal of Artificial Intelligence Research
Automatic part of speech tagging for Arabic: an experiment using Bigram hidden Markov model
RSKT'10 Proceedings of the 5th international conference on Rough set and knowledge technology
A comparison of unsupervised methods for part-of-speech tagging in Chinese
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Computational Linguistics
A hierarchical Pitman-Yor process HMM for unsupervised part of speech induction
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Unsupervised Russian POS tagging with appropriate context
TSD'11 Proceedings of the 14th international conference on Text, speech and dialogue
Controlling complexity in part-of-speech induction
Journal of Artificial Intelligence Research
Unsupervised multilingual learning
Unsupervised multilingual learning
Evaluating unsupervised learning for natural language processing tasks
EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Unsupervised bilingual POS tagging with Markov random fields
EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Unsupervised structure prediction with non-parallel multilingual guidance
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
A Bayesian mixture model for part-of-speech induction using multiple features
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Adaptive Bayesian HMM for Fully Unsupervised Chinese Part-of-Speech Induction
ACM Transactions on Asian Language Information Processing (TALIP)
Incorporating lexical priors into topic models
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Unsupervised learning on an approximate corpus
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
A hierarchical dirichlet process model for joint part-of-speech and morphology induction
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
The PASCAL Challenge on Grammar Induction
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Turning the pipeline into a loop: iterated unsupervised dependency parsing and PoS induction
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Bootstrapping a unified model of lexical and phonetic acquisition
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Bootstrapping via graph propagation
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Automatic event extraction with structured preference modeling
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
A cost sensitive part-of-speech tagging: differentiating serious errors from minor errors
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Learning syntactic categories using paradigmatic representations of word context
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Building a lightweight semantic model for unsupervised information extraction on short listings
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Learning to map into a universal POS tagset
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Wiki-ly supervised part-of-speech tagging
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Hi-index | 0.00 |
We investigate prototype-driven learning for primarily unsupervised sequence modeling. Prior knowledge is specified declaratively, by providing a few canonical examples of each target annotation label. This sparse prototype information is then propagated across a corpus using distributional similarity features in a log-linear generative model. On part-of-speech induction in English and Chinese, as well as an information extraction task, prototype features provide substantial error rate reductions over competitive baselines and outperform previous work. For example, we can achieve an English part-of-speech tagging accuracy of 80.5% using only three examples of each tag and no dictionary constraints. We also compare to semi-supervised learning and discuss the system's error trends.