A view of the EM algorithm that justifies incremental, sparse, and other variants
Learning in graphical models
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Ultraconservative online algorithms for multiclass problems
The Journal of Machine Learning Research
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
Building a large annotated corpus of English: the Penn Treebank
Computational Linguistics - Special issue on using large corpora: II
Tagging English text with a probabilistic model
Computational Linguistics
Ranking algorithms for named-entity extraction: boosting and the voted perceptron
ACL '02 Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Large Margin Methods for Structured and Interdependent Output Variables
The Journal of Machine Learning Research
Learning structured prediction models: a large margin approach
ICML '05 Proceedings of the 22nd international conference on Machine learning
An evaluation exercise for word alignment
HLT-NAACL-PARALLEL '03 Proceedings of the HLT-NAACL 2003 Workshop on Building and using parallel texts: data driven machine translation and beyond - Volume 3
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Log-linear models for wide-coverage CCG parsing
EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
On-line EM Algorithm for the Normalized Gaussian Network
Neural Computation
Online Passive-Aggressive Algorithms
The Journal of Machine Learning Research
MapReduce: simplified data processing on large clusters
OSDI '04 Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation - Volume 6
Fully distributed EM for very large datasets
Proceedings of the 25th international conference on Machine learning
Online large-margin training of syntactic and structural translation features
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Online EM for unsupervised models
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Fast, easy, and cheap: construction of statistical machine translation models with MapReduce
StatMT '08 Proceedings of the Third Workshop on Statistical Machine Translation
Optimal distributed online prediction using mini-batches
The Journal of Machine Learning Research
Hope and fear for discriminative training of statistical translation models
The Journal of Machine Learning Research
A Named Entity Recognition Method Based on Decomposition and Concatenation of Word Chunks
ACM Transactions on Asian Language Information Processing (TALIP)
Recent speed-ups for training large-scale models, such as those used in statistical NLP, exploit distributed computing (on either multicore or "cloud" architectures) and rapidly converging online learning algorithms. Here we aim to combine the two. We focus on distributed, "mini-batch" learners that make frequent updates asynchronously (Nedic et al., 2001; Langford et al., 2009). We generalize existing asynchronous algorithms and experiment extensively with structured prediction problems from NLP, including discriminative, unsupervised, and non-convex learning scenarios. Our results show that asynchronous learning can provide substantial speedups over distributed and single-processor mini-batch algorithms, with no signs of error arising from the approximate nature of the technique.
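To illustrate the idea of asynchronous mini-batch updates described above, here is a minimal sketch, assuming a simple perceptron objective on synthetic separable data; the names (`train_async`, `make_data`) and all hyperparameters are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch: several workers compute mini-batch perceptron updates
# and apply them to a shared weight vector without synchronization, so each
# worker may read slightly stale parameters (the "asynchronous" setting).
import threading
import random

def make_data(n=2000, dim=20, seed=0):
    """Generate linearly separable toy data labeled by a random hyperplane."""
    rng = random.Random(seed)
    w_true = [rng.uniform(-1, 1) for _ in range(dim)]
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        y = 1 if sum(wi * xi for wi, xi in zip(w_true, x)) > 0 else -1
        data.append((x, y))
    return data

def train_async(data, dim, workers=4, batch=32, epochs=5, lr=0.1):
    w = [0.0] * dim  # shared parameters, updated without locking

    def worker(shard):
        for _ in range(epochs):
            for start in range(0, len(shard), batch):
                # Accumulate the perceptron update over one mini-batch,
                # reading the (possibly stale) shared weights.
                grad = [0.0] * dim
                for x, y in shard[start:start + batch]:
                    if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                        for j in range(dim):
                            grad[j] += y * x[j]
                # Apply the mini-batch update asynchronously.
                for j in range(dim):
                    w[j] += lr * grad[j] / batch

    shards = [data[i::workers] for i in range(workers)]
    threads = [threading.Thread(target=worker, args=(s,)) for s in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w

def accuracy(w, data):
    correct = sum(1 for x, y in data
                  if y * sum(wi * xi for wi, xi in zip(w, x)) > 0)
    return correct / len(data)
```

On separable data the unsynchronized updates still drive training accuracy high, consistent with the abstract's observation that the approximation introduces no apparent error; a real implementation would of course use shared-memory or message-passing primitives suited to the cluster architecture.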