Large Margin Classification Using the Perceptron Algorithm
Machine Learning - The Eleventh Annual Conference on computational Learning Theory
Grafting: fast, incremental feature selection by gradient descent in function space
The Journal of Machine Learning Research
Links between perceptrons, MLPs and SVMs
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Discriminative training and maximum entropy models for statistical machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Minimum error rate training in statistical machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Improved statistical alignment models
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Ranking and Reranking with Perceptron
Machine Learning
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
A hierarchical phrase-based model for statistical machine translation
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
A discriminative global training algorithm for statistical MT
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
An end-to-end discriminative approach to machine translation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Hierarchical Phrase-Based Translation
Computational Linguistics
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
Proceedings of the 24th international conference on Machine learning
Beyond log-linear models: boosted minimum error rate training for N-best Re-ranking
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Online large-margin training of syntactic and structural translation features
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
11,001 new features for statistical machine translation
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Using a maximum entropy model to build segmentation lattices for MT
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Joint covariate selection and joint subspace selection for multiple classification problems
Statistics and Computing
Efficient algorithms for ranking with SVMs
Information Retrieval
Distributed training strategies for the structured perceptron
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
ACLDemos '10 Proceedings of the ACL 2010 System Demonstrations
N-best reranking by multitask learning
WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
KenLM: faster and smaller language model queries
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
The CMU-ARK German-English translation system
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Structural and topical dimensions in multi-task patent translation
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Structured ramp loss minimization for machine translation
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Hi-index | 0.00 |
With a few exceptions, discriminative training in statistical machine translation (SMT) has been content with tuning weights for large feature sets on small development data. Evidence from machine learning indicates that increasing the training sample size results in better prediction. The goal of this paper is to show that this common wisdom can also be brought to bear upon SMT. We deploy local features for SCFG-based SMT that can be read off from rules at runtime, and present a learning algorithm that applies l1/l2 regularization for joint feature selection over distributed stochastic learning processes. We present experiments on learning on 1.5 million training sentences, and show significant improvements over tuning discriminative models on small development sets.