Discriminative language modeling with conditional random fields and the perceptron algorithm

Authors:
Brian Roark; Murat Saraclar; Michael Collins; Mark Johnson
Affiliations:
AT&T Labs -- Research; AT&T Labs -- Research; MIT CSAIL; Brown University
Venue:
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Year:
2004

Abstract

This paper describes discriminative language modeling for a large-vocabulary speech recognition task. We contrast two parameter estimation methods: the perceptron algorithm and a method based on conditional random fields (CRFs). The models are encoded as deterministic weighted finite-state automata and are applied by intersecting the automata with word lattices output by a baseline recognizer. The perceptron algorithm has the benefit of automatically selecting a relatively small feature set in just a couple of passes over the training data. However, using the feature set selected by the perceptron algorithm, with the feature weights initialized to the perceptron values, CRF training provides an additional 0.5% reduction in word error rate, for a total absolute reduction of 1.8% from the baseline of 39.2%.
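
The feature-selection behavior of the perceptron mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes n-best candidate lists rather than word lattices and weighted automata, uses a plain (unaveraged) perceptron update, and fixes the weight on the baseline recognizer score. The names `ngram_features`, `train_perceptron`, `rerank`, and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def ngram_features(words, max_order=2):
    """Count unigram and bigram features in a padded word sequence."""
    padded = ["<s>"] + list(words) + ["</s>"]
    feats = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(padded) - n + 1):
            feats[tuple(padded[i:i + n])] += 1
    return feats


def score(weights, feats, base_score, base_weight):
    """Baseline recognizer score plus the discriminative n-gram score."""
    return base_weight * base_score + sum(
        weights.get(f, 0.0) * c for f, c in feats.items()
    )


def rerank(weights, candidates, base_weight=1.0):
    """Return the index of the highest-scoring candidate."""
    feats = [ngram_features(words) for words, _ in candidates]
    return max(
        range(len(candidates)),
        key=lambda i: score(weights, feats[i], candidates[i][1], base_weight),
    )


def train_perceptron(data, epochs=2, base_weight=1.0):
    """data: list of (candidates, oracle_index) pairs, where each
    candidate is (words, baseline_score). Assumed setup, not the paper's."""
    weights = {}
    for _ in range(epochs):
        for candidates, oracle in data:
            best = rerank(weights, candidates, base_weight)
            if best == oracle:
                continue  # no mistake, no update
            # Move weights toward the oracle and away from the current best.
            for f, c in ngram_features(candidates[oracle][0]).items():
                weights[f] = weights.get(f, 0.0) + c
            for f, c in ngram_features(candidates[best][0]).items():
                weights[f] = weights.get(f, 0.0) - c
    # Only features that distinguish oracle from mistaken output stay nonzero.
    return {f: w for f, w in weights.items() if w != 0.0}


# Toy example (fabricated for illustration only, not data from the paper).
data = [
    ([("the cat sat".split(), -1.0), ("the cat sad".split(), -0.5)], 0),
]
weights = train_perceptron(data)
choice = rerank(weights, data[0][0])
```

Because updates only touch features on which the oracle and the mistaken hypothesis differ, shared n-grams cancel and the resulting weight vector stays sparse. This is one way to read the abstract's remark that the perceptron selects a relatively small feature set, which could then serve as the starting point for CRF training.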