A joint language model with fine-grain syntactic tags

Authors:
Denis Filimonov; Mary Harper
Affiliations:
University of Maryland, College Park; University of Maryland, College Park and Johns Hopkins University
Venue:
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3
Year:
2009

Abstract

We present a scalable joint language model designed to use fine-grain syntactic tags. We discuss the challenges such a design faces and describe solutions that scale well to large tagsets and corpora. We advocate relatively simple tags that require no deep linguistic knowledge of the language, yet provide more structural information than POS tags and can be derived from automatically generated parse trees; this combination of properties allows easy adoption of the model for new languages. We propose two fine-grain tagsets and evaluate our model with these tags, as well as with POS tags and SuperARV tags, on a speech recognition task, and we discuss future directions.
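
To make the idea of tags "derived from automatically generated parse trees" concrete, here is a minimal sketch in Python. It assumes a Penn Treebank style bracketed parse as input and builds each word's tag by joining its POS tag with the label of the constituent directly above it. This tag scheme, the `^` separator, and the function names `parse_tree` and `parent_tags` are illustrative assumptions; the abstract does not specify the two tagsets the authors propose.

```python
def parse_tree(s):
    """Parse a bracketed tree such as "(S (NP (DT the) (NN dog)) (VP (VBD barked)))"
    into nested (label, children) tuples; leaves are plain word strings."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())      # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return node()


def parent_tags(tree, parent="ROOT"):
    """Yield (word, tag) pairs where tag = POS + "^" + label of the constituent
    directly above the POS node. Hypothetical scheme, for illustration only."""
    label, children = tree
    if len(children) == 1 and isinstance(children[0], str):
        # Preterminal node: `label` is the POS tag, the child is the word.
        yield children[0], f"{label}^{parent}"
    else:
        for child in children:
            yield from parent_tags(child, label)


tagged = list(parent_tags(parse_tree("(S (NP (DT the) (NN dog)) (VP (VBD barked)))")))
for word, tag in tagged:
    print(word, tag)
```

Each resulting tag carries one level of structure beyond the plain POS tag and needs only a parser's output, not hand-built linguistic resources. A joint model would then predict word and tag pairs rather than words alone.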