Beyond n-grams: can linguistic sophistication improve language modeling?

  • Authors:
  • Eric Brill; Radu Florian; John C. Henderson; Lidia Mangu

  • Affiliations:
  • Johns Hopkins University, Baltimore, Md. (all authors)

  • Venue:
  • COLING '98: Proceedings of the 17th International Conference on Computational Linguistics - Volume 1
  • Year:
  • 1998

Abstract

It seems obvious that a successful model of natural language would incorporate a great deal of both linguistic and world knowledge. Interestingly, state-of-the-art language models for speech recognition are based on a very crude linguistic model, namely conditioning the probability of a word on a small, fixed number of preceding words. Despite many attempts to incorporate more sophisticated information into the models, the n-gram model remains the state of the art, used in virtually all speech recognition systems. In this paper we address the question of whether there is hope of improving language modeling by incorporating more sophisticated linguistic and world knowledge, or whether n-grams already capture the majority of the information that can be employed.
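
For concreteness, the kind of model the abstract refers to estimates each word's probability from counts of its immediately preceding words. Below is a minimal sketch of a maximum-likelihood trigram model in Python; the toy corpus, sentence-boundary tokens, and helper names are illustrative assumptions, not the authors' implementation, and no smoothing is applied.

    from collections import defaultdict

    def train_trigram_counts(sentences):
        """Count trigrams and their two-word histories over tokenized sentences."""
        trigram_counts = defaultdict(int)
        history_counts = defaultdict(int)
        for tokens in sentences:
            # Pad with assumed sentence-boundary markers so the first real
            # word also has a two-word history.
            padded = ["<s>", "<s>"] + tokens + ["</s>"]
            for i in range(2, len(padded)):
                history = (padded[i - 2], padded[i - 1])
                trigram_counts[history + (padded[i],)] += 1
                history_counts[history] += 1
        return trigram_counts, history_counts

    def trigram_probability(word, history, trigram_counts, history_counts):
        """Maximum-likelihood estimate of P(word | history), where history is
        the pair of preceding words; unseen histories get probability 0.0."""
        if history_counts[history] == 0:
            return 0.0
        return trigram_counts[history + (word,)] / history_counts[history]

    # Toy usage: probability of "recognition" after "for speech".
    corpus = [["language", "models", "for", "speech", "recognition"],
              ["n-gram", "models", "for", "speech", "recognition"]]
    tri, hist = train_trigram_counts(corpus)
    print(trigram_probability("recognition", ("for", "speech"), tri, hist))  # 1.0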