Markov parsing: lattice rescoring with a statistical parser

Authors: Brian Roark
Affiliations: AT&T Shannon Laboratory, NJ
Venue: ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Year: 2002

Abstract

We present a generalization of an incremental statistical parsing algorithm that allows for the re-scoring of lattices of word hypotheses, for use by a speech recognizer. This approach contrasts with other lattice parsing algorithms, which either do not provide scores for strings in the lattice (i.e., they only produce parse trees) or use search techniques (e.g., A*) to find the best paths through the lattice without re-scoring every arc. We show that a very large efficiency gain can be obtained in processing 1000-best lists, without reducing word accuracy, when the lists are encoded in lattices instead of trees. Further, this allows for processing arbitrary lattices without n-best extraction. This in turn can lead to more interesting methods of combining the parser with other models, both acoustic and language models, through, for example, adaptation or confusion matrices.
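
To make the contrast between tree and lattice encodings of an n-best list concrete, here is a minimal sketch. It is not the parser described in the abstract and does no scoring. It only counts arcs when a small, made-up n-best list is stored as a prefix tree and when states with identical continuations are merged into a lattice. The function names, the END marker, and the example hypotheses are all assumptions introduced for illustration.

```python
END = "</s>"  # hypothetical end-of-hypothesis marker (an assumption of this sketch)


def build_trie(hyps):
    """Encode an n-best list as a prefix tree of nested dicts (word -> subtree)."""
    root = {}
    for hyp in hyps:
        node = root
        for word in hyp.split() + [END]:
            node = node.setdefault(word, {})
    return root


def count_trie_arcs(node):
    """Every (parent, word) edge in the prefix tree is one arc."""
    return sum(1 + count_trie_arcs(child) for child in node.values())


def minimize(node, registry):
    """Merge states that have identical outgoing continuations.

    Each distinct signature (sorted outgoing arcs to canonical state ids)
    becomes one lattice state; the canonical id of `node` is returned.
    """
    signature = tuple(
        sorted((word, minimize(child, registry)) for word, child in node.items())
    )
    if signature not in registry:
        registry[signature] = len(registry)
    return registry[signature]


def count_lattice_arcs(registry):
    """Each lattice state contributes one arc per entry in its signature."""
    return sum(len(signature) for signature in registry)


# Made-up 4-best list: the hypotheses differ in two word positions.
hyps = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a cat sat on the mat",
    "a cat sat on a mat",
]

trie = build_trie(hyps)
registry = {}
minimize(trie, registry)

print("prefix-tree arcs:", count_trie_arcs(trie))
print("lattice arcs:", count_lattice_arcs(registry))
```

The prefix tree shares only common prefixes, so every hypothesis that diverges early repeats its whole remaining suffix. The merged lattice also shares suffixes, which leaves fewer arcs for an arc-by-arc rescoring pass to visit. How much this saves on real 1000-best lists depends on the data and is not something this sketch measures.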