A maximum-entropy-inspired parser
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Probabilistic CFG with latent annotations
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Using conditional random fields for sentence boundary detection in speech
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Effective use of prosody in parsing conversational speech
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Self-training PCFG grammars with latent annotations across languages
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
Lessons learned in part-of-speech tagging of conversational speech
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
This paper investigates the use of prosodic information, in the form of ToBI break indexes, for parsing spontaneous speech. We revisit two previously studied approaches, one that hurt parsing performance and one that achieved minor improvements, and propose a new method that aims to better integrate prosodic breaks into parsing. Although these approaches can improve the performance of basic probabilistic context-free grammar (PCFG) parsers, none of them yields a fine-grained PCFG model with latent annotations (PCFG-LA) (Matsuzaki et al., 2005; Petrov and Klein, 2007) that performs significantly better than the baseline PCFG-LA model without break indexes, partly because of misalignments between automatic prosodic breaks and true phrase boundaries. We therefore propose two alternative ways to restrict the search space of the prosodically enriched parser models to the n-best parses from the baseline PCFG-LA parser, which avoids egregious parses caused by incorrect breaks. Our experiments show that, with this restriction, all of the prosodically enriched parser models achieve significant improvement over the baseline PCFG-LA parser.
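The idea of restricting a prosodically enriched model to the baseline parser's n-best list can be illustrated with a minimal sketch. The abstract does not specify the two restriction methods, so the code below is only one plausible reading: the enriched model rescores the baseline candidates and never proposes parses outside that list. The names `select_from_nbest`, `enriched_log_prob`, and `baseline_weight` are hypothetical and do not come from the paper.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical representation: a parse is a bracketed tree string.
Parse = str


def select_from_nbest(
    nbest: Sequence[Tuple[Parse, float]],
    enriched_log_prob: Callable[[Parse], float],
    baseline_weight: float = 0.0,
) -> Parse:
    """Return the candidate the enriched model prefers, chosen only from
    the baseline parser's n-best list.

    nbest: (parse, baseline log-probability) pairs from the baseline parser.
    enriched_log_prob: assumed scoring function of the prosody-aware model.
    baseline_weight: assumed interpolation weight on the baseline score;
        0.0 means the enriched model alone decides among the candidates.
    """
    if not nbest:
        raise ValueError("n-best list is empty")

    def combined_score(item: Tuple[Parse, float]) -> float:
        parse, baseline_lp = item
        return enriched_log_prob(parse) + baseline_weight * baseline_lp

    # Because the search space is the n-best list, a badly placed break can
    # change the ranking but cannot introduce a parse the baseline rejected.
    best_parse, _ = max(nbest, key=combined_score)
    return best_parse
```

Under this reading, a misplaced automatic break can at worst reorder candidates the baseline parser already considered plausible, which matches the abstract's motivation of avoiding egregious parses.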