In this paper, we describe a maximum entropy-based automatic prosody labeling framework that exploits both language and speech information. We apply the proposed framework to both prominence and phrase structure detection within the Tones and Break Indices (ToBI) annotation scheme. Our framework utilizes novel syntactic features in the form of supertags and a quantized acoustic-prosodic feature representation that is similar to linear parameterizations of the prosodic contour. The proposed model is trained discriminatively and is robust in the selection of appropriate features for the task of prosody detection. The proposed maximum entropy acoustic-syntactic model achieves pitch accent and boundary tone detection accuracies of 86.0% and 93.1% on the Boston University Radio News corpus, and 79.8% and 90.3% on the Boston Directions corpus. Phrase structure detection through prosodic break index labeling yields accuracies of 84% and 87% on the two corpora, respectively. These results are significantly better than previously reported results and demonstrate the strength of the maximum entropy model in jointly modeling simple lexical, syntactic, and acoustic features for automatic prosody labeling.
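As a rough illustration of the idea of combining lexical, syntactic, and quantized acoustic features in one maximum entropy classifier, the following minimal sketch may help. It is not the paper's implementation: the uniform binning of f0 values, the use of a part-of-speech tag as a stand-in for a supertag, the toy training data, and all names (`quantize`, `featurize`, `MaxEnt`) are assumptions made for the example.

```python
import math
from collections import defaultdict

def quantize(value, lo, hi, bins):
    """Map a continuous value to a bin index in [0, bins-1] (uniform bins, clamped)."""
    if value <= lo:
        return 0
    if value >= hi:
        return bins - 1
    return int((value - lo) / (hi - lo) * bins)

def featurize(word, tag, f0_contour, lo=50.0, hi=400.0, bins=4):
    """Build binary string features. `tag` is a POS tag standing in for a
    supertag (an assumption); the f0 range and bin count are also assumed."""
    symbols = "".join(str(quantize(f, lo, hi, bins)) for f in f0_contour)
    return ["word=" + word.lower(), "tag=" + tag, "f0=" + symbols]

class MaxEnt:
    """Conditional maximum entropy (multinomial logistic) model over binary
    features, trained here with plain stochastic gradient ascent."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.w = defaultdict(float)  # one weight per (feature, label) pair

    def probs(self, feats):
        scores = [sum(self.w[(f, y)] for f in feats) for y in self.labels]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        return [e / z for e in exps]

    def train(self, data, epochs=200, lr=0.5, l2=0.01):
        for _ in range(epochs):
            for feats, gold in data:
                p = self.probs(feats)
                for y, py in zip(self.labels, p):
                    # gradient of the log-likelihood: observed minus expected
                    g = (1.0 if y == gold else 0.0) - py
                    for f in feats:
                        self.w[(f, y)] += lr * (g - l2 * self.w[(f, y)])

    def predict(self, feats):
        p = self.probs(feats)
        return self.labels[max(range(len(p)), key=p.__getitem__)]

# Toy, made-up examples: (word, tag, f0 samples in Hz) -> pitch accent label.
data = [
    (featurize("Boston", "NNP", [120, 180, 240, 200]), "accent"),
    (featurize("the", "DT", [110, 112, 108, 105]), "none"),
    (featurize("news", "NN", [130, 200, 260, 190]), "accent"),
    (featurize("of", "IN", [115, 110, 109, 104]), "none"),
]

model = MaxEnt(["accent", "none"])
model.train(data)

print(model.predict(featurize("radio", "NN", [125, 190, 250, 195])))
print(model.predict(featurize("a", "DT", [108, 107, 106, 105])))
```

Quantizing the contour turns a continuous acoustic measurement into a discrete symbol string, which lets it sit alongside word and tag features as an ordinary binary feature in the same model. A real system would use far richer features, proper normalization of f0 per speaker, and a tuned optimizer.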