Semi-supervised learning for automatic prosodic event detection using co-training algorithm
ACL '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2
Most previous approaches to automatic prosodic event detection are based on supervised learning and rely on a corpus annotated with the prosodic labels of interest to train the classification models. However, creating such resources is expensive and time consuming. In this paper, we apply semi-supervised learning with the co-training algorithm to the automatic detection of a coarse-level representation of prosodic events such as pitch accents, intonational phrase boundaries, and break indices. Co-training assumes that the two views are compatible and uncorrelated, conditions that real data often fail to satisfy, so we propose a method for labeling and selecting examples during co-training. In experiments on the Boston University Radio News Corpus, starting from only a small amount of labeled data, our proposed labeling method effectively exploits unlabeled data to improve performance, ultimately approaching the results of the supervised method trained on more labeled data. We also present a thorough analysis of the factors affecting the learning curves, including the labeling error rate and the informativeness of the added examples, the performance of the individual classifiers and the difference between them, and the sizes of the initial and added data sets.
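To make the co-training setup concrete, the sketch below shows one plausible form of the loop the abstract describes: two classifiers trained on separate feature views repeatedly label examples from an unlabeled pool and move the most reliable ones into the training set. This is a minimal illustration assuming scikit-learn-style classifiers and two pre-extracted feature views; the agreement-plus-confidence selection heuristic stands in for the authors' labeling and selection method, which the paper specifies in detail.

```python
# Illustrative co-training loop (not the authors' exact procedure).
# Assumes two feature "views" X1, X2 of the same examples, as numpy arrays.
import numpy as np
from sklearn.linear_model import LogisticRegression

def co_train(X1_lab, X2_lab, y_lab, X1_unlab, X2_unlab,
             rounds=10, per_round=20, conf_threshold=0.9):
    """Grow the labeled set from the unlabeled pool using two views."""
    clf1 = LogisticRegression(max_iter=1000)
    clf2 = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        if len(X1_unlab) == 0:
            break
        clf1.fit(X1_lab, y_lab)
        clf2.fit(X2_lab, y_lab)
        # Each classifier scores the unlabeled pool on its own view.
        p1 = clf1.predict_proba(X1_unlab)
        p2 = clf2.predict_proba(X2_unlab)
        lab1 = clf1.classes_[p1.argmax(axis=1)]
        lab2 = clf2.classes_[p2.argmax(axis=1)]
        conf = np.maximum(p1.max(axis=1), p2.max(axis=1))
        # Heuristic selection: keep examples where the views agree and
        # at least one view is confident, then take the top per_round.
        cand = np.where((lab1 == lab2) & (conf >= conf_threshold))[0]
        pick = cand[np.argsort(-conf[cand])][:per_round]
        if len(pick) == 0:
            break
        # Move the picked examples into the labeled set with their
        # automatically assigned labels.
        X1_lab = np.vstack([X1_lab, X1_unlab[pick]])
        X2_lab = np.vstack([X2_lab, X2_unlab[pick]])
        y_lab = np.concatenate([y_lab, lab1[pick]])
        keep = np.setdiff1d(np.arange(len(X1_unlab)), pick)
        X1_unlab, X2_unlab = X1_unlab[keep], X2_unlab[keep]
    return clf1, clf2
```

Requiring the two views to agree before accepting an automatic label is one simple way to approximate the compatibility condition co-training relies on; it trades off the number of examples added per round against the labeling error rate, which is exactly the balance the abstract's analysis of the learning curves examines.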