We train a decision tree inducer (CART) and a memory-based classifier (MBL) to predict prosodic pitch accents and breaks in Dutch text from shallow, easy-to-compute features. Both algorithms are trained on each task individually and on the two tasks simultaneously. The parameters of both algorithms and the selection of features are optimized per task with iterative deepening, an efficient wrapper procedure that uses progressive sampling of training data. The results show a consistent, significant advantage of MBL over CART, and indicate that the two tasks can be combined at little cost in generalization score. In tests on cross-validated data and on held-out data, MBL attains F-scores of 84 and 87, respectively, on accent placement, and of 88 and 91, respectively, on breaks. Accent placement prediction outperforms an informed baseline rule; reliably predicting breaks other than those already indicated by intra-sentential punctuation, however, appears to be more challenging.
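The abstract names the optimization procedure but does not spell out its steps. As a rough illustration only, one plausible reading of a wrapper that combines iterative deepening with progressive sampling is sketched below: every candidate setting is scored on a small training sample, the weaker settings are discarded, and the survivors are re-scored on larger samples. The function `iterative_deepening`, its `keep_fraction` parameter, the sample-size schedule, and the `toy_score` stand-in are all hypothetical and are not taken from the paper.

```python
import math
from typing import Callable, Hashable, Sequence


def iterative_deepening(
    candidates: Sequence[Hashable],
    evaluate: Callable[[Hashable, int], float],
    sample_sizes: Sequence[int],
    keep_fraction: float = 0.5,
) -> Hashable:
    """Return the setting that survives successive pruning rounds.

    Assumption: `evaluate(setting, n)` returns a score (higher is better)
    for `setting` when trained on a sample of `n` training instances.
    `candidates` must be non-empty.
    """
    survivors = list(candidates)
    for n in sample_sizes:
        # Score every surviving setting on a training sample of size n.
        ranked = sorted(survivors, key=lambda c: evaluate(c, n), reverse=True)
        # Keep the best fraction of settings, but never fewer than one.
        keep = max(1, math.ceil(len(ranked) * keep_fraction))
        survivors = ranked[:keep]
        if len(survivors) == 1:
            break
    return survivors[0]


def toy_score(k: int, n: int) -> float:
    # Hypothetical stand-in for a cross-validated F-score:
    # it peaks at k == 5 and improves slightly as the sample grows.
    return (1.0 - abs(k - 5) / 10.0) * (1.0 - 1.0 / n)


# Five hypothetical parameter values, three growing sample sizes.
best_k = iterative_deepening([1, 3, 5, 7, 9], toy_score, [100, 200, 400])
print(best_k)
```

In a real experiment `evaluate` would train the classifier with the given parameter and feature settings on a sample of the requested size and return its cross-validated score; the pruning schedule is a design choice that trades search cost against the risk of discarding a setting that only pays off with more data.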