The problem of capturing discourse structure for complex NLP tasks has often been addressed by exploiting surface clues that can yield a partial structure of discourse. Discourse Markers (DMs) are among the most popular of these clues because they are highly informative of discourse structure and have a very low processing cost. However, they present two main problems: first, there is a general lack of consensus about their appropriate characterisation for NLP applications; second, their potential as an inexpensive source of discourse knowledge is weakened by the fact that the information associated with them is usually hand-encoded. In this paper we show how a combination of clustering techniques provides empirical evidence for a characterisation of DMs. This data-driven methodology yields generalisations that help reduce the cost of encoding the information associated with DMs while increasing the consistency of their characterisation.