We present a discourse-level annotation of newspaper texts in German and English, developed as part of an ongoing project that investigates information structure from a cross-linguistic perspective. Rather than annotating one specific notion of information structure, we propose a theory-neutral annotation of basic features at the levels of syntax, prosody and discourse, using treebank data as a starting point. Our discourse-level annotation scheme covers properties of discourse referents (e.g., semantic sort, delimitation, quantification, familiarity status) and anaphoric links (coreference and bridging). We illustrate the kinds of investigation these data support and discuss some of the integration issues that arise when combining several levels of stand-off annotation created with different tools.