We present an approach for investigating what kind of semantic information is regularly associated with the structural markup of scientific articles. This approach addresses the need for an explicit formal description of the semantics of text-oriented XML documents. The domain of our investigation is a corpus of scientific articles in psychology and linguistics, drawn from English and German journals available online.

For our analyses, we provide XML markup representing two semantic levels: the thematic level (i.e. the topics in the text world that the article is about) and the functional or rhetorical level. Our hypothesis is that these semantic levels correlate with the articles' document structure, which is also represented in XML. The articles have been annotated with the corresponding information. Each of the three informational levels is modelled in a separate XML document because, in our domain, the description levels may conflict, making it impossible to model them within a single XML document.

For comparing and mining the resulting multi-layered XML annotations of an article, a Prolog-based approach is used. It focusses on comparing XML markup that is distributed among different documents. Prolog predicates have been defined for inferring relations between levels of information that are modelled in separate XML documents. We demonstrate how the Prolog tool is applied in our corpus analyses.
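
To make the idea of comparing markup across separate documents concrete, here is a minimal sketch. It is written in Python rather than Prolog and is not the tool described above. It assumes that the annotation layers share the same underlying text, so that elements can be aligned by character offsets. The element names (`para`, `method`, `results`), the sample sentences, and the relation names are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical annotation layers over the same text.
# The <method> element crosses the paragraph boundary, so the two
# layers could not be merged into one well-formed XML document.
STRUCTURE = (
    "<article><para>We recruited ten subjects.</para>"
    "<para> Each read two texts. Reading times differed.</para></article>"
)
RHETORIC = (
    "<article><method>We recruited ten subjects. Each read two texts.</method>"
    "<results> Reading times differed.</results></article>"
)


def spans(xml_string):
    """Return a list of (tag, start, end) character spans, one per element."""
    result = []

    def walk(elem, pos):
        start = pos
        pos += len(elem.text or "")
        for child in elem:
            pos = walk(child, pos)
            pos += len(child.tail or "")
        result.append((elem.tag, start, pos))
        return pos

    walk(ET.fromstring(xml_string), 0)
    return result


def relation(a, b):
    """Classify how span a relates to span b (assumed relation names)."""
    (_, s1, e1), (_, s2, e2) = a, b
    if (s1, e1) == (s2, e2):
        return "identical"
    if s2 <= s1 and e1 <= e2:
        return "included_in"
    if s1 <= s2 and e2 <= e1:
        return "includes"
    if s1 < e2 and s2 < e1:
        return "overlaps"
    return "disjoint"


def compare_layers(xml_a, xml_b):
    """Relate every element of one layer to every element of the other."""
    return [
        (a[0], relation(a, b), b[0])
        for a in spans(xml_a)
        for b in spans(xml_b)
    ]


if __name__ == "__main__":
    for tag_a, rel, tag_b in compare_layers(STRUCTURE, RHETORIC):
        print(tag_a, rel, tag_b)
```

In this sketch the second paragraph overlaps the `method` element without either containing the other. This is the kind of conflict that motivates keeping each level in its own document.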