Robust discourse parsing via discourse markers, topicality and position

Authors:
Frank Schilder
Affiliations:
Department for Informatics, University of Hamburg, Vogt-Kölln-Str. 30, 22527 Hamburg, Germany
Venue:
Natural Language Engineering
Year:
2002

Abstract

This paper describes a simple discourse parsing and analysis algorithm that combines a formal discourse grammar based on underspecification with Information Retrieval (IR) techniques. First, linguistic knowledge based on discourse markers is used to constrain a totally underspecified discourse representation. Then, the remaining underspecification is resolved further by computing a topicality score for every discourse unit using the vector space model. Finally, sentences in a prominent position (e.g. the first sentence of a paragraph) are given an adjusted topicality score. The proposed algorithm was evaluated on a text summarisation task. Measured against a ‘gold standard’ of the most salient sentences of a given text, as identified in a psycholinguistic experiment, the algorithm performs better than commonly used machine learning and statistical approaches to summarisation.
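
The topicality and position steps can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how vectors are weighted, what each discourse unit is compared against, or how the score is adjusted for position. The sketch assumes raw term-frequency vectors, treats each sentence as a discourse unit, compares each unit to a topic string such as the title, and applies a multiplicative boost to paragraph-initial sentences. The names `tokenize`, `cosine`, `topicality_scores` and `position_boost` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def topicality_scores(paragraphs, topic, position_boost=1.5):
    """Score every sentence against a topic string.

    paragraphs: list of paragraphs, each a list of sentence strings.
    topic: reference text (assumed here to be the title).
    position_boost: assumed multiplier for paragraph-initial sentences.
    """
    topic_vec = Counter(tokenize(topic))
    scores = []
    for paragraph in paragraphs:
        for index, sentence in enumerate(paragraph):
            score = cosine(Counter(tokenize(sentence)), topic_vec)
            if index == 0:
                # Prominent position: adjust the topicality score.
                score *= position_boost
            scores.append((sentence, score))
    return scores


paragraphs = [
    ["Discourse parsing is robust.", "The weather is nice."],
    ["Parsing discourse markers helps.", "Cats sleep."],
]
scores = topicality_scores(paragraphs, "discourse parsing")
for sentence, score in scores:
    print(f"{score:.3f}  {sentence}")
```

In a summarisation setting, the highest-scoring units would be the candidates for extraction. In the approach the abstract describes, the scores serve to resolve the underspecification left after the discourse-marker constraints have been applied, a step this sketch does not model.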