Reuse and challenges in evaluating language generation systems: position paper

  • Authors: Kalina Bontcheva
  • Affiliation: University of Sheffield, Sheffield, UK
  • Venue: EvalInitiatives '03: Proceedings of the EACL 2003 Workshop on Evaluation Initiatives in Natural Language Processing: Are Evaluation Methods, Metrics and Resources Reusable?
  • Year: 2003

Abstract

Although there is an increasing shift towards evaluating Natural Language Generation (NLG) systems, many NLG-specific open issues still hinder effective comparative and quantitative evaluation in this field. The paper begins by describing a task-based, i.e., black-box, evaluation of a hypertext NLG system. It then examines the problem of glass-box, i.e., module-specific, evaluation in language generation, with a focus on evaluating machine learning methods for text planning.