System building cost vs. output quality in data-to-text generation

Authors:
Anja Belz; Eric Kow
Affiliations:
University of Brighton, Brighton, UK; University of Brighton, Brighton, UK
Venue:
ENLG '09 Proceedings of the 12th European Workshop on Natural Language Generation
Year:
2009

Abstract

Data-to-text generation systems tend to be knowledge-based and manually built, which limits their reusability and makes them time- and cost-intensive to create and maintain. Methods for automating (part of) the system-building process exist, but do such methods risk a loss in output quality? In this paper, we investigate the cost/quality trade-off in generation system building. We compare four new data-to-text systems, created by predominantly automatic techniques, against six existing systems for the same domain, created by predominantly manual techniques. We evaluate the ten systems using intrinsic automatic metrics and human quality ratings. We find that increasing the degree to which system building is automated does not necessarily reduce output quality. We also find that standard automatic evaluation metrics underestimate the quality of handcrafted systems and overestimate the quality of automatically created systems.