A systematic comparison of various statistical alignment models
Computational Linguistics
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
Forest-based statistical sentence generation
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Building applied natural language generation systems
Natural Language Engineering
Generation that exploits corpus-based statistical knowledge
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Choosing words in computer-generated weather forecasts
Artificial Intelligence - Special volume on connecting language to the world
Learning for semantic parsing with statistical machine translation
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Automatic evaluation of machine translation quality using n-gram co-occurrence statistics
HLT '02 Proceedings of the second international conference on Human Language Technology Research
Natural Language Engineering
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
A simple domain-independent probabilistic approach to generation
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Comparing rating scales and preference judgements in language evaluation
INLG '10 Proceedings of the 6th International Natural Language Generation Conference
Harvesting re-usable high-level rules for expository dialogue generation
INLG '10 Proceedings of the 6th International Natural Language Generation Conference
Extracting parallel fragments from comparable corpora for data-to-text generation
INLG '10 Proceedings of the 6th International Natural Language Generation Conference
Assessing the trade-off between system building cost and output quality in data-to-text generation
Empirical methods in natural language generation
Introducing shared tasks to NLG: the TUNA shared task evaluation challenges
Empirical methods in natural language generation
Discrete vs. continuous rating scales for language evaluation in NLP
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Unsupervised alignment of comparable data and text resources
BUCC '11 Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web
Hi-index | 0.00 |
Data-to-text generation systems tend to be knowledge-based and manually built, which limits their reusability and makes them time and cost-intensive to create and maintain. Methods for automating (part of) the system building process exist, but do such methods risk a loss in output quality? In this paper, we investigate the cost/quality trade-off in generation system building. We compare four new data-to-text systems which were created by predominantly automatic techniques against six existing systems for the same domain which were created by predominantly manual techniques. We evaluate the ten systems using intrinsic automatic metrics and human quality ratings. We find that increasing the degree to which system building is automated does not necessarily result in a reduction in output quality. We find furthermore that standard automatic evaluation metrics underestimate the quality of handcrafted systems and over-estimate the quality of automatically created systems.