Studies assessing rating scales are very common in psychology and related fields, but rare in NLP. In this paper we assess discrete and continuous scales used for measuring quality judgements of computer-generated language. We conducted six separate experiments designed to investigate the validity, reliability, stability, interchangeability and sensitivity of discrete vs. continuous scales. We show that continuous scales are viable for use in language evaluation and offer distinct advantages over discrete scales.
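One of the reliability questions above (do two raters agree more on one scale type than the other?) can be probed by correlating the two raters' scores per item. The sketch below is a hypothetical illustration with invented ratings, not the paper's data or method; `pearson` is a plain implementation of the Pearson correlation coefficient.

```python
# Hypothetical sketch of an inter-rater reliability check for discrete
# (e.g. 1-5 Likert) vs. continuous (e.g. 0-100 slider) quality ratings.
# All ratings below are invented illustration data.

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Two raters judging the same six generated sentences.
discrete_a   = [4, 2, 5, 3, 4, 1]          # rater A, 1-5 scale
discrete_b   = [4, 3, 5, 3, 3, 2]          # rater B, 1-5 scale
continuous_a = [78.0, 31.5, 96.0, 55.0, 70.5, 12.0]  # rater A, 0-100 slider
continuous_b = [81.0, 40.0, 92.5, 52.0, 62.0, 20.5]  # rater B, 0-100 slider

print("discrete agreement:  ", round(pearson(discrete_a, discrete_b), 3))
print("continuous agreement:", round(pearson(continuous_a, continuous_b), 3))
```

Higher correlation on one scale type would suggest better inter-rater reliability for that type; a real study would use many raters and items, and complement correlation with agreement coefficients suited to each scale.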