Dependency Analysis and CBR to Bridge the Generation Gap in Template-Based NLG

Authors:
Virginia Francisco;Raquel Hervás;Pablo Gervás
Affiliations:
Departamento de Ingeniería del Software e Inteligencia Artificial, Universidad Complutense de Madrid, Spain;Departamento de Ingeniería del Software e Inteligencia Artificial, Universidad Complutense de Madrid, Spain;Departamento de Ingeniería del Software e Inteligencia Artificial, Universidad Complutense de Madrid, Spain
Venue:
CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing
Year:
2007

Abstract

This paper describes how dependency analysis can be used to automatically extract a set of cases, together with an accompanying vocabulary, from a corpus. These cases enable a template-based generator to achieve reasonable coverage of conceptual messages beyond the explicit scope of its predefined templates. We detail the partially automated process used to obtain the case base and the components of the template-based generator, which applies case-based reasoning techniques. The generator uses the WordNet taxonomy of concepts to compute similarity between the concepts involved in the texts, and a case retrieval net serves as its memory model. The set of data to be converted into text acts as a query to the system. Solving a query may involve several retrieval steps, which yield a set of cases that together constitute a good solution for rendering the query data as text messages, followed by a knowledge-intensive adaptation step, which consults a knowledge base to identify appropriate substitutions and completions for the concepts in the retrieved cases, using the query as a source. We describe this case-based solution for selecting an appropriate set of templates to render a given set of data as text, present numeric results of system performance in the domain of press articles, and discuss its advantages and shortcomings.
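The retrieve-then-adapt cycle outlined in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `PARENT` taxonomy stands in for WordNet, the path-based `similarity` measure is one common choice among many, the `Case` structure and the slot names `subject`, `verb`, and `object` are hypothetical, and a flat best-match search replaces the case retrieval net that the system actually uses as its memory model.

```python
from dataclasses import dataclass

# Hypothetical toy taxonomy (child -> parent) standing in for WordNet hypernym links.
PARENT = {
    "entity": None,
    "person": "entity", "politician": "person", "athlete": "person",
    "minister": "politician", "president": "politician",
    "act": "entity", "communicate": "act", "compete": "act",
    "announce": "communicate", "declare": "communicate", "win": "compete",
    "document": "entity", "law": "document",
    "reform": "law", "budget": "law",
    "event": "entity", "match": "event",
}


def ancestors(concept):
    """Return the chain from a concept up to the root, concept first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def similarity(a, b):
    """Path-based similarity: 1 / (1 + edges between a and b via their lowest common ancestor)."""
    up_a, up_b = ancestors(a), ancestors(b)
    for dist_a, node in enumerate(up_a):
        if node in up_b:
            return 1.0 / (1 + dist_a + up_b.index(node))
    return 0.0


@dataclass
class Case:
    slots: dict    # role -> concept found in the corpus sentence (hypothetical layout)
    template: str  # surface text with {role} placeholders


def retrieve(query, cases):
    """Return the case whose slot concepts are, on average, most similar to the query."""
    def score(case):
        shared = query.keys() & case.slots.keys()
        total = sum(similarity(query[r], case.slots[r]) for r in shared)
        return total / max(len(query), len(case.slots))
    return max(cases, key=score)


def adapt(case, query):
    """Substitute the query concepts into the retrieved template."""
    filled = {**case.slots, **query}
    return case.template.format(**filled)


cases = [
    Case({"subject": "president", "verb": "declare", "object": "budget"},
         "The {subject} will {verb} the {object}."),
    Case({"subject": "athlete", "verb": "win", "object": "match"},
         "The {subject} managed to {verb} the {object}."),
]
query = {"subject": "minister", "verb": "announce", "object": "reform"}

best = retrieve(query, cases)
print(adapt(best, query))
```

In this sketch the query reuses the template of the first case because its concepts sit closer in the taxonomy; a fuller system would also handle queries that need several cases and roles that the retrieved template does not cover.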