Instance-based sentence boundary determination by optimization for natural language generation

Authors:
Shimei Pan;James C. Shaw
Affiliations:
IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY
Venue:
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Year:
2005

Abstract

This paper describes a novel instance-based method for sentence boundary determination in natural language generation. The method optimizes a set of criteria based on examples in a corpus. Compared with existing sentence boundary determination approaches, our work offers three significant contributions. First, it provides a general, domain-independent framework that addresses sentence boundary determination by balancing a comprehensive set of constraints related to sentence complexity and quality. Second, it can simulate the characteristics and style of naturally occurring sentences in an application domain, because its solutions are optimized for similarity to examples in a corpus. Third, it adapts easily to the capabilities of a natural language generation system by balancing the strengths and weaknesses of that system's subcomponents (e.g., its aggregation and referring expression generation capabilities). Our final evaluation shows that the proposed method produces significantly better sentence generation results than a widely adopted approach.
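
The abstract does not spell out the objective function, so the following is only a toy sketch of the general idea of choosing sentence boundaries by optimizing similarity to corpus examples. Everything in it is an assumption made for illustration: the representation of a sentence as a set of proposition labels, the symmetric-difference distance, the fixed `boundary_cost` penalty, the exhaustive search, and the example data. It is not the authors' algorithm.

```python
from itertools import combinations

def segmentations(props):
    """Yield every split of an ordered proposition list into contiguous sentences."""
    n = len(props)
    for k in range(n):  # k = number of internal sentence boundaries
        for cuts in combinations(range(1, n), k):
            bounds = [0, *cuts, n]
            yield [props[a:b] for a, b in zip(bounds, bounds[1:])]

def sentence_cost(sentence, corpus):
    """Distance from a candidate sentence to its nearest corpus instance.

    Assumed distance: number of propositions present in one set but not the other.
    """
    s = set(sentence)
    return min(len(s ^ set(instance)) for instance in corpus)

def best_segmentation(props, corpus, boundary_cost=0.5):
    """Return the segmentation with the lowest total cost (exhaustive search)."""
    def total(seg):
        similarity = sum(sentence_cost(s, corpus) for s in seg)
        return similarity + boundary_cost * (len(seg) - 1)
    return min(segmentations(props), key=total)

# Hypothetical input: propositions to convey, in order, and corpus sentences
# described by the propositions each one realizes.
props = ["type", "bedrooms", "price", "school", "tax"]
corpus = [("type", "bedrooms", "price"), ("school", "tax"), ("price",)]

print(best_segmentation(props, corpus))
# [['type', 'bedrooms', 'price'], ['school', 'tax']]
```

In this sketch, the boundary penalty discourages splitting into many short sentences, while the distance term discourages sentences unlike anything in the corpus; the chosen segmentation balances the two. Exhaustive enumeration is only practical for a handful of propositions.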