Choosing the content of textual summaries of large time-series data sets

  • Authors:
  • Jin Yu; Ehud Reiter; Jim Hunter; Chris Mellish

  • Affiliations:
  • Department of Computing Science, University of Aberdeen, Aberdeen AB24 3UE, UK; e-mail: jyu@csd.abdn.ac.uk, ereiter@csd.abdn.ac.uk, jhunter@csd.abdn.ac.uk, cmellish@csd.abdn.ac.uk

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2007

Abstract

Natural Language Generation (NLG) can be used to generate textual summaries of numeric data sets. In this paper, we develop an architecture for generating short summaries (a few sentences) of large time-series data sets (100 KB or more). The architecture integrates pattern recognition, pattern abstraction, selection of the most significant patterns, microplanning (especially aggregation), and realisation. We also describe and evaluate SumTime-Turbine, a prototype system that uses this architecture to generate textual summaries of sensor data from gas turbines.
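
As a rough illustration of how the five stages named in the abstract might fit together, the following Python sketch chains them into one pipeline. It is not the SumTime-Turbine implementation: every function name, the `Pattern` class, the fixed-baseline threshold detector, and the "large"/"small" labels are assumptions made purely for illustration, and a real system would likely use far richer pattern recognition and sentence planning.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    kind: str         # "spike" or "dip" (hypothetical pattern types)
    start: int        # index of the first sample in the pattern
    end: int          # index of the last sample (inclusive)
    magnitude: float  # largest absolute deviation from the baseline
    size: str = ""    # qualitative label filled in by abstract()


def recognise(series, baseline, threshold):
    """Pattern recognition: group consecutive out-of-band samples."""
    patterns, current = [], None
    for i, value in enumerate(series):
        dev = value - baseline
        kind = "spike" if dev > threshold else "dip" if dev < -threshold else None
        # Close the open pattern when the sample returns in band or flips sign.
        if current is not None and kind != current.kind:
            patterns.append(current)
            current = None
        if kind is not None:
            if current is None:
                current = Pattern(kind, i, i, abs(dev))
            else:
                current.end = i
                current.magnitude = max(current.magnitude, abs(dev))
    if current is not None:
        patterns.append(current)
    return patterns


def abstract(patterns, threshold):
    """Pattern abstraction: replace raw magnitudes with qualitative labels."""
    for p in patterns:
        p.size = "large" if p.magnitude >= 2 * threshold else "small"
    return patterns


def select(patterns, k):
    """Content selection: keep the k largest patterns, in time order."""
    top = sorted(patterns, key=lambda p: p.magnitude, reverse=True)[:k]
    return sorted(top, key=lambda p: p.start)


def aggregate(patterns):
    """Microplanning (aggregation): merge patterns sharing size and kind."""
    counts = {}
    for p in patterns:
        key = (p.size, p.kind)
        counts[key] = counts.get(key, 0) + 1
    return list(counts.items())  # dicts preserve insertion order


def realise(messages):
    """Realisation: turn aggregated messages into one sentence."""
    if not messages:
        return "No significant patterns were found."
    parts = []
    for (size, kind), n in messages:
        noun = kind if n == 1 else kind + "s"
        parts.append(f"{n} {size} {noun}")
    return "The data show " + ", ".join(parts) + "."


def summarise(series, baseline, threshold, k=3):
    patterns = abstract(recognise(series, baseline, threshold), threshold)
    return realise(aggregate(select(patterns, k)))
```

For example, `summarise([10, 15, 10, 7, 10, 7, 10], baseline=10, threshold=2)` detects one spike and two dips, and the aggregation step merges the two dips into a single phrase, giving "The data show 1 large spike, 2 small dips."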