Automatic analysis of descriptive texts

  • Authors: James R. Cowie
  • Affiliations: University of Strathclyde, Royal College, Glasgow, Scotland
  • Venue: ANLC '83 Proceedings of the first conference on Applied natural language processing
  • Year: 1983

Abstract

This paper describes a system that attempts to interpret descriptive texts without the use of complex grammars. The purpose of the system is to transform the descriptions to a standard form which may be used as the basis of a database system knowledgeable in the subject matter of the text.

The texts currently used are wild plant descriptions taken directly from a popular book on the subject. Properties such as size, shape and colour are abstracted from the descriptions and related to parts of the plant in which we are interested. The resulting output is a standardised hierarchical structure holding only significant features of the description.

The system, implemented in the PROLOG programming language, uses keywords to identify the way segments of the text relate to the object described. Information on words is held in a keyword list of nouns relating to parts of the object described. A dictionary contains the attributes of ordinary words used by the system to analyse the text. The text is divided into segments using information provided by conjunctions and punctuation.

About half the texts processed are correctly analysed at present. Proposals are made for future work to improve this figure. There seems to be no inherent reason why the technique cannot be generalised so that any text of semi-standard descriptions can be automatically converted to a canonical form.
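The keyword-driven approach outlined in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the PROLOG of the original system, and it is not the author's implementation: the word lists (PART_KEYWORDS, DICTIONARY), the segment splitter (SPLIT), the function name analyse and the sample sentence are all hypothetical, chosen only to show how a keyword list, an attribute dictionary and segmentation on conjunctions and punctuation might combine to produce a hierarchical structure.

```python
import re

# Hypothetical keyword list: nouns naming parts of the object described,
# mapped to a canonical part name. Not taken from the paper.
PART_KEYWORDS = {
    "leaf": "leaf", "leaves": "leaf",
    "flower": "flower", "flowers": "flower",
    "stem": "stem", "stems": "stem",
}

# Hypothetical dictionary: ordinary words mapped to the attribute they express.
DICTIONARY = {
    "yellow": "colour", "white": "colour", "green": "colour",
    "oval": "shape", "round": "shape", "toothed": "shape",
    "small": "size", "large": "size", "tall": "size",
}

# Segment boundaries: punctuation and a couple of linking words (an assumption).
SPLIT = re.compile(r"[,;.]|\band\b|\bwith\b")


def analyse(text):
    """Return {part: {attribute: [values]}} for one description."""
    result = {}
    current = "plant"  # attributes seen before any part noun attach to the whole plant
    for segment in SPLIT.split(text.lower()):
        words = re.findall(r"[a-z]+", segment)
        # A part keyword in the segment switches the part being described.
        for word in words:
            if word in PART_KEYWORDS:
                current = PART_KEYWORDS[word]
                break
        # Every dictionary word in the segment becomes an attribute of that part.
        for word in words:
            attribute = DICTIONARY.get(word)
            if attribute:
                result.setdefault(current, {}).setdefault(attribute, []).append(word)
    return result


description = "Tall plant. Leaves oval and toothed; flowers small, yellow."
print(analyse(description))
```

In this sketch a segment without a part noun inherits the most recently named part, so a phrase such as "oval, toothed leaves", where the attribute precedes its noun across a comma, would be attached to the wrong part. Simplifications of this kind are one plausible reason a keyword-only method analyses some texts incorrectly.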