Automatic analysis of descriptive texts

Authors:
James R. Cowie
Affiliations:
University of Strathclyde, Royal College, Glasgow, Scotland
Venue:
ANLC '83 Proceedings of the first conference on Applied natural language processing
Year:
1983

Abstract

This paper describes a system that attempts to interpret descriptive texts without the use of complex grammars. The purpose of the system is to transform the descriptions to a standard form which may be used as the basis of a database system knowledgeable in the subject matter of the text.

The texts currently used are wild plant descriptions taken directly from a popular book on the subject. Properties such as size, shape and colour are abstracted from the descriptions and related to parts of the plant in which we are interested. The resulting output is a standardised hierarchical structure holding only significant features of the description.

The system, implemented in the PROLOG programming language, uses keywords to identify the way segments of the text relate to the object described. Information on words is held in a keyword list of nouns relating to parts of the object described. A dictionary contains the attributes of ordinary words used by the system to analyse the text. The text is divided into segments using information provided by conjunctions and punctuation.

About half the texts processed are correctly analysed at present. Proposals are made for future work to improve this figure. There seems to be no inherent reason why the technique cannot be generalised so that any text of semi-standard descriptions can be automatically converted to a canonical form.
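
The keyword-driven analysis outlined in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the PROLOG of the original system, and it is not the author's implementation. The keyword list, the dictionary entries, the choice of conjunctions, the function names, and the sample sentence are all assumptions made for illustration. The rule that a segment without a part keyword inherits the most recently named part is likewise a guess at one plausible behaviour, not something the abstract states.

```python
import re

# Hypothetical keyword list: nouns naming parts of the plant,
# mapped to a canonical part name.
PART_KEYWORDS = {
    "stem": "stem", "stems": "stem",
    "leaf": "leaf", "leaves": "leaf",
    "flower": "flower", "flowers": "flower",
}

# Hypothetical dictionary: ordinary words mapped to the property they express.
DICTIONARY = {
    "yellow": "colour", "white": "colour", "green": "colour",
    "oval": "shape", "round": "shape", "toothed": "shape",
    "small": "size", "large": "size", "tall": "size",
}

# Conjunctions treated as segment boundaries (an assumed, illustrative set).
CONJUNCTIONS = ("and", "but", "with")


def split_segments(text):
    """Divide the text into segments at punctuation and conjunctions."""
    pattern = r"[.,;:]|\b(?:" + "|".join(CONJUNCTIONS) + r")\b"
    pieces = re.split(pattern, text.lower())
    return [piece.strip() for piece in pieces if piece.strip()]


def analyse(text):
    """Build a part -> property -> values hierarchy from a description."""
    result = {}
    current_part = None
    for segment in split_segments(text):
        words = re.findall(r"[a-z]+", segment)
        # The first part keyword in a segment decides which part it describes.
        for word in words:
            if word in PART_KEYWORDS:
                current_part = PART_KEYWORDS[word]
                break
        # Assumption: a segment with no keyword inherits the previous part;
        # segments seen before any part is named are skipped.
        if current_part is None:
            continue
        slot = result.setdefault(current_part, {})
        for word in words:
            prop = DICTIONARY.get(word)
            if prop is not None:
                slot.setdefault(prop, []).append(word)
    return result


if __name__ == "__main__":
    # Invented sample description, not taken from the book used in the paper.
    sample = "Stem tall. Leaves oval and toothed; flowers small, yellow."
    print(analyse(sample))
```

Words found in neither the keyword list nor the dictionary are dropped, which mirrors the abstract's point that the output holds only significant features of the description. A table lookup this simple cannot handle negation, measurements, or comparisons between parts.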