From Word Form Surfaces to Communication

  • Authors: Roland Hausser

  • Affiliations: Abteilung Computerlinguistik, Universität Erlangen-Nürnberg, Bismarckstr. 6, 91054 Erlangen, Germany, rrh@linguistik.uni-erlangen.de

  • Venue: Proceedings of the 2010 conference on Information Modelling and Knowledge Bases XXI
  • Year: 2010

Abstract

The starting point of this paper is the external surface of a word form, for example the agent-external acoustic perturbations constituting a language sign in speech, or the dots on paper in the case of written language. The external surfaces are modality-dependent tokens which the hearer recognizes by means of (i) pattern matching and (ii) a mapping into modality-independent types, and which the speaker produces by an inverse mapping from modality-independent types into tokens synthesized in a modality of choice. The types are provided by a lexicon stored in the agent's memory. They include not only the necessary properties of the surface shape (necessary as opposed to accidental, kata sumbebêkos, as used in the philosophical tradition of Aristotle), but also the associated morphosyntactic properties and the meaning. The question addressed by this paper is how to design the lexical analysis of word form types as a data structure (abstract data type) suitable for the purpose of Database Semantics (DBS), i.e., for a computational model of natural language communication. Database Semantics describes the procedural aspects of the SLIM theory of language [1, p. 1]. As an acronym, SLIM stands for the principles of Surface compositional, time Linear, Internal Matching. As a word, SLIM stands for low (linear) mathematical complexity. After discussing the conditions of automatic word form recognition and production in a talking robot, we turn to the question of what format the analyzed word forms should have. The requirements are an easy coding of lexical details, a simple detection and representation of semantic relations, suitability for storage and retrieval in a database, support of a computationally straightforward matching procedure for relating the levels of language and context, and compatibility with a suitable algorithm.
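
The token-to-type mapping described in the abstract can be illustrated with a minimal sketch. It is not the format proposed in the paper: the record `WordFormType`, its three fields, the toy `LEXICON`, and the functions `recognize` and `produce` are all hypothetical names chosen for illustration. Pattern matching is reduced to normalizing a written token, and speech synthesis is replaced by a placeholder string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WordFormType:
    """Modality-independent type as a flat attribute-value record (assumed layout)."""
    surface: str   # necessary properties of the surface shape
    category: str  # morphosyntactic properties (illustrative labels)
    meaning: str   # placeholder for the associated meaning


# Hypothetical toy lexicon standing in for the agent's memory, keyed by surface.
LEXICON = {
    "dog": WordFormType("dog", "noun-sg", "DOG"),
    "dogs": WordFormType("dogs", "noun-pl", "DOG"),
    "barks": WordFormType("barks", "verb-3sg", "BARK"),
}


def recognize(token: str) -> Optional[WordFormType]:
    """Hearer: map a modality-dependent token to its type, or None if unknown.

    Accidental properties of the token (letter case, surrounding whitespace)
    are stripped before the lexicon lookup.
    """
    return LEXICON.get(token.strip().lower())


def produce(word: WordFormType, modality: str = "writing") -> str:
    """Speaker: inverse mapping from a type to a token in a chosen modality."""
    if modality == "writing":
        return word.surface
    if modality == "speech":
        # Stand-in for a synthesizer call; no audio is produced here.
        return f"<spoken:{word.surface}>"
    raise ValueError(f"unsupported modality: {modality}")
```

In this sketch, two tokens that differ only in accidental properties, such as `"Dogs"` and `" dogs "`, are mapped to the same type, and a flat record keeps the lexical details easy to store in and retrieve from a database.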