A unified language processing methodology

  • Authors: Teodor Rus
  • Affiliations: Univ. of Iowa, Iowa City, IA
  • Venue: Theoretical Computer Science
  • Year: 2002

Abstract

This paper discusses a mathematical concept of language that models both artificial and natural languages and thus provides a framework for a unified language processing methodology. Under this concept, a language is regarded as a communication tool that allows its users to develop knowledge while interacting with their universe of discourse and to communicate with each other by exchanging that knowledge. Criteria for consistent usage of a language are established using a Galois connection between language syntax and language semantics. Ambiguity, paraphrase, attitude, and other problems concerning the relationship between syntax and semantics are addressed. A general schema for language specification is introduced, and the algorithms that perform language generation and language analysis are discussed as universal tools defined by the specification schema. Language transformations performed by various kinds of translators are examined, and correctness criteria for these translators are defined using the language Galois connection. The paper is structured as follows: Section 1 introduces the framework and justifies the necessity of a unified methodology for language processing. Section 2 presents the mathematical concept of a language. Section 3 illustrates this concept with three kinds of language structures: natural language, logical language, and programming language. Section 4 discusses the algebraic mechanism of language specification that unifies the methodology for developing language processing tools. Section 5 formalizes the criterion for consistent language usage, defines the architecture of a unified language processing system, and shows how the consistency criteria for language usage can be employed as correctness criteria for the algorithms performing various language transformations.
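
To make the idea of a Galois connection between syntax and semantics more concrete, the following Python sketch may help. It is not the construction used in the paper, which the abstract does not spell out. It relies only on the standard fact that any binary relation between two sets induces an antitone Galois connection between their power sets. The relation `DENOTES`, the toy vocabulary, and all function names are hypothetical and chosen purely for illustration.

```python
from itertools import chain, combinations

# Hypothetical toy relation (an assumption, not from the paper):
# a pair (expression, meaning) says the expression can denote the meaning.
DENOTES = {
    ("bank", "river_edge"),
    ("bank", "financial_institution"),
    ("shore", "river_edge"),
}
EXPRESSIONS = frozenset(e for e, _ in DENOTES)
MEANINGS = frozenset(m for _, m in DENOTES)


def common_meanings(exprs):
    """Syntax -> semantics: meanings denoted by every expression in exprs."""
    return frozenset(
        m for m in MEANINGS if all((e, m) in DENOTES for e in exprs)
    )


def common_expressions(meanings):
    """Semantics -> syntax: expressions that denote every meaning given."""
    return frozenset(
        e for e in EXPRESSIONS if all((e, m) in DENOTES for m in meanings)
    )


def subsets(items):
    """All subsets of a finite set, as frozensets."""
    items = sorted(items)
    return (
        frozenset(c)
        for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    )


def is_galois_connection():
    """Check the defining property of an antitone Galois connection:
    M is a subset of common_meanings(E)
    exactly when E is a subset of common_expressions(M)."""
    return all(
        (m <= common_meanings(e)) == (e <= common_expressions(m))
        for e in subsets(EXPRESSIONS)
        for m in subsets(MEANINGS)
    )
```

In this toy setting, `common_meanings(frozenset({"bank"}))` contains two meanings, which is one way to picture ambiguity, while `common_expressions(frozenset({"river_edge"}))` contains two expressions, which is one way to picture paraphrase. How the paper itself treats these notions may differ.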