An LVQ connectionist solution to the non-determinacy problem in Arabic morphological analysis: a learning hybrid algorithm

  • Authors:
  • M. A. El-affendi

  • Affiliations:
  • Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh 11543, PO Box 51178, Saudi Arabia; email: affendi@ccis.ksu.edu.sa

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2002

Abstract

Most of the morphological properties of derivational Arabic words are encapsulated in their corresponding morphological patterns. A morphological pattern is a template that shows how the word should be decomposed into its constituent morphemes (prefix + stem + suffix) and, at the same time, marks the positions of the radicals that make up the root of the word. The number of morphological patterns in Arabic is finite and well below 1000. Because of these properties, most current analysis algorithms concentrate on discovering the morphological pattern of the input word as a major step in recognizing the type and category of the word. Unfortunately, this process is non-deterministic: the underlying search may associate more than one morphological pattern with a given word, all of them satisfying the major lexical constraints. One solution to this problem is to use a collection of connectionist pattern associators that uniquely associate each word with its corresponding morphological pattern. This paper describes an LVQ-based learning pattern association system that uniquely maps a given Arabic word to its corresponding morphological pattern and thereby deduces its morphological properties. The system consists of a collection of heteroassociative models trained using the LVQ algorithm, plus a collection of autoassociative models trained using backpropagation. Experimental results show that the system is fairly accurate and easy to train. The LVQ algorithm was chosen because it is easy to train and its training time is very small compared to that of backpropagation.
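
To make the LVQ training mentioned in the abstract more concrete, the following is a minimal sketch of the basic LVQ1 update rule on toy data. It is not the system described in the paper: the two-dimensional vectors, the labels `pattern_A` and `pattern_B`, the learning-rate schedule, and the function names are all assumptions made for illustration, and the abstract does not say how words are encoded. The idea shown is only the core one: each prototype carries a class label, the prototype nearest to a training sample is pulled toward the sample when the labels agree and pushed away when they differ, and classification returns the label of the nearest prototype, so each input maps to exactly one pattern.

```python
import random

def nearest(prototypes, x):
    """Return the index of the prototype closest to x (squared Euclidean distance)."""
    best, best_d = 0, float("inf")
    for i, (w, _) in enumerate(prototypes):
        d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if d < best_d:
            best, best_d = i, d
    return best

def train_lvq1(samples, prototypes, lr=0.3, epochs=20, seed=0):
    """Train labelled prototypes with the LVQ1 rule and return the trained copies."""
    rng = random.Random(seed)
    protos = [(list(w), label) for w, label in prototypes]  # copy, keep inputs intact
    for epoch in range(epochs):
        alpha = lr * (1 - epoch / epochs)  # linearly decaying learning rate (assumed schedule)
        order = samples[:]
        rng.shuffle(order)
        for x, y in order:
            w, label = protos[nearest(protos, x)]
            # Attract the winner if its label matches the sample, repel it otherwise.
            sign = 1.0 if label == y else -1.0
            for j in range(len(w)):
                w[j] += sign * alpha * (x[j] - w[j])
    return protos

def classify(protos, x):
    """Return the label of the single nearest prototype."""
    return protos[nearest(protos, x)][1]

# Hypothetical 2-D encodings of words, each tagged with a hypothetical pattern label.
samples = [
    ([0.0, 0.0], "pattern_A"), ([0.1, 0.2], "pattern_A"), ([0.2, 0.1], "pattern_A"),
    ([1.0, 1.0], "pattern_B"), ([0.9, 0.8], "pattern_B"), ([0.8, 1.0], "pattern_B"),
]
initial = [([0.4, 0.4], "pattern_A"), ([0.6, 0.6], "pattern_B")]

trained = train_lvq1(samples, initial)
print(classify(trained, [0.05, 0.1]))
print(classify(trained, [0.95, 0.9]))
```

Each update touches only the winning prototype, with no gradient passed back through hidden layers, which is consistent with the abstract's remark that LVQ training time is small compared to backpropagation.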