Self organizing maps constrained by data structures

  • Authors: Cesar A. Astudillo
  • Affiliations: Carleton University (Canada)
  • Venue: Self organizing maps constrained by data structures
  • Year: 2011

Abstract

Within the fields of Pattern Recognition (PR) and Machine Intelligence (MI), when one requires useful information from a set of stimuli, the task usually demands the deduction of their structure and stochastic distribution. This endeavor becomes especially challenging when the stimuli belong to a high-dimensional domain and their cardinality is large. Over the last few decades, researchers have tried to solve this problem and have faced numerous difficulties, particularly when the learning process is performed without human intervention. The state of the art records remarkable efforts in the field of Artificial Neural Networks (ANNs) that follow this paradigm. Among the set of ANNs, the Self-Organizing Map (SOM), pioneered by Kohonen, is unique due to its interesting theoretical capabilities, which have profound practical significance. However, it is known that under various circumstances, the SOM fails to represent the data accurately. This thesis presents new families of self-organizing ANNs, designed with the goal of overcoming some of the reported handicaps of the SOM. First, the thesis contains a complete survey of the field that pertains to SOMs, which is a contribution to the community in its own right. We then propose a method by which a user-defined tree automatically adapts so as to absorb the essential properties of the stimuli while simultaneously preserving the original properties of the feature space. The resultant tree reveals multi-resolution capabilities, which are helpful for representing the original data set with different numbers of points. These desirable properties are utilized to derive classifiers that are capable of learning from labeled and unlabeled samples simultaneously, and the PR implications are demonstrated by a rigorous set of experiments.
The thesis thereafter contains a pioneering attempt to merge the area of ANNs with the theory of Adaptive Data Structures (ADSs). This is accomplished by considering how the underlying tree itself can be rendered dynamic and adaptively transformed. The PR implications of this are also explored. Finally, the thesis incorporates a hyperplane-based partitioning scheme to reduce the time required to identify the winner neuron, a process that is central to any SOM-based strategy.
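
To make the notions of a "winner neuron" and a tree-constrained neighborhood concrete, the following is a minimal sketch of one generic SOM-style training step in which the neurons are the nodes of a user-defined tree. It is not the thesis's algorithm: the winner is found by plain exhaustive search (the baseline that a partitioning scheme would aim to speed up, not the hyperplane-based scheme itself), and the function names, the adjacency-dictionary representation, the learning rate, and the hop radius are all assumptions made for illustration.

```python
import math


def find_winner(weights, x):
    """Exhaustive search: index of the neuron whose weights are closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def tree_neighbors(adjacency, start, radius):
    """Set of neurons within `radius` tree edges of `start` (breadth-first)."""
    seen = {start}
    frontier = [start]
    for _ in range(radius):
        next_frontier = []
        for node in frontier:
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


def train_step(weights, adjacency, x, rate=0.5, radius=1):
    """Move the winner and its tree neighbors toward the stimulus x (in place)."""
    winner = find_winner(weights, x)
    for i in tree_neighbors(adjacency, winner, radius):
        weights[i] = [w + rate * (xi - w) for w, xi in zip(weights[i], x)]
    return winner


if __name__ == "__main__":
    # Hypothetical three-neuron tree: neuron 0 is the root, 1 and 2 its children.
    weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    adjacency = {0: [1, 2], 1: [0], 2: [0]}
    winner = train_step(weights, adjacency, [1.0, 0.2])
    print("winner:", winner)
    print("weights:", weights)
```

In this sketch the neighborhood is measured in tree edges rather than in feature-space distance, so the tree's structure determines which neurons are updated together. The exhaustive search visits every neuron for each stimulus, which is why winner identification is a natural target for acceleration.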