Convexity dependant hyperbolic filtering scheme for mode detection in pattern classification

  • Authors:
  • R. Allaoui;A. Sbihi

  • Affiliations:
  • Faculté des Sciences, University Ibn Tofaïl, LIRF, BP 133, 14000 Kénitra, Morocco. E-mail: abderrahmane.sbihi@caramail.com, rabha.allaoui@caramail.com

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2002

Abstract

Among the existing procedures for detecting the modes of the underlying probability density function (pdf) as a preliminary to unsupervised statistical clustering, those that search for modes as regions where the pdf is concave remain very attractive. These techniques use a test that determines the local convexity of the underlying pdf from the input patterns. However, the test area of sampling points may straddle a boundary between a convex region and a concave one, so that the assumptions of the local convexity test can be violated. Furthermore, this local test is very sensitive to details in the data structure and rapidly becomes impracticable as the dimensionality of the data increases. This paper presents an alternative based on global convexity analysis instead of local convexity testing. A recursive separable hyperbolic filter, the principal tool of the proposed technique, is generalized to a multidimensional space. The filter comes with a reliability criterion that makes it possible to model both the variations of the pdf and the noise attached to the density function. Based on the characteristic theorem of convexity, the proposed technique assigns the concave label to modal regions and the convex label to valleys of the pdf according to an appropriate hyperbolic filtering scheme. Modes are then extracted as concave connected components corresponding to the clusters in the mixture, and are used to assign the available observations to the clusters attached to them. Experimental results on real and artificially generated data sets of varying complexity demonstrate the effectiveness of the proposed method, which requires neither a starting classification nor a priori knowledge of the number of clusters or their distribution.
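
The labelling idea in the abstract (concave regions are modes, convex regions are valleys, modes are connected concave components) can be illustrated with a rough one-dimensional sketch. This is not the authors' method. The abstract does not specify the hyperbolic filter, so a repeated moving average is swapped in as the smoother; the sign of the discrete second difference stands in for the convexity analysis; and the nearest-centre assignment rule is an assumption. All function names, thresholds and parameters below are hypothetical choices made for illustration.

```python
import random


def histogram_density(samples, lo, hi, bins):
    """Normalized histogram estimate of the pdf on [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), bins - 1)] += 1
    return [c / (len(samples) * width) for c in counts]


def smooth(values, radius, passes=3):
    """Assumed stand-in smoother: repeated moving average.
    This is NOT the recursive hyperbolic filter of the paper."""
    out = list(values)
    for _ in range(passes):
        prev, out = out, []
        for i in range(len(prev)):
            a, b = max(0, i - radius), min(len(prev), i + radius + 1)
            out.append(sum(prev[a:b]) / (b - a))
    return out


def concave_mask(density, min_height):
    """True for interior bins whose second difference is negative
    (concave) and whose density exceeds min_height."""
    mask = [False] * len(density)
    for i in range(1, len(density) - 1):
        second = density[i - 1] - 2 * density[i] + density[i + 1]
        mask[i] = second < 0 and density[i] > min_height
    return mask


def connected_runs(mask, min_len):
    """Maximal runs of True bins with at least min_len bins,
    returned as (start, end) index pairs, end inclusive."""
    runs, start = [], None
    for i, flag in enumerate(list(mask) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs


def detect_modes(samples, lo, hi, bins=60, radius=3, min_len=3):
    """Return the centre of each concave connected component."""
    density = smooth(histogram_density(samples, lo, hi, bins), radius)
    mask = concave_mask(density, 0.1 * max(density))
    width = (hi - lo) / bins
    return [lo + (s + e + 1) / 2 * width
            for s, e in connected_runs(mask, min_len)]


def assign(samples, centres):
    """Assumed rule: label each sample with its nearest mode centre."""
    return [min(range(len(centres)), key=lambda k: abs(x - centres[k]))
            for x in samples]


# Two-component Gaussian mixture as artificial data.
random.seed(0)
data = ([random.gauss(-3.0, 1.0) for _ in range(2000)]
        + [random.gauss(3.0, 1.0) for _ in range(2000)])
centres = detect_modes(data, -8.0, 8.0)
labels = assign(data, centres)
print(len(centres), [round(c, 2) for c in centres])
```

Note that the number of clusters is never passed in: it falls out of the number of concave components, which mirrors the claim that no a priori cluster count is needed. The density floor (10% of the peak) and the minimum run length are ad hoc guards against noise in this sketch, whereas the paper attributes that role to the filter's reliability criterion.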