Prototype reduction schemes applicable for non-stationary data sets

  • Authors:
  • Sang-Woon Kim; B. John Oommen

  • Affiliations:
  • Department of Computer Science and Engineering, Myongji University, Yongin 449-728, Korea; School of Computer Science, Carleton University, Ottawa, Canada K1S 5B6

  • Venue:
  • Pattern Recognition
  • Year:
  • 2006

Abstract

All of the prototype reduction schemes (PRS) reported in the literature process time-invariant data to yield a subset of prototypes that are useful in nearest-neighbor-like classification. Although these methods have proven to be powerful, they suffer from a major disadvantage when applied to non-stationary data, namely time-varying samples, which are typical of video and multimedia applications. In this paper, we suggest two PRS mechanisms, each suited to one of two distinct models of non-stationarity. In the first model, the data points obtained at discrete time steps are individually assumed to be perturbed in the feature space because of noise in the measurements or features. In the second model, by contrast, we assume that new data points become available at discrete time steps, and that these are themselves generated by a non-stationarity in the parameters of the feature space. In both cases, rather than processing all the data as a whole set using a PRS, we propose that the information gleaned from a previous PRS computation be enhanced to yield the prototypes for the current data set using an LVQ-3 type "fine tuning". The results are, to our knowledge, the first reported PRS results for non-stationary data, and can be summarized as follows: if the system obeys the first model of non-stationarity, the improved accuracy is as high as 90.98% for the artificial data set "Non_normal 2", and as high as 97.62% for the real-life data set "Arrhythmia". If the system instead obeys the second model of non-stationarity, the improved accuracy is as high as 76.30% for the artificial data, and as high as 97.40% for this real-life data set. These results are, in our opinion, very impressive, considering that the data sets are truly time-varying.
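
To make the idea of LVQ-3 type fine tuning concrete, the following is a minimal sketch of one textbook LVQ3 update, applied to prototypes carried over from an earlier time step. It is not taken from the paper: the abstract does not give the exact variant or parameter values, so the function names `lvq3_step` and `fine_tune`, the learning rate `alpha`, the same-class factor `epsilon`, the relative `window` width, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math


def lvq3_step(prototypes, labels, x, y, alpha=0.05, epsilon=0.2, window=0.3):
    """Apply one textbook LVQ3 update in place; return True if prototypes moved.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    if len(prototypes) < 2:
        return False

    # Find the two prototypes nearest to the sample x (Euclidean distance).
    order = sorted(range(len(prototypes)), key=lambda k: math.dist(prototypes[k], x))
    i, j = order[0], order[1]
    di, dj = math.dist(prototypes[i], x), math.dist(prototypes[j], x)

    def move(k, rate):
        # Positive rate pulls prototype k toward x; negative rate pushes it away.
        prototypes[k] = [p + rate * (xi - p) for p, xi in zip(prototypes[k], x)]

    # Case 1: both nearest prototypes share the sample's class; nudge both gently.
    if labels[i] == y and labels[j] == y:
        move(i, epsilon * alpha)
        move(j, epsilon * alpha)
        return True

    # Case 2: exactly one of the two has the correct class and x lies in the
    # window around the midplane between them.
    if (labels[i] == y) != (labels[j] == y):
        s = (1 - window) / (1 + window)
        if di > 0 and dj > 0 and min(di / dj, dj / di) > s:
            right, wrong = (i, j) if labels[i] == y else (j, i)
            move(right, alpha)
            move(wrong, -alpha)
            return True

    return False


def fine_tune(prototypes, labels, samples, sample_labels, epochs=5, **kwargs):
    """Adjust previously computed prototypes using the current data set."""
    tuned = [list(p) for p in prototypes]  # leave the caller's prototypes untouched
    for _ in range(epochs):
        for x, y in zip(samples, sample_labels):
            lvq3_step(tuned, labels, x, y, **kwargs)
    return tuned
```

In this sketch, the prototypes from the previous PRS computation would be passed to `fine_tune` together with the samples of the current time step, so that only a local adjustment is performed instead of rerunning the PRS on all the data.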