Summarizing data using a similarity based mountain method

  • Authors: Ronald R. Yager; Dimitar P. Filev

  • Affiliations: Machine Intelligence Institute, Iona College, New Rochelle, NY 10801, United States; Ford Motor Company, Detroit, MI 48239, United States

  • Venue: Information Sciences: an International Journal
  • Year: 2008

Abstract

We consider the problem of summarizing a collection of data values. Here we use a mountain-method-like approach based on the similarities of the data. Fundamental to our work is the possibility of allowing for multiple summarizing values. We present an algorithm, in the spirit of the mountain method, that uses the similarity between the data points to find focus points, which serve as the seeds for finding summarizing centers. Central to this algorithm is a process of reducing the energy of the data points, which we show can be implemented most generally using a t-norm. We provide an application of the algorithm to the problem of binning data, which is used in data mining and the development of histograms. Here we allow the location of the bins to be determined by the data rather than fixed a priori.
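
To make the idea in the abstract concrete, the following is a minimal sketch of one possible reading of a similarity-based mountain procedure. It is not the authors' algorithm: the abstract gives no formulas, so the Gaussian similarity, the width parameter `sigma`, the per-point energy list, the reduction rule `T(energy, 1 - similarity to the focus)`, and the function names `find_focus_points` and `assign_bins` are all assumptions made for illustration. The refinement of focus points into summarizing centers is omitted.

```python
import math


def similarity(a, b, sigma=1.0):
    """Assumed Gaussian similarity in [0, 1]; the paper may use another measure."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def product_tnorm(a, b):
    """Product t-norm, one common choice of t-norm."""
    return a * b


def min_tnorm(a, b):
    """Minimum t-norm, another common choice."""
    return min(a, b)


def find_focus_points(data, n_focus, sigma=1.0, tnorm=product_tnorm):
    """Pick n_focus data points that successively have the highest 'mountain'.

    Assumption: each point starts with energy 1. A point's mountain value is
    the energy-weighted sum of its similarities to all points. After a focus
    is chosen, every point's energy is reduced with a t-norm so that points
    similar to the focus contribute little to later mountains.
    """
    energy = [1.0] * len(data)
    focus_points = []
    for _ in range(n_focus):
        mountain = [
            sum(e * similarity(x, y, sigma) for e, y in zip(energy, data))
            for x in data
        ]
        best = max(range(len(data)), key=mountain.__getitem__)
        focus = data[best]
        focus_points.append(focus)
        # Hypothetical reduction rule: combine the old energy with the
        # dissimilarity to the chosen focus using the supplied t-norm.
        energy = [
            tnorm(e, 1.0 - similarity(y, focus, sigma))
            for e, y in zip(energy, data)
        ]
    return focus_points


def assign_bins(data, centers, sigma=1.0):
    """Assign each value to the index of its most similar center (data-driven bins)."""
    return [
        max(range(len(centers)), key=lambda k: similarity(x, centers[k], sigma))
        for x in data
    ]


data = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1]
focus = find_focus_points(data, n_focus=2, sigma=0.5)
bins = assign_bins(data, focus, sigma=0.5)
```

In this sketch the bin locations come from the data themselves: the two focus points fall inside the two groups of values, and each value is binned with the focus it most resembles. Swapping `product_tnorm` for `min_tnorm` changes only how quickly energy is depleted around earlier focus points.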