A Study in Granular Computing: On Classifiers Induced from Granular Reflections of Data

  • Authors:
  • Lech Polkowski; Piotr Artiemjew

  • Affiliations:
  • Polish-Japanese Institute of Information Technology, Warszawa, Poland 02008; University of Warmia and Mazury, Olsztyn, Poland 10560

  • Venue:
  • Transactions on Rough Sets IX
  • Year:
  • 2008

Abstract

Granular Computing, as a paradigm in the area of Approximate Reasoning/Soft Computing, goes back to the idea of L. A. Zadeh (1979) of computing with collections of similar entities. Both fuzzy and rough set theories are inherently concerned with granules, as atomic units of knowledge are inverse images of fuzzy membership functions in the former and indiscernibility classes in the latter.
Research on granulation in the framework of rough set theory started soon after Zadeh's program manifesto (T. Y. Lin, L. Polkowski, Qing Liu, A. Skowron, J. Stepaniuk, Y. Y. Yao), with various tools from the general theory of binary relations (T. Y. Lin, Y. Y. Yao), rough mereology (L. Polkowski, A. Skowron), approximation spaces (A. Skowron and J. Stepaniuk), and logics for approximate reasoning (L. Polkowski, M. Semeniuk-Polkowska, Qing Liu).
The program of granular computing requires that granules formed from entities described by data should enter the computing process as elementary units of computation; this program has been pursued in some aspects of reasoning under uncertainty, such as fusion of knowledge, rough-neural computing, and many-agent systems.
In this work, granules of knowledge are exploited in tasks of classification of data. This research is a follow-up on the program initiated by the first author in plenary talks at the IEEE International Conferences on Granular Computing in Beijing, 2005, and Atlanta, 2006. The idea of this program consists in granulating data and creating a granular data set (called the granular reflection of the original data set); because granulation is expected to smooth the data, eliminate outliers, and average attribute values, classification on the basis of granular data is expected to be of satisfactory quality, i.e., granulation should preserve the information encoded in the data to a satisfactory degree.
It should be stressed, however, that the proposed process of building a granular structure involves a few random procedures (factoring attributes through a granule, selection of a granular covering of the universe of objects), which makes rigorous analysis difficult. It is the aim of this work to verify the program of granular classification on the basis of experiments with real data.
Granules of knowledge are in this work defined and computed along lines proposed by Polkowski in the framework of rough mereology: this involves the use of similarity measures called rough inclusions, along with techniques of the mereological theory of concepts. In consequence, definitions of granules are invariant with respect to the choice of the underlying similarity measure.
Granules of knowledge enter the realm of classification problems in this work from a threefold perspective: first, granulated data sets give rise to new data sets on which classifiers are tested, and the results are compared to results obtained with the same classifiers on the original data sets; next, granules of training objects as well as granules of rules obtained from the training set vote for the value of the decision at a test object; this is repeated with granules of granular reflections of granules and with granules of rules obtained from granulated data sets. Finally, the voting is augmented with weights resulting from the distribution of attribute values between the test object and training objects.
In the first case, the rough inclusion based on Hamming's metric is applied (or, equivalently, the rough inclusion produced from the Archimedean t-norm of Łukasiewicz); in the last two cases, rough inclusions are produced on the basis of residual implications induced from the continuous t-norms of Łukasiewicz, the product t-norm, and the minimum t-norm, respectively.
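The similarity measures named above can be illustrated with a small sketch. It assumes that objects are equal-length tuples of attribute values and that truth degrees lie in [0, 1]; the function names are illustrative and do not come from the paper. The Hamming-based degree is taken here as the fraction of attributes on which two objects agree, and the three residua are the standard residual implications of the Łukasiewicz, product, and minimum t-norms. How the paper builds rough inclusions on attribute values from these residua is not stated in the abstract, so that step is left out.

```python
def hamming_inclusion(u, v):
    """Fraction of attributes on which objects u and v agree.

    Assumption: u and v are equal-length sequences of attribute values.
    """
    if len(u) != len(v) or len(u) == 0:
        raise ValueError("objects must be non-empty and of equal length")
    return sum(a == b for a, b in zip(u, v)) / len(u)


def residuum_lukasiewicz(x, y):
    """Residual implication of the Lukasiewicz t-norm max(0, x + y - 1)."""
    return min(1.0, 1.0 - x + y)


def residuum_product(x, y):
    """Residual implication of the product t-norm x * y (Goguen implication)."""
    return 1.0 if x <= y else y / x


def residuum_minimum(x, y):
    """Residual implication of the minimum t-norm min(x, y) (Goedel implication)."""
    return 1.0 if x <= y else y


if __name__ == "__main__":
    print(hamming_inclusion((1, 2, 3, 4), (1, 2, 0, 4)))
    for residuum in (residuum_lukasiewicz, residuum_product, residuum_minimum):
        print(residuum.__name__, residuum(0.8, 0.4))
```

All three residua return 1 whenever x <= y and differ only in how fast the degree falls once x exceeds y, which is one way to see why the choice of t-norm changes the resulting similarity degrees.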
In all cases, results of experiments on chosen real data sets, those most often used as test data for rough set methods, are very satisfactory and, in some cases, better than those of many other rough set based classification methods.
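The idea of a granular reflection described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the granule of an object is assumed to collect all training objects that agree with it on at least a fraction r of attributes, the covering is chosen by a simple sequential scan (the abstract mentions a random selection instead), and each granule is assumed to be collapsed into one new object by majority voting on every attribute and on the decision. All names are hypothetical.

```python
from collections import Counter


def agreement(u, v):
    """Fraction of attributes on which u and v take the same value."""
    return sum(a == b for a, b in zip(u, v)) / len(u)


def granule(center, objects, r):
    """Indices of objects that agree with `center` to degree at least r."""
    return [i for i, v in enumerate(objects) if agreement(center, v) >= r]


def covering(objects, r):
    """Pick granules in index order until every object is covered.

    Assumption: a deterministic scan stands in for the random selection
    of a covering mentioned in the abstract.
    """
    uncovered = set(range(len(objects)))
    chosen = []
    for i, u in enumerate(objects):
        if i in uncovered:
            g = granule(u, objects, r)
            chosen.append(g)
            uncovered -= set(g)
    return chosen


def majority_object(indices, rows):
    """Collapse the rows at `indices` into one row by per-column majority vote."""
    columns = zip(*(rows[i] for i in indices))
    return tuple(Counter(col).most_common(1)[0][0] for col in columns)


def granular_reflection(objects, decisions, r):
    """Return one (attributes..., decision) row per granule in the covering."""
    rows = [tuple(o) + (d,) for o, d in zip(objects, decisions)]
    return [majority_object(g, rows) for g in covering(objects, r)]


if __name__ == "__main__":
    objects = [(1, 1, 0), (1, 1, 1), (1, 1, 0), (0, 0, 0), (0, 0, 1), (0, 0, 0)]
    decisions = ["a", "a", "a", "b", "b", "b"]
    for row in granular_reflection(objects, decisions, r=2 / 3):
        print(row)
```

Lowering r makes granules larger and the reflected data set smaller; at r = 1 only identical objects are merged, so the reflection stays close to the original data.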