Using approximate reduct and LVQ in case generation for CBR classifiers

  • Authors:
  • Yan Li;Simon Chi-Keung Shiu;Sankar Kumar Pal;James Nga-Kwok Liu

  • Affiliations:
  • College of Mathematics and Computer, Hebei University, Baoding City, Hebei Province, China and Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong;Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong;Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India;Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong

  • Venue:
  • Transactions on Rough Sets VII
  • Year:
  • 2007

Abstract

Case generation is the process of extracting representative cases to form a compact case base. To build competent and efficient CBR classifiers, we develop a case generation approach that integrates fuzzy sets, rough sets, and learning vector quantization (LVQ). First, if the feature values of the cases are numerical, fuzzy sets are used to discretize the feature space. Second, a fast rough set-based feature selection method is applied to identify the significant features. Unlike traditional discernibility function-based methods, this feature reduction method is based on a new concept, the approximate reduct. Third, the representative cases (prototypes) are generated by running LVQ learning on the case base after feature selection. LVQ is the supervised counterpart of the self-organizing map (SOM) and is therefore better suited to classification problems. The result is a small number of prototypes that serve as the representative cases of the original case base. These prototypes can also be regarded as extracted knowledge that improves the understanding of the case base. Three real-life data sets are used in the experiments to demonstrate the effectiveness of this case generation approach. Several evaluation indices are used in these tests: classification accuracy, storage space, case retrieval time, and clustering performance in terms of intra-similarity and inter-similarity.
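
The abstract does not state which LVQ variant or parameter settings the authors use. As a rough illustration of how LVQ can reduce a case base to a few labeled prototypes, the sketch below assumes the basic LVQ1 update rule with squared Euclidean distance and a linearly decaying learning rate. The function names (`train_lvq1`, `classify`), the toy cases, the initial prototypes, and the learning rate are all hypothetical and are not taken from the paper.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(x, protos):
    """Index of the prototype closest to case x."""
    return min(range(len(protos)), key=lambda j: squared_distance(x, protos[j]))


def train_lvq1(cases, labels, prototypes, proto_labels,
               lr=0.1, epochs=20, seed=0):
    """Assumed LVQ1 rule: pull the winning prototype toward a case of the
    same class, push it away from a case of a different class."""
    rng = random.Random(seed)
    protos = [list(p) for p in prototypes]  # copy so the inputs stay unchanged
    order = list(range(len(cases)))
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x = cases[i]
            k = nearest(x, protos)
            sign = 1.0 if proto_labels[k] == labels[i] else -1.0
            protos[k] = [p + sign * rate * (xi - p)
                         for p, xi in zip(protos[k], x)]
    return protos


def classify(x, protos, proto_labels):
    """Label a new case by its nearest prototype (1-NN over prototypes)."""
    return proto_labels[nearest(x, protos)]


# Hypothetical toy case base: two features, two classes.
cases = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
         (1.0, 0.9), (0.9, 1.1), (1.1, 1.0)]
labels = ["a", "a", "a", "b", "b", "b"]

# One hand-picked initial prototype per class (an assumption of this sketch).
init_protos = [(0.3, 0.3), (0.7, 0.7)]
proto_labels = ["a", "b"]

protos = train_lvq1(cases, labels, init_protos, proto_labels)
```

In this sketch the six original cases are replaced by two prototypes, and a new case is classified by calling `classify(new_case, protos, proto_labels)`. Retrieval then compares against the prototypes only, which is the kind of storage and retrieval-time saving the abstract refers to.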