The rough set approach is an effective feature selection method that preserves the meaning of the features. Many rough-set-based feature selection (also called feature reduction) methods have been proposed. Among them, methods based on the discernibility matrix are attractive for their conciseness and effectiveness, but they incur a much higher space complexity. To reduce the storage required by existing discernibility-matrix-based feature selection methods, a condensing tree (C-Tree) structure was previously introduced. The C-Tree is an extended order-tree in which every nonempty element of a discernibility matrix is stored as a path according to a given feature order; because many nonempty elements share a path or sub-path, the C-Tree has much lower space complexity than the discernibility matrix. However, the size of a C-Tree depends heavily on the feature order in most cases, so choosing a proper order is important. To build a more highly compressed C-Tree, this paper first introduces an efficient technique for measuring the relative importance of each feature, and then presents a new feature ordering strategy that sorts features in descending order of importance. Based on this ordering strategy, two corresponding heuristic feature selection algorithms are introduced. The algorithms are evaluated on six standard datasets and five synthetic datasets with respect to both time and space complexity. Experimental results show that the improved feature selection algorithm further reduces storage cost in most cases.
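To make the prefix-sharing idea concrete, the following minimal Python sketch stores each nonempty discernibility-matrix element as a path in a trie under a caller-supplied feature order. It illustrates the general technique only, not the paper's implementation: the class and method names (CTreeNode, CTree.insert, size) are hypothetical, and raw feature frequency is used as a crude stand-in for the paper's relative-importance measure.

class CTreeNode:
    def __init__(self, feature=None):
        self.feature = feature   # feature labelling this node (None for the root)
        self.count = 0           # matrix elements whose path passes through here
        self.children = {}       # feature -> child CTreeNode

class CTree:
    def __init__(self, feature_order):
        # feature_order: all features, most important first.  The paper
        # derives this ranking from a relative-importance measure; here
        # any caller-supplied ranking is accepted.
        self.rank = {f: i for i, f in enumerate(feature_order)}
        self.root = CTreeNode()

    def insert(self, element):
        # Store one nonempty discernibility-matrix element (a set of
        # features) as a path; shared prefixes reuse existing nodes.
        node = self.root
        for f in sorted(element, key=self.rank.__getitem__):
            node = node.children.setdefault(f, CTreeNode(f))
            node.count += 1

    def size(self):
        # Node count excluding the root: a proxy for storage cost.
        stack, n = list(self.root.children.values()), 0
        while stack:
            node = stack.pop()
            n += 1
            stack.extend(node.children.values())
        return n

# Toy demonstration: order features by how often they occur in the
# matrix elements (a stand-in for the paper's importance measure) and
# compare the resulting tree size against an arbitrary reverse order.
elements = [{'a', 'b'}, {'a', 'b', 'c'}, {'a', 'c'}, {'a', 'd'}]
freq = {}
for e in elements:
    for f in sorted(e):
        freq[f] = freq.get(f, 0) + 1

by_importance = sorted(freq, key=freq.get, reverse=True)  # ['a', 'b', 'c', 'd']
for order in (by_importance, list(reversed(by_importance))):
    tree = CTree(order)
    for e in elements:
        tree.insert(e)
    print(order, '->', tree.size(), 'nodes')

On this toy data the frequency-descending order yields a 5-node tree while the reversed order yields 8 nodes, mirroring the paper's observation that the feature order largely determines how much the C-Tree compresses the discernibility matrix.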