Selecting Samples and Features for SVM Based on Neighborhood Model

Authors:
Qinghua Hu;Daren Yu;Zongxia Xie
Affiliations:
Harbin Institute of Technology, Harbin 150001, P.R. China;Harbin Institute of Technology, Harbin 150001, P.R. China;Harbin Institute of Technology, Harbin 150001, P.R. China
Venue:
RSFDGrC '07 Proceedings of the 11th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing
Year:
2009

Abstract

Support vector machines (SVMs) are a popular class of learning algorithms with good generalization ability. However, training an SVM on a large sample set is time-consuming, so improving learning efficiency is an important research task. It is known that, although a learning task may offer many candidate training samples, only the samples near the decision boundary influence the classification hyperplane. Finding these samples and training the SVM on them alone may greatly reduce the time and space complexity of training. Based on this observation, we introduce a neighborhood-based rough set model to search for boundary samples. With this model, we divide the sample space into two subsets: the positive region and the boundary samples. We also partition the features into several subsets: strongly relevant features, weakly relevant and indispensable features, weakly relevant and superfluous features, and irrelevant features. We train the SVM with the boundary samples in the relevant and indispensable feature subspaces, so that feature selection and sample selection are conducted simultaneously with the proposed model. Experiments are performed to test the proposed method. The results show that the model selects very few features and samples for training, while classification performance is maintained or improved.
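
The split of the sample space into a positive region and boundary samples can be illustrated with a minimal sketch. The abstract does not define the neighborhood, so this example makes assumptions: a Euclidean neighborhood of radius `delta`, and a rule that a sample belongs to the positive region only when every sample in its neighborhood shares its class label. The function name `split_by_neighborhood`, the parameter `delta`, and the toy data are hypothetical and do not come from the paper.

```python
from math import dist

def split_by_neighborhood(samples, labels, delta):
    """Return (positive, boundary) as lists of sample indices.

    Assumption: the neighborhood of a sample is every sample within
    Euclidean distance `delta` of it (the sample itself included).
    A sample is 'positive' if its whole neighborhood shares its label,
    and 'boundary' otherwise.
    """
    positive, boundary = [], []
    for i, x in enumerate(samples):
        consistent = all(
            labels[j] == labels[i]
            for j, y in enumerate(samples)
            if dist(x, y) <= delta
        )
        (positive if consistent else boundary).append(i)
    return positive, boundary

# Toy one-dimensional data: two classes that meet near 0.5.
samples = [(0.0,), (0.1,), (0.45,), (0.55,), (0.9,), (1.0,)]
labels = [0, 0, 0, 1, 1, 1]

positive, boundary = split_by_neighborhood(samples, labels, delta=0.2)
print(positive)  # indices whose neighborhoods are label-pure
print(boundary)  # indices near the class border
```

On this toy data only the two samples closest to the class border are flagged as boundary samples. Under the approach the abstract describes, only such samples would be passed on to SVM training. The pairwise scan here is quadratic in the number of samples and is meant for illustration only.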