A novel virtual sample generation method based on Gaussian distribution

Authors:
Jing Yang;Xu Yu;Zhi-Qiang Xie;Jian-Pei Zhang
Affiliations:
College of Computer Science and Technology, Harbin Engineering University, Harbin, China;College of Computer Science and Technology, Harbin Engineering University, Harbin, China;College of Computer Science and Technology, Harbin Engineering University, Harbin, China and College of Computer Science and Technology, Harbin University of Science and Technology, Harbin, China;College of Computer Science and Technology, Harbin Engineering University, Harbin, China
Venue:
Knowledge-Based Systems
Year:
2011

Abstract

Traditional machine learning algorithms often generalize poorly on noisy, imbalanced, or small-sample training sets. This work proposes a novel virtual sample generation (VSG) method based on the Gaussian distribution. First, the method determines the mean and the standard error of the Gaussian distribution. Virtual samples are then drawn from that distribution. Finally, a new training set is constructed by adding the virtual samples to the original training set. The work shows that training on the new training set is equivalent to a form of regularization for small-sample problems, and to cost-sensitive learning for imbalanced-sample problems. Experiments show that, given a suitable number of virtual sample replicates, classifiers trained on the new training sets can generalize better than classifiers trained on the original training sets.
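The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the mean and standard error are determined, so the sketch assumes that each original sample serves as the mean of an isotropic Gaussian and that the spread `sigma` is a user-chosen hyperparameter. The function name `generate_virtual_samples` and the parameters `sigma`, `n_replicates`, and `seed` are hypothetical.

```python
import random


def generate_virtual_samples(X, y, sigma, n_replicates, seed=0):
    """Return the original samples plus Gaussian virtual samples.

    Assumption (not taken from the abstract): each virtual sample is drawn
    from an isotropic Gaussian centred on an original sample, with a
    user-chosen standard deviation `sigma`, and inherits that sample's label.
    """
    rng = random.Random(seed)
    # Start from a copy of the original training set.
    X_new = [list(x) for x in X]
    y_new = list(y)
    for x, label in zip(X, y):
        for _ in range(n_replicates):
            # Perturb every feature independently with Gaussian noise.
            X_new.append([rng.gauss(v, sigma) for v in x])
            y_new.append(label)
    return X_new, y_new


# Toy training set: three samples, two features, binary labels.
X = [[0.0, 0.0], [1.0, 1.0], [4.0, 5.0]]
y = [0, 0, 1]
X_aug, y_aug = generate_virtual_samples(X, y, sigma=0.1, n_replicates=5)
print(len(X_aug), len(y_aug))  # 18 18 (3 originals + 3 * 5 virtual samples)
```

For an imbalanced set, one plausible variation is to call the function only on the minority-class samples, which raises their weight in training in the spirit of the cost-sensitive interpretation mentioned above. The number of replicates and the value of `sigma` would need tuning, for example by cross-validation.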