Effects of the neural network s-sigmoid function on KDD in the presence of imprecise data

Authors:
John F. Kros; Mike Lin; Marvin L. Brown
Affiliations:
Department of Decision Sciences, East Carolina University, School of Business, Greenville, NC; Department of Computer and Information Science, School of Business, Cleveland State University, Cleveland, OH; Department of Computer Information Systems, College of Business, Grambling State University, Grambling, LA
Venue:
Computers and Operations Research
Year:
2006

Abstract

This research examines data mining, one step in the Knowledge Discovery in Databases (KDD) process. Data mining is largely concerned with prediction, estimation, classification, pattern recognition, and the development of association rules. Its results therefore depend heavily on the accuracy of the database and on the sample data chosen for model training and testing. Data mining typically searches a concatenation of multiple databases, which usually contain some missing data along with a variable percentage of inaccurate data, pollution, outliers, and noise. Missing data must be addressed, because ignoring it can introduce bias into the models being evaluated and lead to inaccurate data mining conclusions. The objective of this research is to assess the effects of the neural network s-sigmoid function on KDD in the presence of imprecise data, using a three-factor ANOVA test and Tukey's Honestly Significant Difference statistics.

This research investigates the accuracy and impact of the data imputation methodologies employed when a specific data mining algorithm is used within a KDD process. The study employs knowledge discovery processes that are widely accepted in both the academic and commercial worlds. The work tests the impact of missing data on the neural network s-sigmoid transfer function in the data mining process by experimenting with three factors: imputation method, data set size, and level of data missingness.
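The three-factor design described in the abstract can be pictured with a small sketch. Everything specific here is an assumption made for illustration: the abstract does not name the imputation methods, data set sizes, or missingness levels that were tested, so the factor levels, the mean-imputation routine, the completely-at-random missingness mechanism, and the error measure below are hypothetical stand-ins. Only the logistic sigmoid itself is standard. The sketch builds every combination of the three factors and, for each cell, measures how far imputation shifts the output of a single sigmoid unit.

```python
import itertools
import math
import random


def sigmoid(x: float) -> float:
    """Logistic (s-shaped) transfer function, mapping any real to (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def inject_missing(values, rate, rng):
    """Replace roughly `rate` of the entries with None, completely at random."""
    return [None if rng.random() < rate else v for v in values]


def mean_impute(values):
    """Fill each None with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


# Hypothetical factor levels; the abstract does not state the actual ones.
IMPUTERS = {"mean": mean_impute}
SIZES = [100, 1000]
MISSING_RATES = [0.1, 0.3]


def design_cells():
    """Enumerate every (imputation method, data set size, missingness) cell."""
    return list(itertools.product(IMPUTERS, SIZES, MISSING_RATES))


def run_cell(method, size, rate, seed=0):
    """Mean absolute shift in sigmoid output caused by imputation (illustrative)."""
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(size)]
    filled = IMPUTERS[method](inject_missing(truth, rate, rng))
    shifts = (abs(sigmoid(t) - sigmoid(f)) for t, f in zip(truth, filled))
    return sum(shifts) / size


if __name__ == "__main__":
    for method, size, rate in design_cells():
        print(method, size, rate, run_cell(method, size, rate))
```

In a full study, each cell would be replicated several times and the resulting responses passed to a three-factor ANOVA followed by Tukey's HSD comparisons; that statistical step is left out of this sketch.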