Using sensitivity analysis and visualization techniques to open black box data mining models

Authors:
Paulo Cortez; Mark J. Embrechts
Affiliations:
Centro Algoritmi, Departamento de Sistemas de Informação, Universidade do Minho, Campus de Azurém, 4800-058 Guimarães, Portugal; Department of Industrial and Systems Engineering, Rensselaer Polytechnic Institute, CII 5129, Troy, NY 12180, USA
Venue:
Information Sciences: an International Journal
Year:
2013


Abstract

In this paper, we propose a new visualization approach based on Sensitivity Analysis (SA) to extract human-understandable knowledge from supervised learning black box data mining models, such as Neural Networks (NNs), Support Vector Machines (SVMs) and ensembles, including Random Forests (RFs). Five SA methods (three of which are new) and four measures of input importance (one of which is novel) are presented. The SA approach is also adapted to handle discrete variables and to aggregate multiple sensitivity responses. Moreover, several visualizations of the SA results are introduced, such as the input pair importance color matrix and the variable effect characteristic surface. A wide range of experiments was performed to test the SA methods and measures by fitting four well-known models (NN, SVM, RF and decision trees) to synthetic datasets (five regression and five classification tasks). In addition, the visualization capabilities of the SA are demonstrated using four real-world datasets (e.g., bank direct marketing and white wine quality).
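To make the general idea of sensitivity analysis concrete, the following is a minimal sketch, not the authors' implementation. It assumes one simple variant: each input is varied in turn over a few evenly spaced levels while the remaining inputs are held at their mean, and the spread of the model's responses is turned into a relative importance score. The function names (`sensitivity_1d`, `relative_importance`, `black_box`), the choice of 7 levels, the mean baseline, and the two spread measures (range and variance) are all illustrative assumptions; the abstract does not specify the paper's five methods or four measures.

```python
import numpy as np


def sensitivity_1d(predict, X, levels=7):
    """Hypothetical helper: vary one input at a time over `levels` evenly
    spaced values (min to max of that column) while holding the other
    inputs at their column means. Returns an (n_inputs, levels) array."""
    X = np.asarray(X, dtype=float)
    baseline = X.mean(axis=0)
    responses = []
    for j in range(X.shape[1]):
        grid = np.linspace(X[:, j].min(), X[:, j].max(), levels)
        probes = np.tile(baseline, (levels, 1))  # one baseline row per level
        probes[:, j] = grid                      # sweep only input j
        responses.append(np.asarray(predict(probes), dtype=float))
    return np.array(responses)


def relative_importance(responses, measure="range"):
    """Hypothetical helper: turn per-input response curves into scores
    that sum to 1. The two measures here are assumed for illustration."""
    if measure == "range":
        scores = responses.max(axis=1) - responses.min(axis=1)
    elif measure == "variance":
        scores = responses.var(axis=1)
    else:
        raise ValueError(f"unknown measure: {measure}")
    total = scores.sum()
    if total == 0:
        # Model is flat along every sweep: treat all inputs as equal.
        return np.full(len(scores), 1.0 / len(scores))
    return scores / total


def black_box(X):
    """Stand-in for a fitted model: depends on inputs 0 and 1, ignores 2."""
    return 3.0 * X[:, 0] + np.sin(np.pi * X[:, 1])


rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))

responses = sensitivity_1d(black_box, X)
importance = relative_importance(responses, measure="range")
for j, score in enumerate(importance):
    print(f"input {j}: relative importance {score:.3f}")
```

Any fitted model that exposes a prediction function over a 2-D array could be passed in place of `black_box`. Plotting each row of `responses` against its sweep grid would give a simple per-input effect curve, which is one way such results are commonly visualized.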