Objective: Using an automatic, data-driven approach, this paper develops a prediction model that achieves a more balanced performance (in terms of sensitivity and specificity) than the Canadian Assessment of Tomography for Childhood Head Injury (CATCH) rule when predicting the need for computed tomography (CT) imaging of children after a minor head injury.

Methods and materials: CT is widely considered an effective tool for evaluating patients with minor head trauma who may have suffered a serious intracranial injury. However, its use carries potential harms, particularly for children, owing to radiation exposure. These safety concerns, together with issues of cost and practice variability, have prompted calls for effective methods of deciding when CT imaging is needed. Clinical decision rules are such methods and are normally derived from the analysis of large, prospectively collected patient data sets. The CATCH rule was created by a group of Canadian pediatric emergency physicians to support the decision of referring children with minor head injury for CT imaging; its goal was to maximize the sensitivity of predictions of potential intracranial lesions while keeping specificity at a reasonable level. After extensive analysis of the CATCH data set, which is characterized by severe class imbalance, and after a thorough evaluation of several data mining methods, we derived an ensemble of multiple Naive Bayes classifiers as the prediction model for CT imaging decisions.

Results: In the first phase of the experiment we compared the proposed ensemble model to other ensemble models employing rule-, tree- and instance-based member classifiers. Our prediction model demonstrated the best performance in terms of the AUC, G-mean and sensitivity measures.
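The balanced-performance measures named above (sensitivity, specificity, and their geometric mean, the G-mean) can be made concrete. The following is a minimal sketch, not the study's actual evaluation code; the function names are illustrative and label 1 stands for the positive ("CT needed") class.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, FN, TN, FP, treating label 1 as the positive (CT-needed) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp

def balanced_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, G-mean) for binary predictions."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)   # true positive rate: injuries correctly flagged
    specificity = tn / (tn + fp)   # true negative rate: unnecessary scans avoided
    g_mean = math.sqrt(sensitivity * specificity)  # penalizes imbalance between the two
    return sensitivity, specificity, g_mean
```

Unlike plain accuracy, the G-mean collapses to zero if either class is entirely misclassified, which is why it is a common summary measure on severely imbalanced data sets such as CATCH.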
In the second phase, using a bootstrapping experiment similar to that reported by the CATCH investigators, we showed that the proposed ensemble model achieved a more balanced predictive performance than the CATCH rule, with an average sensitivity of 82.8% and an average specificity of 74.4% (vs. 98.1% and 50.0%, respectively, for the CATCH rule).

Conclusion: Automatically derived prediction models cannot replace a physician's acumen. However, they help establish reference performance indicators for developing clinical decision rules, so that the trade-off between prediction sensitivity and specificity is better understood.
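The bootstrapping experiment mentioned above repeatedly resamples the data set with replacement and averages per-round performance. The exact resampling protocol of the CATCH investigators is not described here; the sketch below is a generic out-of-bag bootstrap with hypothetical names, shown only to illustrate the idea.

```python
import random

def bootstrap_splits(n, n_rounds=10, seed=0):
    """Yield (train_idx, test_idx) pairs for n_rounds bootstrap rounds.

    Each training set is a sample of size n drawn with replacement from
    range(n); the out-of-bag indices (those never drawn) form the test set.
    """
    rng = random.Random(seed)
    for _ in range(n_rounds):
        train = [rng.randrange(n) for _ in range(n)]
        drawn = set(train)
        oob = [i for i in range(n) if i not in drawn]
        yield train, oob
```

In an evaluation loop, a model would be fit on each bootstrap sample, scored on the out-of-bag cases, and the per-round sensitivity and specificity averaged to produce summary figures such as the 82.8% / 74.4% reported above.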