Missing or absent? A Question in Cost-sensitive Decision Tree

Authors:
Zhenxing Qin;Shichao Zhang;Chengqi Zhang
Affiliations:
Faculty of Information Technology, University of Technology, Sydney, PO Box 123, Broadway, Sydney, NSW 2007, Australia, {zqin, zhangsc, chengqi}@it.uts.edu.au
Venue:
Proceedings of the 2006 conference on Advances in Intelligent IT: Active Media Technology 2006
Year:
2006

Abstract

One common source of error in data is the presence of missing values. Imputation, in which missing values are replaced by estimated values, is a widely used technique in the preprocessing phase of data mining. Previous work tries to recover the “original” values according to specific criteria, such as statistical measures. However, in cost-sensitive learning, minimal overall cost is the most important issue; that is, a value that minimizes the total cost is preferred over the value that common sense would consider “best”. For example, in medical domains, some data fields are usually left absent because the known information is enough for a decision. In this paper, we propose a new method to study the problem of “missing or absent values?” in cost-sensitive learning. Experimental results show some improvements when missing and absent data are distinguished in cost-sensitive decision trees.