Learning from data with both test costs and misclassification costs is an active topic in data mining, and many algorithms have been proposed to induce decision trees for this setting. This paper studies a number of such algorithms and presents a competition strategy to obtain trees with lower average cost. First, we generate a population of decision trees using the λ-ID3 and EG2 algorithms, which weigh information gain against test cost. λ-ID3 generalizes three existing algorithms, namely ID3, IDX, and CS-ID3; EG2 is another parameterized algorithm, whose parameter range we extend in this work. Second, we post-prune these trees, trading test cost against misclassification cost. Finally, we select the best decision tree for classification. Experimental results on the Mushroom dataset under various cost settings indicate that: 1) no single parameter value is optimal for λ-ID3 or EG2 across cost settings; 2) the competition strategy is effective in selecting an appropriate decision tree; and 3) post-pruning decreases the average cost effectively.
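The tree population is produced by sweeping the heuristic parameter of each inducer. Below is a minimal sketch of the two attribute-selection criteria, assuming λ-ID3 scores an attribute as Gain(a) / TestCost(a)^λ (λ = 0 recovering ID3, λ = 1 recovering IDX, λ = 0.5 ranking like CS-ID3) and EG2 uses Núñez's Information Cost Function; the function names and the `best_attribute` helper are illustrative, not taken from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Information gain from splitting `rows` on attribute index `attr`."""
    n = len(labels)
    remainder = 0.0
    for value in {row[attr] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder

def lambda_id3_score(gain, test_cost, lam):
    # λ = 0 recovers ID3 (pure gain); λ = 1 recovers IDX (gain / cost);
    # λ = 0.5 ranks attributes the same way as CS-ID3 (gain² / cost),
    # since squaring is monotone for nonnegative gains.
    return gain / (test_cost ** lam)

def eg2_score(gain, test_cost, omega):
    # EG2's Information Cost Function: (2^Gain - 1) / (Cost + 1)^ω.
    # ω = 0 ignores test cost; the paper extends ω beyond its original range.
    return (2.0 ** gain - 1.0) / ((test_cost + 1.0) ** omega)

def best_attribute(rows, labels, test_costs, score, **params):
    """Pick the attribute maximizing the chosen cost-sensitive criterion,
    e.g. best_attribute(rows, labels, costs, lambda_id3_score, lam=0.5)."""
    return max(
        range(len(test_costs)),
        key=lambda a: score(information_gain(rows, labels, a),
                            test_costs[a], **params),
    )
```

Sweeping λ (respectively ω) over a grid of values yields one candidate tree per value; after cost-based post-pruning, the competition strategy keeps the tree whose average total cost (test cost plus misclassification cost per instance) is lowest.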