MITI is a simple and elegant decision tree learner designed for multi-instance classification problems, where each example consists of a bag of instances. MITI grows a tree in a best-first manner, maintaining a priority queue of the unexpanded nodes in the fringe of the tree. When the node at the head of the queue contains instances from positive bags only, it is made into a leaf, and every bag associated with this leaf is removed from the data. In this paper we first revisit the basic algorithm and consider the effect of parameter settings on classification accuracy, using several benchmark datasets; we show that the choice of splitting criterion in particular can have a significant effect on accuracy. We then identify a potential weakness of the algorithm: subtrees can contain structure that was created using data that is subsequently removed. We show that a simple modification turns the algorithm into a rule learner that avoids this problem and produces more compact classifiers with comparable accuracy on the benchmark datasets we consider. Finally, we present randomized variants of the algorithm that enable us to generate ensemble classifiers, and show that these can yield substantially improved classification accuracy.
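To make the best-first mechanism concrete, here is a minimal Python sketch of MITI-style tree growth. It is an illustration under stated assumptions, not the paper's implementation: the Instance record, grow_miti, and the weighted-Gini split score are hypothetical choices made for brevity (the effect of the actual splitting criterion is precisely what the paper investigates), and the sketch records leaves only rather than building a traversable tree. The two behaviours it shows are the priority queue over the fringe and the deactivation of bags when a purely positive node becomes a leaf.

import heapq
import itertools
from collections import namedtuple

# Each instance carries its bag's identifier and the bag-level class label.
Instance = namedtuple("Instance", ["x", "bag_id", "label"])

def gini(insts):
    # Gini impurity of a node, computed over bag labels (1 = positive).
    if not insts:
        return 0.0
    p = sum(1 for i in insts if i.label == 1) / len(insts)
    return 2.0 * p * (1.0 - p)

def best_split(insts):
    # Exhaustive axis-parallel split minimising weighted Gini; a stand-in
    # for the splitting criteria whose effect the paper investigates.
    best = None
    for f in range(len(insts[0].x)):
        values = sorted({i.x[f] for i in insts})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [i for i in insts if i.x[f] <= t]
            right = [i for i in insts if i.x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(insts)
            if best is None or score < best[0]:
                best = (score, left, right)
    return best  # None if no split is possible (all feature values identical)

def grow_miti(instances):
    # Best-first growth: a priority queue holds the unexpanded fringe nodes,
    # ordered by their proportion of positive instances.
    tiebreak = itertools.count()
    fringe, dead_bags, leaves = [], set(), []

    def push(insts):
        pos = sum(1 for i in insts if i.label == 1) / len(insts)
        heapq.heappush(fringe, (-pos, next(tiebreak), insts))

    push(list(instances))
    while fringe:
        _, _, insts = heapq.heappop(fringe)
        # Drop instances whose bags were deactivated by an earlier leaf.
        insts = [i for i in insts if i.bag_id not in dead_bags]
        if not insts:
            continue
        labels = {i.label for i in insts}
        if labels == {1}:
            # Pure positive node: make it a leaf and deactivate its bags so
            # their remaining instances no longer influence tree growth.
            dead_bags.update(i.bag_id for i in insts)
            leaves.append(("positive", insts))
            continue
        split = None if labels == {0} else best_split(insts)
        if split is None:
            leaves.append(("negative", insts))
            continue
        _, left, right = split
        push(left)
        push(right)
    # Only leaves are recorded here; a real learner would also keep the
    # split tests so that new bags can be classified by tree traversal.
    return leaves

A toy run with two positive bags and one negative bag of one-dimensional instances: the high-valued instances of the positive bags end up in a pure positive leaf, which deactivates those bags, and the remaining negative bag forms a negative leaf.

data = [Instance((0.9,), "b1", 1), Instance((0.1,), "b1", 1),
        Instance((0.8,), "b2", 1), Instance((0.2,), "b2", 1),
        Instance((0.1,), "b3", 0), Instance((0.2,), "b3", 0)]
print(grow_miti(data))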