Inter-training: Exploiting unlabeled data in multi-classifier systems
Knowledge-Based Systems
Machine learning techniques used in computer-aided diagnosis (CAD) systems learn a hypothesis that helps medical experts make future diagnoses. Learning a well-performing hypothesis requires a large number of expert-diagnosed examples, which places a heavy burden on the experts. By exploiting large amounts of undiagnosed examples together with the power of ensemble learning, the co-training-style random forest (Co-Forest) eases this burden and produces well-performing hypotheses. However, Co-Forest may suffer from a problem common to other co-training-style algorithms: unlabeled examples may be assigned wrong labels, and these mislabeled examples then accumulate over the training process. This is because the limited number of originally labeled examples usually yields poor component classifiers that lack both diversity and accuracy. In this paper, a new Co-Forest algorithm named Co-Forest with Adaptive Data Editing (ADE-Co-Forest) is proposed. It not only exploits a specific data-editing technique to identify and discard possibly mislabeled examples throughout the co-labeling iterations, but also employs an adaptive strategy that decides, case by case, whether to trigger the editing operation. The adaptive strategy combines five preconditional theorems, each of which guarantees an iterative reduction of the classification error and an increase in the size of the new training set under PAC learning theory. Experiments on UCI datasets and an application to the detection of small pulmonary nodules in chest CT images show that ADE-Co-Forest enhances the performance of a learned hypothesis more effectively than both Co-Forest and DE-Co-Forest (Co-Forest with data editing but without the adaptive strategy).
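The overall scheme described above can be sketched in a few lines of code. The following is a minimal, illustrative Python sketch, not the authors' implementation: it replaces the random-forest members with simple threshold classifiers, uses a plain nearest-neighbor editing rule in place of the paper's specific data-editing technique, and reduces the adaptive strategy to a single size-based trigger. All names (`Stump`, `edit_knn`, `co_forest_ade`) are hypothetical.

```python
# Sketch of a Co-Forest-style semi-supervised loop with a data-editing step.
# Assumptions: 1-D two-class toy data; stumps stand in for random-forest trees.
import random
from collections import Counter

random.seed(0)

# Toy data: class 0 centered at 0.0, class 1 centered at 1.0.
def make_point(label):
    centre = 0.0 if label == 0 else 1.0
    return (centre + random.gauss(0, 0.25), label)

labeled   = [make_point(i % 2) for i in range(10)]
unlabeled = [make_point(i % 2)[0] for i in range(60)]   # true labels hidden

class Stump:
    """Threshold classifier trained on a bootstrap sample (tree stand-in)."""
    def fit(self, data):
        boot = [random.choice(data) for _ in data]
        c0 = [x for x, y in boot if y == 0]
        c1 = [x for x, y in boot if y == 1]
        # threshold halfway between the two class means
        self.t = (sum(c0) / max(len(c0), 1) + sum(c1) / max(len(c1), 1)) / 2
        return self

    def predict(self, x):
        return 0 if x < self.t else 1

def ensemble_vote(members, x, exclude=None):
    """Majority vote; `exclude` selects the concomitant ensemble H_i."""
    votes = [m.predict(x) for i, m in enumerate(members) if i != exclude]
    label, n = Counter(votes).most_common(1)[0]
    return label, n / len(votes)

def edit_knn(candidates, reference, k=3):
    """Editing step: drop a self-labeled example whose k nearest originally
    labeled neighbours disagree with its assigned label (ENN-style rule)."""
    kept = []
    for x, y in candidates:
        nn = sorted(reference, key=lambda p: abs(p[0] - x))[:k]
        majority = Counter(l for _, l in nn).most_common(1)[0][0]
        if majority == y:
            kept.append((x, y))
    return kept

def co_forest_ade(labeled, unlabeled, n_members=5, rounds=3, conf=0.7):
    members = [Stump().fit(labeled) for _ in range(n_members)]
    for _ in range(rounds):
        for i in range(n_members):
            # The concomitant ensemble (all members but i) pseudo-labels
            # unlabeled points it is confident about.
            pseudo = []
            for x in unlabeled:
                y, c = ensemble_vote(members, x, exclude=i)
                if c >= conf:
                    pseudo.append((x, y))
            # Simplified adaptive trigger: edit only when the pseudo-labeled
            # set is large enough that label noise is likely to accumulate.
            if len(pseudo) > len(labeled):
                pseudo = edit_knn(pseudo, labeled)
            members[i] = Stump().fit(labeled + pseudo)
    return members

members = co_forest_ade(labeled, unlabeled)
test = [make_point(i % 2) for i in range(40)]
acc = sum(ensemble_vote(members, x)[0] == y for x, y in test) / len(test)
print(f"ensemble accuracy: {acc:.2f}")
```

The real algorithm differs in the classifiers used (random trees), the editing rule, and in checking the five preconditional theorems before each retraining step; the sketch only conveys the loop structure of co-labeling, editing, and retraining.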