Improve Computer-Aided Diagnosis With Machine Learning Techniques Using Undiagnosed Samples

Authors:
Ming Li; Zhi-Hua Zhou
Affiliations:
Nanjing Univ., Nanjing;-
Venue:
IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans
Year:
2007

Cited 39

Semisupervised Regression with Cotraining-Style Algorithms

IEEE Transactions on Knowledge and Data Engineering
Traffic classification using ensemble learning and co-training

AIC'08 Proceedings of the 8th conference on Applied informatics and communications
Semi-supervised document retrieval

Information Processing and Management: an International Journal
Semi-supervised Learning with Multimodal Perturbation

ISNN '09 Proceedings of the 6th International Symposium on Neural Networks on Advances in Neural Networks
Supervised Selective Combining Pattern Recognition Modalities and Its Application to Signature Verification by Fusing On-Line and Off-Line Kernels

MCS '09 Proceedings of the 8th International Workshop on Multiple Classifier Systems
Recruiter selection model and implementation within the united states army

IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
New Labeling Strategy for Semi-supervised Document Categorization

KSEM '09 Proceedings of the 3rd International Conference on Knowledge Science, Engineering and Management
Semi-supervised Classification Based on Clustering Ensembles

AICI '09 Proceedings of the International Conference on Artificial Intelligence and Computational Intelligence
A data-driven approach to manage the length of stay for appendectomy patients

IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans
2010 Special Issue: Semi-supervised learning for tree-structured ensembles of RBF networks with Co-Training

Neural Networks
Random relevant and non-redundant feature subspaces for co-training

IDEAL'09 Proceedings of the 10th international conference on Intelligent data engineering and automated learning
Semi-supervised learning applied to large data sets with very few labeled examples

FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 1
Co-training with relevant random subspaces

Neurocomputing
A framework for microarray data-based tumor diagnostic system with improving performance incrementally

Expert Systems with Applications: An International Journal
A classification algorithm based on local cluster centers with a few labeled training examples

Knowledge-Based Systems
Question classification based on co-training style semi-supervised learning

Pattern Recognition Letters
Simple semi-supervised training of part-of-speech taggers

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Semi-supervised dependency parsing using generalized tri-training

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Combining committee-based semi-supervised learning and active learning

Journal of Computer Science and Technology
A refinement approach to handling model misfit in semi-supervised learning

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications - Volume Part II
Software defect detection with ROCUS

Journal of Computer Science and Technology
A new co-training-style random forest for computer aided diagnosis

Journal of Intelligent Information Systems
Diverse reduct subspaces based co-training for partially labeled data

International Journal of Approximate Reasoning
Classifier ensemble construction with rotation forest to improve medical diagnosis performance of machine learning algorithms

Computer Methods and Programs in Biomedicine
Combining active learning and semi-supervised for improving learning performance

Proceedings of the 4th International Symposium on Applied Sciences in Biomedical and Communication Technologies
Sample-based software defect prediction with active and semi-supervised learning

Automated Software Engineering
Combining committee-based semi-supervised and active learning and its application to handwritten digits recognition

MCS'10 Proceedings of the 9th international conference on Multiple Classifier Systems
DCPE co-training for classification

Neurocomputing
Unlabeled data and multiple views

PSL'11 Proceedings of the First IAPR TC3 conference on Partially Supervised Learning
Online semi-supervised ensemble updates for fMRI data

PSL'11 Proceedings of the First IAPR TC3 conference on Partially Supervised Learning
A semi-supervised feature ranking method with ensemble learning

Pattern Recognition Letters
Exploiting unlabeled data to enhance ensemble diversity

Data Mining and Knowledge Discovery
Inter-training: Exploiting unlabeled data in multi-classifier systems

Knowledge-Based Systems
Web page and image semi-supervised classification with heterogeneous information fusion

Journal of Information Science
Multi-view semi-supervised web image classification via co-graph

Neurocomputing
Effective and efficient microprocessor design space exploration using unlabeled design configurations

ACM Transactions on Intelligent Systems and Technology (TIST) - Special Section on Intelligent Mobile Knowledge Discovery and Management Systems and Special Issue on Social Web Mining
Pattern classification and clustering: A review of partially supervised learning approaches

Pattern Recognition Letters
On the characterization of noise filters for self-training semi-supervised in nearest neighbor classification

Neurocomputing
A novel approach for change detection of remotely sensed images using semi-supervised multiple classifier system

Information Sciences: an International Journal

Quantified Score

Hi-index	0.00

Abstract

In computer-aided diagnosis (CAD), machine learning techniques have been widely applied to learn a hypothesis from diagnosed samples to assist medical experts in making a diagnosis. Learning a hypothesis that performs well requires a large number of diagnosed samples. Although samples can be easily collected from routine medical examinations, it is usually impossible for medical experts to diagnose every collected sample. If a hypothesis could be learned in the presence of a large number of undiagnosed samples, the heavy burden on medical experts could be relieved. In this paper, a new semisupervised learning algorithm named Co-Forest is proposed. It extends the co-training paradigm by using a well-known ensemble method named Random Forest, which enables Co-Forest to estimate the labeling confidence of undiagnosed samples and to produce the final hypothesis easily. Experiments on benchmark data sets verify the effectiveness of the proposed algorithm. Case studies on three medical data sets and a successful application to microcalcification detection for breast cancer diagnosis show that undiagnosed samples are helpful in building CAD systems, and that Co-Forest can enhance the performance of a hypothesis learned on only a small number of diagnosed samples by utilizing the available undiagnosed samples.
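
The idea summarized in the abstract, an ensemble whose members label undiagnosed samples for one another when the rest of the ensemble agrees confidently, can be illustrated with a minimal sketch. This is not the authors' Co-Forest implementation: decision stumps stand in for random trees, the error-rate checks that a full algorithm would need are omitted, and every name and parameter here (co_forest_sketch, train_stump, vote, theta, n_members, rounds) is a hypothetical chosen for illustration.

```python
import random
from collections import Counter


def train_stump(X, y, rng):
    """Fit a one-split stump on one randomly chosen feature (stand-in for a random tree)."""
    f = rng.randrange(len(X[0]))
    values = sorted({x[f] for x in X})
    majority = Counter(y).most_common(1)[0][0]
    # Start from the "no split" predictor and keep a split only if it lowers the error.
    best = (sum(yi != majority for yi in y), None, majority, majority)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [yi for x, yi in zip(X, y) if x[f] <= t]
        right = [yi for x, yi in zip(X, y) if x[f] > t]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        err = sum(yi != l_lab for yi in left) + sum(yi != r_lab for yi in right)
        if err < best[0]:
            best = (err, t, l_lab, r_lab)
    _, t, l_lab, r_lab = best
    if t is None:
        return lambda x: l_lab
    return lambda x: l_lab if x[f] <= t else r_lab


def vote(members, x):
    """Return (majority label, fraction of members that agree with it)."""
    label, count = Counter(h(x) for h in members).most_common(1)[0]
    return label, count / len(members)


def co_forest_sketch(X_l, y_l, X_u, n_members=7, theta=0.75, rounds=3, seed=0):
    """Simplified co-training-style ensemble; returns a predict(x) function."""
    rng = random.Random(seed)

    def bootstrap():
        idx = [rng.randrange(len(X_l)) for _ in X_l]
        return [X_l[i] for i in idx], [y_l[i] for i in idx]

    # Each member starts from its own bootstrap sample of the labeled data.
    train_sets = [bootstrap() for _ in range(n_members)]
    members = [train_stump(Xb, yb, rng) for Xb, yb in train_sets]

    for _ in range(rounds):
        refined = []
        for i, (Xb, yb) in enumerate(train_sets):
            # The other members act as the labeler for member i.
            others = members[:i] + members[i + 1:]
            Xa, ya = list(Xb), list(yb)
            for x in X_u:
                label, conf = vote(others, x)
                if conf >= theta:  # keep only confidently labeled samples
                    Xa.append(x)
                    ya.append(label)
            # Pseudo-labels are rebuilt each round rather than accumulated.
            refined.append(train_stump(Xa, ya, rng))
        members = refined

    # The final hypothesis is a majority vote over all members.
    return lambda x: vote(members, x)[0]
```

Calling co_forest_sketch(X_l, y_l, X_u) with lists of feature vectors and labels returns a function that classifies a new sample by majority vote. The threshold theta controls how much agreement among the other members is required before an undiagnosed sample is used for retraining; raising it trades fewer pseudo-labeled samples for less label noise.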