Screening nonrandomized studies for medical systematic reviews: A comparative study of classifiers

Authors:
Tanja Bekhuis; Dina Demner-Fushman
Affiliations:
Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA; Communications Engineering Branch, Lister Hill National Center for Biomedical Communications, US National Library of Medicine, Bethesda, MD, USA
Venue:
Artificial Intelligence in Medicine
Year:
2012

Abstract

Objectives: To investigate whether (1) machine learning classifiers can help identify nonrandomized studies eligible for full-text screening by systematic reviewers; (2) classifier performance varies with optimization; and (3) the number of citations to screen can be reduced.

Methods: We used an open-source data-mining suite to process and classify biomedical citations that point to mostly nonrandomized studies from 2 systematic reviews. We built training and test sets for citation portions and compared classifier performance by considering the value of indexing, various feature sets, and optimization. We conducted our experiments in 2 phases. The design of phase I, with no optimization, was 4 classifiers × 3 feature sets × 3 citation portions. Classifiers included k-nearest neighbor, naive Bayes, complement naive Bayes, and evolutionary support vector machine. Feature sets included bag of words, 2-term n-grams, and 3-term n-grams. Citation portions included titles, titles and abstracts, and full citations with metadata. Phase II, with optimization, involved a subset of the classifiers, as well as features extracted from full citations and from full citations with overweighted titles. We optimized features and classifier parameters by manually setting information gain thresholds outside of a process for iterative grid optimization with 10-fold cross-validation. We independently tested models on data reserved for that purpose and statistically compared classifier performance on 2 types of feature sets. We estimated the number of citations that reviewers would need to screen during a second pass through a reduced set of citations.

Results: In phase I, the evolutionary support vector machine returned the best recall for bag of words extracted from full citations; the best classifier with respect to overall performance was k-nearest neighbor. No classifier attained adequate recall for this task without optimization. In phase II, we boosted performance with optimization for the evolutionary support vector machine and complement naive Bayes classifiers. Generalization performance was better for the latter in the independent tests. For the evolutionary support vector machine and complement naive Bayes classifiers, the initial retrieval set was reduced by 46% and 35%, respectively.

Conclusions: Machine learning classifiers can help identify nonrandomized studies eligible for full-text screening by systematic reviewers. Optimization can markedly improve the performance of classifiers. However, generalizability varies with the classifier. The number of citations to screen during a second independent pass through the citations can be substantially reduced.
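
The two quantities the abstract reports for this task, recall and the reduction of the initial retrieval set, can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the abstract does not give the formula used for the reduction, so the code assumes it is the share of citations the classifier sets aside (those not flagged for a second pass). The class name, function names, and counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Confusion-matrix counts for one classifier on a test set (hypothetical container)."""

    true_pos: int   # eligible citations flagged for full-text screening
    false_pos: int  # ineligible citations flagged anyway
    false_neg: int  # eligible citations the classifier missed
    true_neg: int   # ineligible citations correctly set aside

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg


def recall(counts: ScreeningCounts) -> float:
    """Fraction of truly eligible citations that the classifier flagged."""
    eligible = counts.true_pos + counts.false_neg
    if eligible == 0:
        raise ValueError("recall is undefined when there are no eligible citations")
    return counts.true_pos / eligible


def screening_reduction(counts: ScreeningCounts) -> float:
    """Assumed definition: share of the retrieval set that is not flagged,
    i.e. that reviewers would not need to read in a second pass."""
    if counts.total == 0:
        raise ValueError("reduction is undefined for an empty retrieval set")
    set_aside = counts.false_neg + counts.true_neg
    return set_aside / counts.total


# Made-up counts for illustration only; they are not results from the paper.
example = ScreeningCounts(true_pos=18, false_pos=42, false_neg=2, true_neg=138)
print(f"recall: {recall(example):.2f}")
print(f"reduction of retrieval set: {screening_reduction(example):.2f}")
```

Under this assumed definition the two measures pull against each other: setting aside more citations increases the reduction, but every eligible citation set aside lowers recall, which is why the abstract treats adequate recall as the precondition for reducing the screening load.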