Exploring label dependency in active learning for phenotype mapping

Authors:
Shefali Sharma;Leslie Lange;Jose Luis Ambite;Yigal Arens;Chun-Nan Hsu
Affiliations:
University of Southern California, Marina del Rey, CA;University of North Carolina, Chapel Hill, NC;University of Southern California, Marina del Rey, CA;University of Southern California, Marina del Rey, CA;University of Southern California, Marina del Rey, CA and Institute of Information Sciences, Academia Sinica, Taipei, Taiwan
Venue:
BioNLP '12 Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
Year:
2012

Citing 9
Cited 0

Query by committee. COLT '92 Proceedings of the fifth annual workshop on Computational learning theory
Selective sampling for example-based word sense disambiguation. Computational Linguistics
Evaluation of active learning strategies for video indexing. Image Communication
Hierarchical sampling for active learning. Proceedings of the 25th international conference on Machine learning
An analysis of active learning strategies for sequence labeling tasks. EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Improving learning in networked data by combining explicit and mined links. AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Active learning with multiple views. Journal of Artificial Intelligence Research
PheWAS. Bioinformatics
Learning phenotype mapping for integrating large genetic data. BioNLP '11 Proceedings of BioNLP 2011 Workshop

Abstract

Many genetic epidemiological studies of human diseases have multiple variables related to any given phenotype, resulting from different definitions and multiple measurements or subsets of data. Manually mapping and harmonizing these phenotypes is a time-consuming process that may still miss the most appropriate variables. A supervised learning algorithm was previously proposed for this problem; it learns to determine whether a pair of phenotypes belongs to the same class. Although that algorithm achieved satisfactory F-scores, the need to manually label training examples is a bottleneck to improving its coverage. Herein we present a novel active learning solution to this challenging phenotype-mapping problem. Active learning will make phenotype mapping more efficient and improve its accuracy.
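
To make the idea in the abstract concrete, the following is a minimal sketch of pool-based active learning over pairs of phenotype variables. It is not the authors' method: the token-overlap feature, the one-feature logistic model, the uncertainty-sampling rule, the example variable descriptions, and every function name (jaccard, fit, predict, most_uncertain, active_learn, oracle) are assumptions chosen for illustration. The oracle stands in for the human annotator whose labeling effort active learning tries to reduce.

```python
import math

def jaccard(a, b):
    """Token-overlap similarity between two variable descriptions (assumed feature)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if (ta | tb) else 0.0

def fit(xs, ys, steps=2000, lr=0.5):
    """Fit a one-feature logistic regression by batch gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += (p - y)
        w -= lr * gw / len(xs)
        b -= lr * gb / len(xs)
    return w, b

def predict(model, x):
    """Probability that a pair with similarity x is in the same class."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def most_uncertain(model, pool):
    """Index of the unlabeled pair whose predicted probability is closest to 0.5."""
    return min(range(len(pool)),
               key=lambda i: abs(predict(model, jaccard(*pool[i])) - 0.5))

def train(labeled):
    """Train the pair classifier on (pair, label) examples."""
    return fit([jaccard(a, b) for (a, b), _ in labeled],
               [y for _, y in labeled])

def active_learn(labeled, pool, oracle, budget):
    """Repeatedly query the oracle for the most uncertain pair, then retrain."""
    labeled, pool = list(labeled), list(pool)
    for _ in range(min(budget, len(pool))):
        model = train(labeled)
        pair = pool.pop(most_uncertain(model, pool))
        labeled.append((pair, oracle(pair)))
    return train(labeled), labeled

# Hypothetical seed labels and unlabeled pool (1 = same phenotype class).
seed = [(("systolic blood pressure", "blood pressure systolic"), 1),
        (("body mass index", "smoking status"), 0)]
pool = [("height in cm", "height cm"),
        ("fasting glucose", "age at visit"),
        ("total cholesterol", "hdl cholesterol")]
answers = {pool[0]: 1, pool[1]: 0, pool[2]: 0}

def oracle(pair):
    """Stand-in for a human annotator."""
    return answers[pair]

model, labeled = active_learn(seed, pool, oracle, budget=2)
```

With a budget of two queries, the loop asks the oracle only about the pairs the current model is least sure of, instead of labeling the whole pool. A realistic system would use richer pair features and a stronger classifier than this sketch does.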