A unified approach to active dual supervision for labeling features and examples

  • Authors:
  • Josh Attenberg; Prem Melville; Foster Provost

  • Affiliations:
  • Polytechnic Institute of NYU, Brooklyn, NY; IBM Research, Yorktown Heights, NY; NYU Stern School of Business, New York, NY

  • Venue:
  • ECML PKDD '10: Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
  • Year:
  • 2010

Abstract

When faced with the task of building accurate classifiers, active learning is often a beneficial tool for minimizing the requisite costs of human annotation. Traditional active learning schemes query a human for labels on intelligently chosen examples. However, human effort can also be expended in collecting alternative forms of annotation. For example, one may attempt to learn a text classifier by labeling words associated with a class, instead of, or in addition to, documents. Learning from two different kinds of supervision adds a challenging dimension to the problem of active learning. In this paper, we present a unified approach to such active dual supervision: determining which feature or example a classifier is most likely to benefit from having labeled. Empirical results confirm that appropriately querying for both example and feature labels significantly reduces overall human effort--beyond what is possible through traditional one-dimensional active learning.
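To make the selection idea concrete, here is a minimal Python sketch of a unified query-selection step that scores candidate example labels and candidate feature (word) labels on a common scale and picks the single most promising query. The scoring functions, cost values, and data structures below are illustrative placeholders, not the authors' estimator; the paper itself develops a principled unified criterion.

```python
import math

def example_utility(model_prob):
    """Placeholder: uncertainty of the current model on an unlabeled example
    (entropy of the predicted class distribution). Higher = more informative."""
    p = model_prob
    return -(p * math.log(p + 1e-12) + (1 - p) * math.log(1 - p + 1e-12))

def feature_utility(feature_class_assoc):
    """Placeholder: how strongly a word appears associated with one class under
    the current model; strongly skewed words are cheap, useful feature labels."""
    return abs(feature_class_assoc)

def select_next_query(unlabeled_examples, unlabeled_features,
                      example_cost=1.0, feature_cost=0.2):
    """Unified selection: score every candidate example and feature query by
    (estimated utility) / (annotation cost) and return the single best one."""
    candidates = []
    for ex_id, prob in unlabeled_examples.items():
        candidates.append(("example", ex_id, example_utility(prob) / example_cost))
    for feat, assoc in unlabeled_features.items():
        candidates.append(("feature", feat, feature_utility(assoc) / feature_cost))
    return max(candidates, key=lambda c: c[2])

if __name__ == "__main__":
    # Toy pools: examples with the current model's P(positive), and features
    # with a class-association score from the current model (both hypothetical).
    examples = {"doc_17": 0.52, "doc_3": 0.95}    # doc_17 is uncertain
    features = {"excellent": 0.9, "the": 0.05}    # "excellent" looks class-indicative
    kind, name, score = select_next_query(examples, features)
    print(f"Query a {kind} label for {name!r} (score {score:.3f})")
```

In this toy run the cheap, strongly class-indicative word "excellent" wins over the uncertain document, illustrating how a single ranking over both query types can shift human effort to whichever form of annotation currently offers more benefit per unit cost.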