Active dual supervision: reducing the cost of annotating examples and features

Authors:
Prem Melville; Vikas Sindhwani
Affiliations:
IBM T.J. Watson Research Center, Yorktown Heights, NY; IBM T.J. Watson Research Center, Yorktown Heights, NY
Venue:
HLT '09 Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing
Year:
2009

Abstract

When faced with the task of building machine learning or NLP models, it is often worthwhile to turn to active learning to obtain human annotations at minimal cost. Traditional active learning schemes query a human for labels of intelligently chosen examples. However, human effort can also be spent on collecting alternative forms of annotation. For example, one may attempt to learn a text classifier by labeling class-indicating words instead of, or in addition to, documents. Learning from two different kinds of supervision brings a new, unexplored dimension to the problem of active learning. In this paper, we demonstrate the value of such active dual supervision in the context of sentiment analysis. We show how interleaving queries for both documents and words significantly reduces human effort, more than is possible through traditional one-dimensional active learning or through passive combinations of supervisory inputs.
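
To make the idea of interleaved queries concrete, the following is a minimal toy sketch in Python. It is not the authors' method: the abstract does not specify the model or the query-selection rules, so everything here is an assumption made for illustration. The function names (`score`, `interleaved_queries`), the strict alternation between word and document queries, the frequency-based choice of words, the use of the smallest absolute score as document uncertainty, and the 0.5 weight given to words seen in labeled documents are all hypothetical choices.

```python
from collections import Counter


def score(doc, word_weights):
    """Sum of signed word weights; a positive sum is read as positive sentiment."""
    return sum(word_weights.get(w, 0.0) for w in doc.split())


def interleaved_queries(pool, doc_oracle, word_oracle, budget):
    """Alternate word queries and document queries (toy heuristics, assumed)."""
    word_weights = {}                      # word -> signed weight learned so far
    labeled_docs, labeled_words = {}, {}
    vocab_freq = Counter(w for d in pool for w in d.split())
    for step in range(budget):
        if step % 2 == 0:
            # Word query: the most frequent word that has no label yet.
            candidates = [w for w, _ in vocab_freq.most_common()
                          if w not in labeled_words]
            if not candidates:
                continue
            w = candidates[0]
            # A word missing from the oracle is treated as "not indicative" (0).
            labeled_words[w] = word_oracle.get(w, 0)
            word_weights[w] = word_weights.get(w, 0.0) + labeled_words[w]
        else:
            # Document query: the unlabeled document closest to a score of zero.
            candidates = [d for d in pool if d not in labeled_docs]
            if not candidates:
                continue
            d = min(candidates, key=lambda doc: abs(score(doc, word_weights)))
            labeled_docs[d] = doc_oracle[d]
            # Spread a weaker signal to the document's words without word labels.
            for w in set(d.split()):
                if w not in labeled_words:
                    word_weights[w] = word_weights.get(w, 0.0) + 0.5 * labeled_docs[d]
    return word_weights, labeled_docs, labeled_words


# Hypothetical toy data: +1 is positive sentiment, -1 is negative.
pool = ["great fun great cast", "boring plot boring cast",
        "great plot", "boring and dull"]
doc_oracle = {pool[0]: 1, pool[1]: -1, pool[2]: 1, pool[3]: -1}
word_oracle = {"great": 1, "boring": -1, "dull": -1, "fun": 1}

weights, labeled_docs, labeled_words = interleaved_queries(
    pool, doc_oracle, word_oracle, budget=4)
predictions = {d: (1 if score(d, weights) > 0 else -1) for d in pool}
```

With a budget of four queries, this sketch spends two on words and two on documents. A practical system would replace the fixed alternation with a rule that decides, at each step, which kind of query is expected to help most, and would use a proper classifier in place of the summed word weights.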