Learning from labeled features using generalized expectation criteria

Authors:
Gregory Druck;Gideon Mann;Andrew McCallum
Affiliations:
University of Massachusetts, Amherst, MA, USA;Google, Inc., New York, NY, USA;University of Massachusetts, Amherst, MA, USA
Venue:
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2008


Abstract

Machine learning is difficult to apply to new domains because labeled problem instances are often lacking. We address this problem by leveraging domain knowledge in the form of affinities between input features and classes. For example, in a baseball vs. hockey text classification problem, even without any labeled data, we know that the presence of the word "puck" is a strong indicator of hockey. We refer to this type of domain knowledge as a labeled feature. We propose a method for training discriminative probabilistic models with labeled features and unlabeled instances. Unlike previous approaches that use labeled features to create labeled pseudo-instances, we use labeled features directly to constrain the model's predictions on unlabeled instances. We express these soft constraints using generalized expectation (GE) criteria: terms in a parameter estimation objective function that express preferences on values of a model expectation. We train multinomial logistic regression models using GE criteria, but the method is applicable to other discriminative probabilistic models. The complete objective function also includes a Gaussian prior on parameters, which encourages generalization by spreading parameter weight to unlabeled features. Experimental results on text classification data sets show that this method outperforms heuristic approaches to training classifiers with labeled features. Experiments with human annotators show that it is more beneficial to spend limited annotation time labeling features rather than labeling instances. For example, after only one minute of labeling features, we can achieve 80% accuracy on the ibm vs. mac text classification problem using GE-FL, whereas ten minutes of labeling documents results in an accuracy of only 77%.
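To make the shape of such an objective concrete, here is a minimal sketch and not the authors' implementation. It assumes the preference on each model expectation is scored with a KL divergence between a reference class distribution for a labeled feature and the model's average predicted class distribution over the unlabeled documents that contain that feature. The abstract does not specify this score function, so treat it as an assumption. The names `predict`, `ge_loss`, `references`, and `sigma2`, the toy documents, and the 0.9/0.1 reference distributions are all hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of class scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(theta, x):
    # p(y | x) for multinomial logistic regression;
    # theta[y][j] is the weight of feature j for class y.
    return softmax([sum(w * v for w, v in zip(row, x)) for row in theta])

def ge_loss(theta, unlabeled, references, sigma2):
    # Loss to minimize: one KL term per labeled feature (assumed score
    # function) plus a Gaussian prior penalty on the parameters.
    loss = 0.0
    for k, ref in references.items():
        docs = [x for x in unlabeled if x[k] > 0]
        if not docs:
            continue  # feature k never occurs in the unlabeled data
        preds = [predict(theta, x) for x in docs]
        # Model expectation: average predicted class distribution
        # over unlabeled documents that contain feature k.
        model = [sum(p[y] for p in preds) / len(docs) for y in range(len(ref))]
        loss += sum(r * math.log(r / q) for r, q in zip(ref, model) if r > 0)
    # Gaussian prior with variance sigma2 on every parameter.
    loss += sum(w * w for row in theta for w in row) / (2.0 * sigma2)
    return loss

# Toy setup (hypothetical): features = [puck, bat], classes = [hockey, baseball].
unlabeled = [[1, 0], [1, 0], [0, 1]]
references = {0: [0.9, 0.1],   # "puck" mostly indicates hockey
              1: [0.1, 0.9]}   # "bat" mostly indicates baseball

theta_zero = [[0.0, 0.0], [0.0, 0.0]]
theta_good = [[1.0, -1.0], [-1.0, 1.0]]

loss_zero = ge_loss(theta_zero, unlabeled, references, sigma2=10.0)
loss_good = ge_loss(theta_good, unlabeled, references, sigma2=10.0)
print(loss_zero, loss_good)
```

In this toy setup, parameters that push documents containing "puck" toward hockey score a lower loss than all-zero parameters, even though no document carries a label. In practice the loss would be minimized with a gradient-based optimizer. This sketch only evaluates it.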