Standard machine learning techniques typically require ample training data in the form of labeled instances. In many situations, obtaining enough labeled data for adequate classifier performance is too tedious or costly. In text classification, however, humans can easily guess the relevance of features, that is, words that are indicative of a topic, which lets the classifier focus its feature weights more appropriately when labeled data is scarce. We describe an algorithm for tandem learning that begins with a couple of labeled instances and then, at each iteration, recommends features and instances for a human to label. Tandem learning using an "oracle" results in much better performance than learning on only features or only instances. We find that humans can emulate the oracle to an extent that results in performance (accuracy) comparable to that of the oracle. Our unique experimental design helps factor out system error from human error, leading to a better understanding of when and why interactive feature selection works.
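The loop described above (start from a couple of labeled instances, then alternate between asking about features and asking about instances) might be sketched roughly as follows. This is an illustrative sketch only, not the algorithm from the paper. It assumes a toy log-odds word-weight model, uncertainty sampling to pick the next instance, and largest-absolute-weight ranking to pick the features to ask about. The names `train`, `score`, `tandem_learn`, `label_instance`, `label_feature`, and the `boost` factor are all hypothetical.

```python
import math
from collections import Counter

def train(labeled, boosted, boost=3.0):
    """Smoothed log-odds weight per word; words the human marked
    as relevant are scaled up by `boost` (an assumed heuristic)."""
    pos, neg = Counter(), Counter()
    for words, y in labeled:
        (pos if y == 1 else neg).update(set(words))
    n_pos = sum(1 for _, y in labeled if y == 1)
    n_neg = len(labeled) - n_pos
    weights = {}
    for w in set(pos) | set(neg):
        p = (pos[w] + 1) / (n_pos + 2)   # smoothed P(word | on-topic)
        q = (neg[w] + 1) / (n_neg + 2)   # smoothed P(word | off-topic)
        weights[w] = math.log(p / q) * (boost if w in boosted else 1.0)
    return weights

def score(weights, words):
    """Positive score means on-topic, negative means off-topic."""
    return sum(weights.get(w, 0.0) for w in set(words))

def tandem_learn(pool, seed, label_instance, label_feature,
                 rounds=3, k_features=2):
    labeled, unlabeled = list(seed), list(pool)
    boosted, asked = set(), set()
    weights = train(labeled, boosted)
    for _ in range(rounds):
        # Feature step: ask about the strongest words not yet shown.
        candidates = sorted((w for w in weights if w not in asked),
                            key=lambda w: (-abs(weights[w]), w))
        for w in candidates[:k_features]:
            asked.add(w)
            if label_feature(w):
                boosted.add(w)
        weights = train(labeled, boosted)
        # Instance step: ask about the document the model is least sure of.
        if unlabeled:
            doc = min(unlabeled, key=lambda d: abs(score(weights, d)))
            unlabeled.remove(doc)
            labeled.append((doc, label_instance(doc)))
            weights = train(labeled, boosted)
    return weights, boosted

# Toy usage: a simulated "oracle" stands in for the human labeler.
relevant = {"goal", "match", "team"}
seed = [(["goal", "team", "win"], 1), (["stock", "market", "fall"], 0)]
pool = [["match", "goal", "crowd"], ["market", "bank", "rate"],
        ["team", "match", "coach"], ["bank", "stock", "loan"]]
weights, boosted = tandem_learn(
    pool, seed,
    label_instance=lambda doc: int(any(w in relevant for w in doc)),
    label_feature=lambda w: w in relevant,
)
```

Replacing the two lambda callbacks with prompts to a real person would turn the simulated oracle into the interactive setting. Comparing the two runs is one way to separate errors made by the system from errors made by the human.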