Many learning tasks, such as large-scale text categorization and word prediction, can benefit from efficient training and classification when the number of classes, in addition to the number of instances and features, is large, that is, in the thousands and beyond. We investigate the learning of sparse class indices to address this challenge. An index is a mapping from features to classes. We compare the index-learning methods against other techniques, including one-versus-rest and top-down classification using perceptrons and support vector machines. We find that index learning is highly advantageous for space and time efficiency, at both training and classification times. Moreover, this approach yields similar and at times better accuracies. On problems with hundreds of thousands of instances and thousands of classes, the index is learned in minutes, while other methods can take hours or days. As we explain, the design of the learning update makes it convenient to constrain each feature to connect to only a small subset of the classes in the index. This constraint is crucial for scalability. Given an instance with l active (positive-valued) features, each connecting on average to d classes in the index (on the order of tens in our experiments), update and classification take O(dl log(dl)) time.
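To make the cost claim concrete, the following is a minimal Python sketch of how classification with a sparse feature-to-class index could work. It is an illustration under stated assumptions, not the authors' implementation: the dictionary-of-dictionaries layout, the function names `rank_classes` and `prune`, the toy weights, and the "keep the largest weights" pruning rule are all hypothetical. With l active features and about d classes per feature, at most dl classes receive a score, so accumulating the scores takes O(dl) and sorting them takes O(dl log(dl)).

```python
from collections import defaultdict


def rank_classes(index, instance):
    """Score and rank classes for one instance.

    index:    dict mapping feature -> {class: weight} (assumed layout)
    instance: dict mapping active feature -> positive value
    """
    scores = defaultdict(float)
    for feature, value in instance.items():
        # Only the classes connected to this feature are touched.
        for cls, weight in index.get(feature, {}).items():
            scores[cls] += value * weight
    # At most d*l classes were scored; sorting them dominates the cost.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def prune(index, feature, max_classes):
    """Hypothetical sparsity constraint: keep a feature's largest weights.

    The abstract only states that each feature connects to a small subset
    of classes; this particular pruning rule is an assumption.
    """
    edges = index.get(feature, {})
    if len(edges) > max_classes:
        kept = sorted(edges.items(), key=lambda kv: -kv[1])[:max_classes]
        index[feature] = dict(kept)


# Toy index with made-up weights.
index = {
    "goal": {"sports": 0.75, "business": 0.25},
    "match": {"sports": 0.5},
    "market": {"business": 0.75},
}
ranking = rank_classes(index, {"goal": 1.0, "match": 1.0})
print(ranking)  # [('sports', 1.25), ('business', 0.25)]
```

In this sketch, classes with no connection to any active feature are never visited, so the cost depends on d and l and not on the total number of classes.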