Large linear classification when data cannot fit in memory
Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '10)
Linear classification is a useful tool for handling large-scale data in applications such as document classification and natural language processing. Recent advances have shown that the training of linear classifiers can be carried out efficiently. However, when the data size exceeds the memory capacity, most training methods converge very slowly because of severe disk swapping. Although some methods attempt to handle this situation, they are usually too complicated to support important functions such as parameter selection. In this paper, we introduce a block minimization framework for data larger than memory: a solver splits the data into blocks, stores them in separate files, and at each step loads one block from disk and trains on it. Despite its simplicity, experimental results show that the framework effectively handles a data set 20 times larger than the memory capacity.
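The abstract leaves the per-block solver unspecified. Below is a minimal sketch of the idea, assuming a linear L1-loss SVM trained by dual coordinate descent inside each block (a LIBLINEAR-style inner solver); the pickle file format, block count, pass counts, and the names split_into_blocks, train_block, and block_minimization are illustrative assumptions rather than the paper's actual implementation. The key invariant is that the shared weight vector w stays consistent with all dual variables while only one block of data resides in memory.

```python
import os
import pickle
import tempfile

import numpy as np

C = 1.0  # illustrative SVM regularization parameter


def split_into_blocks(X, y, n_blocks, out_dir):
    """Split the data into blocks and store each block in its own file,
    mirroring the framework's one-file-per-block layout (format assumed)."""
    paths = []
    for b, rows in enumerate(np.array_split(np.arange(len(y)), n_blocks)):
        path = os.path.join(out_dir, f"block{b}.pkl")
        with open(path, "wb") as f:
            pickle.dump((X[rows], y[rows]), f)
        paths.append(path)
    return paths


def train_block(w, Xb, yb, ab, inner_passes=5):
    """Dual coordinate descent on one in-memory block: w is shared across
    blocks; ab holds the dual variables of this block's examples."""
    for _ in range(inner_passes):
        for i in np.random.permutation(len(yb)):
            q = Xb[i].dot(Xb[i])
            if q == 0.0:                              # skip all-zero rows
                continue
            G = yb[i] * Xb[i].dot(w) - 1.0            # dual gradient for example i
            a_new = min(max(ab[i] - G / q, 0.0), C)   # projected Newton step
            w += (a_new - ab[i]) * yb[i] * Xb[i]      # keep w = sum_i a_i y_i x_i
            ab[i] = a_new
    return w


def block_minimization(paths, n_features, outer_iters=10):
    """Cycle over the block files; only one block is in memory at a time."""
    w = np.zeros(n_features)
    alphas = {p: None for p in paths}
    for _ in range(outer_iters):
        for p in paths:
            with open(p, "rb") as f:
                Xb, yb = pickle.load(f)
            if alphas[p] is None:
                alphas[p] = np.zeros(len(yb))
            w = train_block(w, Xb, yb, alphas[p])
    return w


# Usage on synthetic data small enough to verify the result in memory.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = np.sign(X.dot(rng.normal(size=20)))
with tempfile.TemporaryDirectory() as d:
    w = block_minimization(split_into_blocks(X, y, n_blocks=5, out_dir=d),
                           n_features=20)
print("training accuracy:", (np.sign(X.dot(w)) == y).mean())
```

Because w is updated in place as each block is processed, progress made on one block is immediately visible when the next block is loaded, which is what lets the method make steady progress without ever holding the full data set in memory.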