An incremental subspace learning algorithm to categorize large scale text data

Authors:
Jun Yan;Qiansheng Cheng;Qiang Yang;Benyu Zhang
Affiliations:
LMAM, Department of Information Science, School of Mathematical Sciences, Peking University, Beijing, P.R. China;LMAM, Department of Information Science, School of Mathematical Sciences, Peking University, Beijing, P.R. China;Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong;Microsoft Research Asia, Beijing, P.R. China
Venue:
APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development
Year:
2005

Abstract

The dramatic growth in the number and size of online information sources has fueled increasing research interest in the incremental subspace learning problem. In this paper, we propose an incremental supervised subspace learning algorithm called the Incremental Inter-class Scatter (IIS) algorithm. Unlike traditional batch learners, IIS learns from a stream of training data rather than from a fixed set. IIS overcomes the inherent problems of some other incremental approaches, such as the incremental versions of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Experimental results on synthetic datasets show that IIS performs as well as LDA and is more robust against noise. In addition, experiments on the Reuters Corpus Volume 1 (RCV1) dataset show that IIS outperforms the state-of-the-art Incremental Principal Component Analysis (IPCA) algorithm, a related method, and Information Gain in efficiency and effectiveness, respectively.
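
The abstract does not give the IIS update rule, so the following is only a minimal sketch of the general idea of learning an inter-class scatter direction from a stream. It is not the authors' algorithm. It assumes the standard definition of the inter-class scatter matrix, S_b = sum_i p_i (m_i - m)(m_i - m)^T, keeps running class means as labeled samples arrive one at a time, and recovers the leading direction by plain power iteration. The class name `StreamingInterClassScatter` and its methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict
import math


class StreamingInterClassScatter:
    """Illustrative only: running means from a labeled stream.

    This is NOT the paper's IIS update rule, just the textbook
    inter-class scatter computed from incrementally updated means.
    """

    def __init__(self, dim):
        self.dim = dim
        self.n = 0                                   # samples seen so far
        self.mean = [0.0] * dim                      # running global mean m
        self.class_n = defaultdict(int)              # per-class sample counts
        self.class_mean = defaultdict(lambda: [0.0] * dim)  # running m_i

    def partial_fit(self, x, label):
        """Consume one labeled sample and update the running means."""
        self.n += 1
        self.mean = [m + (xi - m) / self.n for m, xi in zip(self.mean, x)]
        self.class_n[label] += 1
        k = self.class_n[label]
        old = self.class_mean[label]
        self.class_mean[label] = [m + (xi - m) / k for m, xi in zip(old, x)]

    def scatter(self):
        """Return S_b = sum_i p_i (m_i - m)(m_i - m)^T as nested lists."""
        d = self.dim
        s = [[0.0] * d for _ in range(d)]
        for label, k in self.class_n.items():
            p = k / self.n                           # class prior estimate p_i
            diff = [a - b for a, b in zip(self.class_mean[label], self.mean)]
            for i in range(d):
                for j in range(d):
                    s[i][j] += p * diff[i] * diff[j]
        return s

    def leading_direction(self, iters=100):
        """Approximate the top eigenvector of S_b by power iteration."""
        s = self.scatter()
        v = [1.0 / (i + 1) for i in range(self.dim)]  # fixed nonzero start
        for _ in range(iters):
            w = [sum(s[i][j] * v[j] for j in range(self.dim))
                 for i in range(self.dim)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0.0:                          # S_b is zero (e.g. one class)
                break
            v = [c / norm for c in w]
        return v


# Two classes separated along the first axis, fed one sample at a time.
model = StreamingInterClassScatter(dim=2)
stream = [((0.0, 1.0), "a"), ((4.0, 1.0), "b"),
          ((0.0, -1.0), "a"), ((4.0, -1.0), "b")]
for x, label in stream:
    model.partial_fit(x, label)
w = model.leading_direction()
```

In this toy stream the class means differ only in the first coordinate, so the recovered direction should lie along the first axis up to floating-point error. The sketch stores a dense d-by-d matrix, which would not scale to text data with very large vocabularies; it is meant only to make the stream-versus-batch distinction concrete.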