Random sampling for histogram construction: how much is enough?
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Fast training of support vector machines using sequential minimal optimization
Advances in kernel methods
IEEE Transactions on Pattern Analysis and Machine Intelligence
Sublinear time approximate clustering
SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries
Proceedings of the 27th International Conference on Very Large Data Bases
Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports
Proceedings of the 27th International Conference on Very Large Data Bases
Finding Frequent Items in Data Streams
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Stable distributions, pseudorandom generators, embeddings and data stream computation
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Incremental PCA or On-Line Visual Learning and Recognition
ICPR '02 Proceedings of the 16 th International Conference on Pattern Recognition (ICPR'02) Volume 3 - Volume 3
Streaming-Data Algorithms for High-Quality Clustering
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Candid Covariance-Free Incremental Principal Component Analysis
IEEE Transactions on Pattern Analysis and Machine Intelligence
Mining concept-drifting data streams using ensemble classifiers
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
RCV1: A New Benchmark Collection for Text Categorization Research
The Journal of Machine Learning Research
IMMC: incremental maximum margin criterion
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Sufficient dimensionality reduction with irrelevance statistics
UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Information Sciences: an International Journal
Network intrusion detection: Evaluating cluster, discriminant, and logit analysis
Information Sciences: an International Journal
Fractal dimension applied to plant identification
Information Sciences: an International Journal
Mining frequent itemsets over data streams using efficient window sliding techniques
Expert Systems with Applications: An International Journal
Supervised subspace projections for constructing ensembles of classifiers
Information Sciences: an International Journal
Incremental learning of complete linear discriminant analysis for face recognition
Knowledge-Based Systems
Proceedings of the Second International Conference on Innovative Computing and Cloud Computing
Hi-index | 0.07 |
Algorithms on streaming data have attracted increasing attention in the past decade. Among them, dimensionality reduction algorithms are greatly interesting due to the desirability of real tasks. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are two of the most widely used dimensionality reduction approaches. However, PCA is not optimal for general classification problems because it is unsupervised and ignores valuable label information for classification. On the other hand, the performance of LDA is degraded when encountering limited available low-dimensional spaces and singularity problem. Recently, Maximum Margin Criterion (MMC) was proposed to overcome the shortcomings of PCA and LDA. Nevertheless, the original MMC algorithm could not satisfy the streaming data model to handle large-scale high-dimensional data set. Thus an effective, efficient and scalable approach is needed. In this paper, we propose a supervised incremental dimensionality reduction algorithm and its extension to infer adaptive low-dimensional spaces by optimizing the maximum margin criterion. Experimental results on a synthetic dataset and real datasets demonstrate the superior performance of our proposed algorithm on streaming data.