Collective Principal Component Analysis from Distributed, Heterogeneous Data

Authors:
Hillol Kargupta; Weiyun Huang; Krishnamoorthy Sivakumar; Byung-Hoon Park; Shuren Wang
Venue:
PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
Year:
2000

Citing 3
Cited 3

Quantifiable data mining using principal component analysis

LAPACK Users' guide (third ed.)

Principal Direction Divisive Partitioning
Data Mining and Knowledge Discovery

Enhancing Clustering Quality through Landmark-Based Dimensionality Reduction
ACM Transactions on Knowledge Discovery from Data (TKDD)

Distributed knowledge discovery with non linear dimensionality reduction
PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II

A Sequential Sampling Framework for Spectral k-Means Based on Efficient Bootstrap Accuracy Estimations: Application to Distributed Clustering
ACM Transactions on Knowledge Discovery from Data (TKDD)

Abstract

Principal component analysis (PCA) is a statistical technique for identifying the dependency structure of multivariate stochastic observations, and it is frequently used in data mining applications. This paper considers PCA in the context of emerging network-based computing environments. It offers a technique for performing PCA on distributed and heterogeneous data sets with relatively small communication overhead. The technique is evaluated on several data sets, including one from a web mining application. This approach is likely to facilitate the development of distributed clustering, associative link analysis, and other heterogeneous data mining applications that frequently use PCA.