Collective principal component analysis from distributed, heterogeneous data

H Kargupta, W Huang, K Sivakumar, BH Park, S Wang
Principles of Data Mining and Knowledge Discovery: 4th European Conference …, 2000 - Springer
Abstract
Principal component analysis (PCA) is a statistical technique to identify the dependency structure of multivariate stochastic observations. PCA is frequently used in data mining applications. This paper considers PCA in the context of the emerging network-based computing environments. It offers a technique to perform PCA from distributed and heterogeneous data sets with relatively small communication overhead. The technique is evaluated against different data sets, including a data set for a web mining application. This approach is likely to facilitate the development of distributed clustering, associative link analysis, and other heterogeneous data mining applications that frequently use PCA.
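For readers unfamiliar with PCA itself, the following is a minimal sketch of ordinary, centralized PCA on a single in-memory data set. It is not the distributed technique the paper proposes, which the abstract does not specify. The function names `covariance_matrix` and `first_principal_component`, the toy data, and the choice of power iteration are all assumptions made for illustration.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

def first_principal_component(cov, iterations=200):
    """Approximate the dominant eigenpair of cov by power iteration.

    Illustrative only: assumes cov is symmetric positive semidefinite
    and that the start vector is not orthogonal to the dominant direction.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate case, e.g. all-zero covariance
            return 0.0, v
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * cov_v[i] for i in range(d))
    return eigenvalue, v

# Hypothetical toy data: the second attribute is exactly twice the first,
# so all of the variance lies along the direction (1, 2).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
variance, direction = first_principal_component(covariance_matrix(data))
```

Here `direction` points along the line on which the two attributes vary together, which is the "dependency structure" the abstract refers to. In the setting the paper addresses, the attributes would be spread across different sites rather than held in one list, so a covariance matrix like this could not be formed without moving data between sites.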