- Secure Multiparty Computation of Principal Component Analysis
- Kim, Sang-Pil ; Lee, Sanghun ; Gil, Myeong-Seon ; Moon, Yang-Sae ; Won, Hee-Sun
- Journal of KIISE, volume 42, issue 7, 2015, Pages 919~928
- DOI : 10.5626/JOK.2015.42.7.919
Abstract
In recent years, much research has addressed privacy-preserving data mining (PPDM) on large volumes of data. In this paper, we propose a PPDM solution based on principal component analysis (PCA), which is widely used to compute correlations among sensitive data sets. The conventional way to compute PCA is to collect all the data spread across multiple nodes into a single node before starting the computation; however, this approach discloses the sensitive data of individual nodes, requires a large amount of computation, and incurs high communication overhead. To solve this problem, we present an efficient method that securely computes PCA without collecting all the data. The proposed method shares only limited information among individual nodes, yet obtains the same result as the original PCA. In addition, we present a dimensionality reduction technique for the proposed method and use it to improve the performance of secure similar document detection. Finally, through various experiments, we show that the proposed method works effectively and efficiently on large amounts of multi-dimensional data.
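The claim that nodes can share only limited information and still obtain the same result as the original PCA can be illustrated with a minimal sketch. It is not the protocol of the paper, whose details the abstract does not give. It assumes horizontally partitioned data (each node holds different rows over the same columns) and that each node publishes only its row count, column sums, and sum of outer products. The function names `local_summary`, `merge_covariance`, and `top_component` are hypothetical. Note that these aggregates can themselves leak information, so a real secure protocol would additionally protect them, for example by masking or secret sharing.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Summary = Tuple[int, Vector, Matrix]

def local_summary(rows: Matrix) -> Summary:
    """Computed privately at one node: (row count, column sums, sum of x x^T)."""
    d = len(rows[0])
    sums = [0.0] * d
    outer = [[0.0] * d for _ in range(d)]
    for x in rows:
        for i in range(d):
            sums[i] += x[i]
            for j in range(d):
                outer[i][j] += x[i] * x[j]
    return len(rows), sums, outer

def merge_covariance(summaries: List[Summary]) -> Matrix:
    """Combine per-node summaries into the global sample covariance matrix."""
    d = len(summaries[0][1])
    n = sum(s[0] for s in summaries)
    mean = [sum(s[1][i] for s in summaries) / n for i in range(d)]
    # Identity used: sum (x - m)(x - m)^T = sum x x^T - n m m^T
    return [[(sum(s[2][i][j] for s in summaries) - n * mean[i] * mean[j]) / (n - 1)
             for j in range(d)] for i in range(d)]

def top_component(cov: Matrix, iters: int = 200) -> Vector:
    """First principal direction via power iteration (illustrative only)."""
    d = len(cov)
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v

# Two nodes, each holding its own rows; raw rows never leave the node.
node_a = [[2.0, 1.0], [4.0, 3.0]]
node_b = [[6.0, 5.0], [8.0, 9.0]]
cov = merge_covariance([local_summary(node_a), local_summary(node_b)])
pc1 = top_component(cov)
```

Because the covariance matrix depends on the data only through these sums, the merged matrix equals the one computed after pooling all rows on a single node, and so do the principal components derived from it.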