Review on statistical methods for protecting privacy and measuring risk of disclosure when releasing information for public use

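Korean title (translated): Statistical methods for protecting personal information and measuring disclosure risk in information release settings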

  • Lee, Yonghee (Department of Statistics, University of Seoul)
  • Received : 2013.07.03
  • Accepted : 2013.09.04
  • Published : 2013.09.30

Abstract

With the recent emergence of big data, demand for releasing information and microdata for public use is increasing, so protecting privacy and measuring the risk of disclosure for released databases have become important issues in government and business as well as in the academic community. This paper reviews statistical methods for protecting privacy and measuring the risk of disclosure when microdata or a data analysis server is released for public use.

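With the recent emergence of big data and the sharp rise in demand for information release, the need to protect personal information when data are released to the public is more pressing than ever. Focusing on microdata and statistical analysis servers, this paper gives an overview of the statistical methods proposed to date for limiting disclosure of personal information, the concepts of information disclosure, and the criteria for measuring disclosure risk.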

