Models for Privacy-preserving Data Publishing: A Survey

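  • 김종선 (Department of Computer Science, Korea University)
  • 정기정 (Department of Computer Science, Korea University)
  • 이혁기 (Department of Computer Science, Korea University)
  • 김수형 (Department of IT Convergence, Korea University)
  • 김종욱 (Department of Media Software, Sangmyung University)
  • 정연돈 (Department of Computer Science, Korea University)
  • Received: 2016.08.17
  • Reviewed: 2016.10.25
  • Published: 2017.02.15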

Abstract

In recent years, data have been actively exploited in various fields, and the demand for sharing and publishing data has grown accordingly. However, when shared data contain sensitive information about individuals, publishing them can cause a privacy breach in which that information is disclosed. Privacy-preserving data publishing (PPDP) has been studied as a way to publish data containing personal information while protecting individuals' privacy with minimal distortion of the data. PPDP research assumes various attacker models and has developed around privacy models, which are principles for protecting privacy against the breach attacks those attackers mount. In this paper, we first describe privacy breach attacks. We then classify the privacy models according to the attacks they address, and examine the differences among the models and the requirements of each.

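Project Information

Research project: Development of personal information protection technology using de-identification techniques in big data environments

Supervising organization: Institute for Information & communications Technology Promotion (IITP)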
