Automation System for Sharing CDM Data

  • Received : 2020.08.19
  • Accepted : 2020.10.04
  • Published : 2020.10.05

Abstract

As demand for sharing medical data for research purposes grows, so does the use of the Common Data Model (CDM). However, sharing CDM data raises two problems: access to the data is not controlled, and personal information contained in the data is not protected. To address these problems, this paper controls access to CDM data by applying encryption in a blockchain network, and records information about the CDM data there so that it can be tracked. In addition, IPFS is used to share large volumes of CDM data, and Celery is used to automate the sharing process. In other words, we propose a multi-channel automation system in which the information required for CDM data sharing is divided among a trust-based technology, a distributed file system, and a message queue for automation. The system aims to solve the access control and personal information protection problems that arise when sharing CDM data.
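The flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, so a dictionary keyed by SHA-256 digests stands in for IPFS, a hash-chained list stands in for the blockchain record, a `deque` stands in for the Celery task queue, and `toy_cipher` is a placeholder that is not secure. All function names, the sample record, and the key are hypothetical.

```python
import hashlib
import json
from collections import deque


def content_id(data: bytes) -> str:
    """Content address: SHA-256 hex digest (a stand-in for an IPFS CID)."""
    return hashlib.sha256(data).hexdigest()


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder XOR keystream cipher. NOT secure; assumption for illustration only."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, stream))


class Ledger:
    """Append-only, hash-chained log standing in for the blockchain record."""

    def __init__(self):
        self.blocks = []

    def append(self, record: dict) -> dict:
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        payload = json.dumps(record, sort_keys=True)
        block_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        block = {"prev": prev_hash, "record": record, "hash": block_hash}
        self.blocks.append(block)
        return block

    def verify(self) -> bool:
        """Recompute every hash; any edited record breaks the chain."""
        prev_hash = "0" * 64
        for block in self.blocks:
            payload = json.dumps(block["record"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
            if block["prev"] != prev_hash or block["hash"] != expected:
                return False
            prev_hash = block["hash"]
        return True


def share(data: bytes, key: bytes, owner: str, store: dict, ledger: Ledger) -> str:
    """Encrypt the data, store it by content address, and record it for tracking."""
    encrypted = toy_cipher(data, key)
    cid = content_id(encrypted)
    store[cid] = encrypted
    ledger.append({"owner": owner, "cid": cid})
    return cid


def fetch(cid: str, key: bytes, store: dict) -> bytes:
    """Check the stored bytes against their address, then decrypt with the key."""
    encrypted = store[cid]
    if content_id(encrypted) != cid:
        raise ValueError("stored content does not match its address")
    return toy_cipher(encrypted, key)


# A deque stands in for the message queue that drives the sharing tasks.
store, ledger = {}, Ledger()
tasks = deque([(b"person_id,year_of_birth\n1,1980\n", b"secret-key", "hospital-a")])
shared_cids = []
while tasks:
    data, key, owner = tasks.popleft()
    shared_cids.append(share(data, key, owner, store, ledger))
```

In this sketch only a holder of the key recovers the original bytes from `fetch`, while the ledger keeps a tamper-evident record of who shared which content address.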

Acknowledgement

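This research was carried out as a result of the University ICT Promotion Support Program of the Ministry of Science and ICT and the Institute of Information & Communications Technology Planning & Evaluation (IITP-2020-2017-0-01628).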
