• Title/Summary/Keyword: de-identification

Search Result 367, Processing Time 0.035 seconds

De-identification Techniques for Big Data and Issues (빅데이타 비식별화 기술과 이슈)

  • Woo, SungHee
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2017.05a
    • /
    • pp.750-753
    • /
    • 2017
  • Recently, the processing and utilization of big data, which is generated by the spread of smartphone, SNS, and the internet of things, is emerging as a new growth engine of ICT field. However, in order to utilize such big data, De-identification of personal information should be done. De-identification removes identifying information from a data set so that individual data cannot be linked with specific individuals. De-identification can reduce the privacy risk associated with collecting, processing, archiving, distributing or publishing information, thus it attempts to balance the contradictory goals of using and sharing personal information while protecting privacy. De-identified information has also been re-identified and has been controversial for the protection of personal information, but the number of instances where personal information such as big data is de-identified and processed is increasing. In addition, many de-identification guidelines have been introduced and a method for de-identification of personal information has been proposed. Therefore, in this study, we describe the big data de-identification process and follow-up management, and then compare and analyze de-identification methods. Finally we provide personal information protection issues and solutions.

  • PDF

De-identification of Medical Information and Issues (의료정보 비식별화와 해결과제)

  • Woo, SungHee
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2017.10a
    • /
    • pp.552-555
    • /
    • 2017
  • It is de-identification that emerged to find the trade-off between the use of big data and the protection of personal information. In particular, in the field of medical that deals with various semi-identifier information and sensitive information, de-identification must be performed in order to use medical consultation such as EMR and voice, KakaoTalk, and SNS. However, there is no separate law for medical information protection and legislation for de-identification. Therefore, in this study, we present the current status of de-identification of personal information, the status and case of de-identification of medical information, and finally we provide issues and solutions for medial information protection and de-identification.

  • PDF

A Study on De-Identification of Metering Data for Smart Grid Personal Security in Cloud Environment

  • Lee, Donghyeok;Park, Namje
    • Journal of Multimedia Information System
    • /
    • v.4 no.4
    • /
    • pp.263-270
    • /
    • 2017
  • Various security threats exist in the smart grid environment due to the fact that information and communication technology are grafted onto an existing power grid. In particular, smart metering data exposes a variety of information such as users' life patterns and devices in use, and thereby serious infringement on personal information may occur. Therefore, we are in a situation where a de-identification algorithm suitable for metering data is required. Hence, this paper proposes a new de-identification method for metering data. The proposed method processes time information and numerical information as de-identification data, respectively, so that pattern information cannot be analyzed by the data. In addition, such a method has an advantage that a query such as a direct range search and aggregation processing in a database can be performed even in a de-identified state for statistical processing and availability.

A Study on the de-identification of Personal Information of Hotel Users (호텔 이용 고객의 개인정보 비식별화 방안에 관한 연구)

  • Kim, Taekyung
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.12 no.4
    • /
    • pp.51-58
    • /
    • 2016
  • In the area of hotel and tourism sector, various research are analyzed using big data. Big data is being generated by any digital devices around us all the times. All the digital process and social media exchange produces the big data. In this paper, we analyzed the de-identification method of big data to use the personal information of hotel guests. Through the analysis of these big data, hotel can provide differentiated and diverse services to hotel guests and can improve the service and support the marketing of hotels. If the hotel wants to use the information of the guest, the private data should be de-identified. There are several de-identification methods of personal information such as pseudonymisation, aggregation, data reduction, data suppression and data masking. Using the comparison of these methods, the pseudonymisation is discriminated to the suitable methods for the analysis of information for the hotel guest. Also, among the pseudonymisation methods, the t-closeness was analyzed to the secure and efficient method for the de-identification of personal information in hotel.

Meso-scale based parameter identification for 3D concrete plasticity model

  • Suljevic, Samir;Ibrahimbegovic, Adnan;Karavelic, Emir;Dolarevic, Samir
    • Coupled systems mechanics
    • /
    • v.11 no.1
    • /
    • pp.55-78
    • /
    • 2022
  • The main aim of this paper is the identification of the model parameters for the constitutive model of concrete and concrete-like materials capable of representing full set of 3D failure mechanisms under various stress states. Identification procedure is performed taking into account multi-scale character of concrete as a structural material. In that sense, macro-scale model is used as a model on which the identification procedure is based, while multi-scale model which assume strong coupling between coarse and fine scale is used for numerical simulation of experimental results. Since concrete possess a few clearly distinguished phases in process of deformation until failure, macro-scale model contains practically all important ingredients to include both bulk dissipation and surface dissipation. On the other side, multi-scale model consisted of an assembly micro-scale elements perfectly fitted into macro-scale elements domain describes localized failure through the implementation of embedded strong discontinuity. This corresponds to surface dissipation in macro-scale model which is described by practically the same approach. Identification procedure is divided into three completely separate stages to utilize the fact that all material parameters of macro-scale model have clear physical interpretation. In this way, computational cost is significantly reduced as solving three simpler identification steps in a batch form is much more efficient than the dealing with the full-scale problem. Since complexity of identification procedure primarily depends on the choice of either experimental or numerical setup, several numerical examples capable of representing both homogeneous and heterogeneous stress state are performed to illustrate performance of the proposed methodology.

Analysis of k Value from k-anonymity Model Based on Re-identification Time (재식별 시간에 기반한 k-익명성 프라이버시 모델에서의 k값에 대한 연구)

  • Kim, Chaewoon;Oh, Junhyoung;Lee, Kyungho
    • The Journal of Bigdata
    • /
    • v.5 no.2
    • /
    • pp.43-52
    • /
    • 2020
  • With the development of data technology, storing and sharing of data has increased, resulting in privacy invasion. Although de-identification technology has been introduced to solve this problem, it has been proved many times that identifying individuals using de-identified data is possible. Even if it cannot be completely safe, sufficient de-identification is necessary. But current laws and regulations do not quantitatively specify the degree of how much de-identification should be performed. In this paper, we propose an appropriate de-identification criterion considering the time required for re-identification. We focused on the case of using the k-anonymity model among various privacy models. We analyzed the time taken to re-identify data according to the change in the k value. We used a re-identification method based on linkability. As a result of the analysis, we determined which k value is appropriate. If the generalized model can be developed by results of this paper, the model can be used to define the appropriate level of de-identification in various laws and regulations.

Identification of Volterra Kernels of Nonlinear Van do Vusse Reactor

  • Kashiwagi, Hiroshi;Rong, Li
    • Transactions on Control, Automation and Systems Engineering
    • /
    • v.4 no.2
    • /
    • pp.109-113
    • /
    • 2002
  • Van de Vusse reactor is known as a highly nonlinear chemical process and has been considered by a number of researchers as a benchmark problem for nonlinear chemical process. Various identification methods for nonlinear system are also verified by applying these methods to Van de Vusse reactor. From the point of view of identification, only the Volterra kernel of second order has been obtained until now. In this paper, the authors show that Volterra kernels of nonlinear Van de Vusse reactor of up to 3rd order are obtained by use of M-sequence correlation method. A pseudo-random M-sequence is applied to Van de Vusse reactor as an input and its output is measured. Taking the crosscorrelation function between the input and the output, we obtain up to 3rd order Volterra kernels, which is the highest order Volterra kernel obtained until now for Van de Vusse reactor. Computer simulations show that when Van de Vusse chemical process is identified by use of up to 3rd order Volterra kernels, a good agreement is observed between the calculated output and the actual output.

Research on the development of automated tools to de-identify personal information of data for AI learning - Based on video data - (인공지능 학습용 데이터의 개인정보 비식별화 자동화 도구 개발 연구 - 영상데이터기반 -)

  • Hyunju Lee;Seungyeob Lee;Byunghoon Jeon
    • Journal of Platform Technology
    • /
    • v.11 no.3
    • /
    • pp.56-67
    • /
    • 2023
  • Recently, de-identification of personal information, which has been a long-cherished desire of the data-based industry, was revised and specified in August 2020. It became the foundation for activating data called crude oil[2] in the fourth industrial era in the industrial field. However, some people are concerned about the infringement of the basic rights of the data subject[3]. Accordingly, a development study was conducted on the Batch De-Identification Tool, a personal information de-identification automation tool. In this study, first, we developed an image labeling tool to label human faces (eyes, nose, mouth) and car license plates of various resolutions to build data for training. Second, an object recognition model was trained to run the object recognition module to perform de-identification of personal information. The automated personal information de-identification tool developed as a result of this research shows the possibility of proactively eliminating privacy violations through online services. These results suggest possibilities for data-based industries to maximize the value of data while balancing privacy and utilization.

  • PDF

Comparison of Korean Speech De-identification Performance of Speech De-identification Model and Broadcast Voice Modulation (음성 비식별화 모델과 방송 음성 변조의 한국어 음성 비식별화 성능 비교)

  • Seung Min Kim;Dae Eol Park;Dae Seon Choi
    • Smart Media Journal
    • /
    • v.12 no.2
    • /
    • pp.56-65
    • /
    • 2023
  • In broadcasts such as news and coverage programs, voice is modulated to protect the identity of the informant. Adjusting the pitch is commonly used voice modulation method, which allows easy voice restoration to the original voice by adjusting the pitch. Therefore, since broadcast voice modulation methods cannot properly protect the identity of the speaker and are vulnerable to security, a new voice modulation method is needed to replace them. In this paper, using the Lightweight speech de-identification model as the evaluation target model, we compare speech de-identification performance with broadcast voice modulation method using pitch modulation. Among the six modulation methods in the Lightweight speech de-identification model, we experimented on the de-identification performance of Korean speech as a human test and EER(Equal Error Rate) test compared with broadcast voice modulation using three modulation methods: McAdams, Resampling, and Vocal Tract Length Normalization(VTLN). Experimental results show VTLN modulation methods performed higher de-identification performance in both human tests and EER tests. As a result, the modulation methods of the Lightweight model for Korean speech has sufficient de-identification performance and will be able to replace the security-weak broadcast voice modulation.

A study on the method of measuring the usefulness of De-Identified Information using Personal Information

  • Kim, Dong-Hyun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.6
    • /
    • pp.11-21
    • /
    • 2022
  • Although interest in de-identification measures for the safe use of personal information is growing at home and abroad, cases where de-identified information is re-identified through insufficient de-identification measures and inferences are occurring. In order to compensate for these problems and discover new technologies for de-identification measures, competitions to compete on the safety and usefulness of de-identified information are being held in Korea and Japan. This paper analyzes the safety and usefulness indicators used in these competitions, and proposes and verifies new indicators that can measure usefulness more efficiently. Although it was not possible to verify through a large population due to a significant shortage of experts in the fields of mathematics and statistics in the field of de-identification processing, very positive results could be derived for the necessity and validity of new indicators. In order to safely utilize the vast amount of public data in Korea as de-identified information, research on these usefulness metrics should be continuously conducted, and it is expected that more active research will proceed starting with this thesis.