• Title/Summary/Keyword: De-identification Algorithms


A non-destructive method for elliptical cracks identification in shafts based on wave propagation signals and genetic algorithms

  • Munoz-Abella, Belen;Rubio, Lourdes;Rubio, Patricia
    • Smart Structures and Systems / v.10 no.1 / pp.47-65 / 2012
  • The presence of crack-like defects in mechanical and structural elements produces failures during their service life that in some cases can be catastrophic. Early detection of fatigue cracks is therefore particularly important, because they grow rapidly, with a propagation velocity that increases exponentially, and may lead to long out-of-service periods, heavy damage to machines, and severe economic consequences. In this work, a non-destructive method for the detection and identification of elliptical cracks in shafts based on stress wave propagation is proposed. The propagation of a stress wave in a cracked shaft has been numerically analyzed, and the numerical results have been used to detect and identify the crack through the genetic algorithm optimization method. The results obtained in this work allow the development of an on-line method for damage detection and identification in cracked shaft-like components using an easy-to-use, portable dynamic testing device.

The De-identification Technique Using Data Grouping in Relational Database (관계형 데이터베이스에서 데이터 그룹화를 이용한 익명화 처리 기법)

  • Park, Jun-Bum;Jin, Seung-Hun;Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.3 / pp.493-500 / 2015
  • The amount of personal information exposed on the Internet is increasing with the opening and sharing of public data, the growth of SNS (Social Network Services), and the growing volume of information shared between users. Exposed personal information can be used against targeted users through linkage attacks or background-knowledge attacks. De-identification models were introduced some years ago to prevent these attacks. 'k-anonymity' was introduced first, followed by '$\ell$-diversity' and 't-closeness', and diverse algorithms continue to be proposed to improve performance. However, industry and the public sector need a complete system for the de-identification process rather than better performance from an individual de-identification algorithm. This paper explains a de-identification technique that implements 'k-anonymity', '$\ell$-diversity', and 't-closeness' using a QI (Quasi-Identifier) grouping method in a relational database.
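
The QI grouping idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the column names (`age`, `zip`, `disease`), and the sample records are all hypothetical, and the records are assumed to be generalized already. In a relational database the grouping step would correspond to a `GROUP BY` over the quasi-identifier columns.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, qi_columns):
    """Group records that share the same quasi-identifier (QI) values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[col] for col in qi_columns)
        groups[key].append(record)
    return dict(groups)


def is_k_anonymous(records, qi_columns, k):
    """True if every QI group contains at least k records."""
    groups = group_by_quasi_identifiers(records, qi_columns)
    return all(len(rows) >= k for rows in groups.values())


def is_l_diverse(records, qi_columns, sensitive_column, l):
    """True if every QI group holds at least l distinct sensitive values."""
    groups = group_by_quasi_identifiers(records, qi_columns)
    return all(
        len({row[sensitive_column] for row in rows}) >= l
        for rows in groups.values()
    )


# Hypothetical, already-generalized sample data (not taken from the paper).
RECORDS = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "asthma"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
]
QI = ["age", "zip"]
```

With this sample, `is_k_anonymous(RECORDS, QI, 2)` holds because each QI group has two rows, while `is_l_diverse(RECORDS, QI, "disease", 2)` does not, because the second group contains only one distinct sensitive value.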

A pilot study of an automated personal identification process: Applying machine learning to panoramic radiographs

  • Ortiz, Adrielly Garcia;Soares, Gustavo Hermes;da Rosa, Gabriela Cauduro;Biazevic, Maria Gabriela Haye;Michel-Crosato, Edgard
    • Imaging Science in Dentistry / v.51 no.2 / pp.187-193 / 2021
  • Purpose: This study aimed to assess the usefulness of machine learning and automation techniques to match pairs of panoramic radiographs for personal identification. Materials and Methods: Two hundred panoramic radiographs from 100 patients (50 males and 50 females) were randomly selected from a private radiological service database. Initially, 14 linear and angular measurements of the radiographs were made by an expert. Eight ratio indices derived from the original measurements were applied to a statistical algorithm to match radiographs from the same patients, simulating a semi-automated personal identification process. Subsequently, measurements were automatically generated using a deep neural network for image recognition, simulating a fully automated personal identification process. Results: Approximately 85% of the radiographs were correctly matched by the automated personal identification process. In a limited number of cases, the image recognition algorithm identified 2 potential matches for the same individual. No statistically significant differences were found between measurements performed by the expert on panoramic radiographs from the same patients. Conclusion: Personal identification might be performed with the aid of image recognition algorithms and machine learning techniques. This approach will likely facilitate the complex task of personal identification by performing an initial screening of radiographs and matching ante-mortem and post-mortem images from the same individuals.

De-identifying Unstructured Medical Text and Attribute-based Utility Measurement (의료 비정형 텍스트 비식별화 및 속성기반 유용도 측정 기법)

  • Ro, Gun;Chun, Jonghoon
    • The Journal of Society for e-Business Studies / v.24 no.1 / pp.121-137 / 2019
  • De-identification removes personal information from a data set so that the remaining information cannot be linked to a specific individual. As a result, de-identification can lower the risk of exposing personal information during the collection, processing, storage, and distribution of information. Although there have been many studies on de-identification algorithms, protection models, and related topics, most of them are limited to structured data, and relatively few consider the de-identification of unstructured data. In the medical field in particular, where unstructured text is frequently used, many practitioners simply remove all personally identifiable information to lower the exposure risk, accepting that data utility is lowered accordingly. This study proposes a new method that applies the k-anonymity protection model to unstructured text in the medical field, where de-identification is mandatory because privacy protection is more critical than in other fields. The study also proposes a new utility metric so that the utility of a de-identified data set can be understood intuitively. If the results of this research are applied to the various industrial fields where unstructured text is used, we expect the utility of unstructured text containing personal information to increase.

IDENTIFICATION OF FALSIFIED DRUGS USING NEAR-INFRARED SPECTROSCOPY

  • Scafi, Sergio H.F.;Pasquini, Celio
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference / 2001.06a / pp.3112-3112 / 2001
  • Near-Infrared Spectroscopy (NIRS) was investigated for the identification of falsified drugs. The identification is based on comparing the NIR spectrum of a sample with a typical spectrum of an authentic drug using multivariate modelling and classification algorithms (PCA/SIMCA). Two spectrophotometers (Brimrose - Luminar 2000 and 2030), based on acousto-optic tunable filter (AOTF) technology and sharing the same controlling computer, software (Brimrose - Snap 2.03), and data acquisition electronics, were employed. The Luminar 2000 scans the range 850-1800 nm and was employed for transmittance/absorbance measurements of liquids with a transflectance optical bundle probe with a total optical path of 5 mm and a circular area of 0.5 $\textrm{cm}^2$. Model 2030 scans the range 1100-2400 nm and was employed for reflectance measurements of solid drugs. For each sample, 300 spectra, acquired in about 20 s, were averaged. Chemometric treatment of the spectral data, modelling, and classification were performed using the Unscrambler 7.5 software (CAMO Norway). This package provides the Principal Component Analysis (PCA) and SIMCA algorithms, used for modelling and classification, respectively. Initially, NIRS was evaluated for spectrum acquisition of various drugs, selected to cover the diversity of physico-chemical characteristics found among commercial products. Parameters that could affect the spectra of a given drug (especially if presented as solid tablets) were investigated, and the results showed that the first derivative can minimize spectral changes associated with tablet geometry, physical differences between tablet faces, and position relative to the probe beam. The effects of ambient humidity and temperature were also investigated. Humidity needs to be controlled during model construction, because it can cause spectral alterations that could lead to an authentic drug being wrongly classified if the factor is not considered by the model.
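
One way to see why a first derivative can reduce geometry and positioning effects is the following minimal Python sketch. It assumes, purely for illustration, that such effects appear mainly as a constant additive baseline shift in the spectrum; the abstract does not state this, and the absorbance values and function name below are hypothetical.

```python
def first_difference(spectrum):
    """Discrete first derivative: difference between adjacent points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Hypothetical absorbance values at evenly spaced wavelengths.
reference = [0.10, 0.12, 0.20, 0.35, 0.30, 0.18]

# Same spectrum with a constant baseline offset (assumed effect of a
# change in sample position; not a value from the paper).
shifted = [value + 0.05 for value in reference]

# Largest point-wise gap between the two raw spectra.
max_raw_gap = max(abs(a - b) for a, b in zip(reference, shifted))

# Largest point-wise gap after taking the first difference of each spectrum.
max_derivative_gap = max(
    abs(a - b)
    for a, b in zip(first_difference(reference), first_difference(shifted))
)
```

The raw spectra differ by the full offset at every point, while their first differences agree up to floating-point rounding, so a classifier trained on derivative spectra would not see the baseline shift.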


A Study on Efficient Data De-Identification Method for Blockchain DID

  • Min, Youn-A
    • International Journal of Internet, Broadcasting and Communication / v.13 no.2 / pp.60-66 / 2021
  • Blockchain is a technology that enables trust-based consensus and verification on a decentralized network. Distributed ID (DID) is based on a decentralized structure, and users have the right to manage their own ID. Interest in self-sovereign identity authentication has recently been increasing. This paper examines various data encryption methods among the data pseudonymization techniques available for blockchain use, as a means of transparent and safe sovereignty management of data. The public key technique (homomorphic encryption) has high flexibility and security because different algorithms are applied to the entire sentence for encryption and decryption; as a result, its computational efficiency decreases. The hash function method (MD5) maintains flexibility and offers higher security than two-way encryption methods, but it is exposed to the threat of collisions. Zero-knowledge proof builds on public key encryption and a mutual proof method, and complex formulas are applied to processes such as personal identification, key distribution, and digital signature. It requires a consensus and verification process, so its operation efficiency drops to the level of $O(\log_e N)$ to $O(N^2)$. This paper proposes data encryption processing for blockchain DID based on zero-knowledge proof, together with a one-way encryption method that takes into account the range and frequency of data use. Based on the content presented in this paper, a corrected zero-knowledge proof can be processed and data can be handled efficiently.
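
The one-way (hash-based) pseudonymization discussed in this abstract can be sketched as follows. This is a hedged illustration, not the paper's method: it swaps in HMAC with SHA-256 from Python's standard library instead of MD5, since the abstract itself notes MD5's collision risk, and the key, identifiers, and function name are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a one-way, keyed pseudonym (hex string) for an identifier."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key and identifiers, for illustration only.
KEY = b"example-key-do-not-reuse"
alice_1 = pseudonymize("alice@example.com", KEY)
alice_2 = pseudonymize("alice@example.com", KEY)
bob = pseudonymize("bob@example.com", KEY)
```

The same identifier always maps to the same pseudonym under a given key, so records can still be linked, while the original value cannot be read back from the digest. Keying the hash is a design choice made here so that someone without the key cannot rebuild the mapping by hashing guessed identifiers.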

Clustering Approaches to Identifying Gene Expression Patterns from DNA Microarray Data

  • Do, Jin Hwan;Choi, Dong-Kug
    • Molecules and Cells / v.25 no.2 / pp.279-288 / 2008
  • The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites, and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and dissimilarity metrics used, as well as on user-selectable parameters such as the desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering DNA microarray data, from crisp clustering algorithms such as hierarchical clustering, K-means, and self-organizing maps to complex clustering algorithms like fuzzy clustering.
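
As a concrete instance of the crisp clustering methods named in this abstract, the following is a minimal K-means sketch in plain Python. It is an illustration under stated assumptions rather than code from the review: the expression profiles are made-up values, the helper names are hypothetical, squared Euclidean distance is assumed as the dissimilarity measure, and the initial centroids are passed in explicitly because, as the abstract notes, results can depend on initial values.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of profiles."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centroids, max_iterations=100):
    """Cluster points; returns (labels, centroids)."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            new_centroids.append(mean_point(members) if members else old)
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical expression profiles (one tuple per gene, one value per condition).
PROFILES = [(1.0, 1.2, 0.9), (1.1, 1.0, 1.0), (5.0, 5.2, 4.9), (5.1, 4.8, 5.0)]
labels, centroids = k_means(PROFILES, initial_centroids=[PROFILES[0], PROFILES[2]])
```

Here the two low-expression genes and the two high-expression genes end up in separate clusters. Changing `initial_centroids` or the number of centroids is the simplest way to explore the sensitivity to user-selected parameters that the abstract warns about.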

Uncertainty in Operational Modal Analysis of Hydraulic Turbine Components

  • Gagnon, Martin;Tahan, S.-Antoine;Coutu, Andre
    • International Journal of Fluid Machinery and Systems / v.2 no.4 / pp.278-285 / 2009
  • Operational modal analysis (OMA) allows modal parameters, such as natural frequencies and damping, to be estimated solely from data collected during operation. However, a main shortcoming of these methods lies in evaluating the accuracy of the results. This paper explores the uncertainty and possible variations in the estimates of modal parameters for different operating conditions. Two algorithms based on the Least Squares Complex Exponential (LSCE) method are used to estimate the modal parameters. The uncertainties are calculated using a Monte-Carlo approach under the hypothesis of constant modal parameters at a given operating condition. Data collected, in collaboration with Andritz-Hydro Ltd, on two different stay vanes of an Andritz-Hydro Ltd Francis turbine are used. This paper presents an overview of the procedure and the results obtained.

NOISE SOURCE IDENTIFICATION WITH INCREASED SPATIAL RESOLUTION

  • Gade, Svend;Hald, Jorgen;Ginn, Bernard
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2012.10a / pp.636-642 / 2012
  • Delay-and-sum (DAS) Planar Beamforming has been a widely used Noise Source Identification Technique for the last decade. It is a quick, one-shot measurement technique that can map sources larger than the array itself. The resolvable source separation is proportional to both the distance between array and source and the wavelength, so the resolution is only good at medium to high frequencies. Improved algorithms using iterative de-convolution techniques offer up to ten times better resolution. The principle behind these techniques is described in this paper, and measurement examples from the automotive industry are presented.


High-Throughput Screening Technique for Microbiome using MALDI-TOF Mass Spectrometry: A Review

  • Mojumdar, Abhik;Yoo, Hee-Jin;Kim, Duck-Hyun;Cho, Kun
    • Mass Spectrometry Letters / v.13 no.4 / pp.106-114 / 2022
  • A rapid and reliable approach to the identification of microorganisms is a critical requirement for large-scale culturomics analysis. MALDI-TOF MS is a suitable technique that can be a better alternative to conventional biochemical and gene sequencing methods, as it is economical in terms of both cost and labor. This review covers the applications of MALDI-TOF MS for the comprehensive identification of microorganisms and for bacterial strain typing in culturomics-based approaches across various environmental studies, including bioremediation, plant sciences, agriculture, and food microbiology. However, the technique is restricted by insufficient coverage of the mass spectral database. To improve its application to the identification of novel isolates, the spectral database should be updated with the peptide mass fingerprints (PMF) of type strains, not only of clinically relevant microbes but also of microbes from various environmental sources. Further, the development of enhanced sample processing methods and new algorithms for automation and de-replication of isolates will increase its application in microbial ecology studies.