• Title/Abstract/Keywords: sampling bias

183 search results, processing time 0.026 seconds

Adjusting sampling bias in case-control genetic association studies

  • Seo, Geum Chu;Park, Taesung
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 25, No. 5
    • /
    • pp.1127-1135
    • /
    • 2014
  • Genome-wide association studies (GWAS) are designed to discover genetic variants such as single nucleotide polymorphisms (SNPs) that are associated with human complex traits. Although there is increasing interest in applying GWAS methodologies to population-based cohorts, many published GWAS have adopted a case-control design, which raises the issue of sampling bias in both case and control samples. Because cases and controls have unequal selection probabilities, the samples are not representative of the population that they are purported to represent. Therefore, non-random sampling in a case-control study can lead to inconsistent and biased estimates of SNP-trait associations. In this paper, we propose inverse-probability-of-sampling weights based on disease prevalence to eliminate case-control sampling bias in estimation and testing for association between SNPs and quantitative traits. We apply the proposed method to data from the Korea Association Resource project and show that the standard estimators applied to the weighted data yield unbiased estimates.
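
The abstract above does not give the weight formula, so the following is only a minimal sketch of one common form of prevalence-based inverse-probability weighting. It assumes cases and controls are each sampled at random from their own population stratum; the function names `prevalence_weights` and `weighted_mean` and the toy data are illustrative and are not taken from the paper.

```python
def prevalence_weights(is_case, prevalence):
    """Return one sampling weight per subject.

    Assumption: cases and controls are each a simple random sample from
    their own population stratum, so each weight is the population share
    of the stratum divided by its share of the sample.
    """
    n = len(is_case)
    n_cases = sum(1 for c in is_case if c)
    n_controls = n - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    w_case = prevalence / (n_cases / n)
    w_control = (1.0 - prevalence) / (n_controls / n)
    return [w_case if c else w_control for c in is_case]


def weighted_mean(values, weights):
    """Weighted mean of a quantitative trait."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy data: half of the sample are cases, but the assumed prevalence is 10%.
is_case = [True, True, False, False]
trait = [10.0, 10.0, 0.0, 0.0]
weights = prevalence_weights(is_case, prevalence=0.10)

print(sum(trait) / len(trait))        # unweighted mean, dominated by cases
print(weighted_mean(trait, weights))  # cases down-weighted to the assumed prevalence
```

The same weights could be passed to a weighted regression of the trait on SNP genotype; that step is omitted here because the abstract does not specify the model.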

Time-Balanced Quota Sampling for Telephone Survey

  • 허명회;황진모
    • 한국조사연구학회지:조사연구
    • /
    • Vol. 7, No. 2
    • /
    • pp.39-52
    • /
    • 2006
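  • Most survey research organizations in Korea conduct telephone surveys using quota sampling by region, sex, and age group. On weekdays, however, at-home rates differ sharply across individuals according to sociodemographic characteristics, which raises concern about systematic respondent selection bias. To address this problem, we propose 'time-balanced quota sampling', which adds the survey time slot as a quota variable, and 'time-balanced quasi-quota sampling', which partially relaxes the quota for the evening time slot. Using the raw data of the Time Use Survey collected by Statistics Korea in 2004 as a hypothetical population, we compare Monte Carlo samples obtained with the new quota sampling methods against those obtained with the existing quota sampling method.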

Efficient Markov Chain Monte Carlo for Bayesian Analysis of Neural Network Models

  • Paul E. Green;Changha Hwang;Lee, Sangbock
    • Journal of the Korean Statistical Society
    • /
    • Vol. 31, No. 1
    • /
    • pp.63-75
    • /
    • 2002
  • Most attempts at Bayesian analysis of neural networks involve hierarchical modeling. We believe that similar results can be obtained with simpler models that require less computational effort, as long as appropriate restrictions are placed on parameters in order to ensure propriety of posterior distributions. In particular, we adopt a model first introduced by Lee (1999) that utilizes an improper prior for all parameters. Straightforward Gibbs sampling is possible, with the exception of the bias parameters, which are embedded in nonlinear sigmoidal functions. In addition to the problems posed by nonlinearity, direct sampling from the posterior distributions of the bias parameters is complicated by the duplication of hidden nodes, which is a source of multimodality. In this regard, we focus on sampling from the marginal posterior distribution of the bias parameters with Markov chain Monte Carlo methods that combine traditional Metropolis sampling with a slice sampler described by Neal (1997, 2001). The methods are illustrated with data examples that are largely confined to the analysis of nonparametric regression models.
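
To make the "traditional Metropolis sampling" ingredient mentioned above concrete, here is a minimal random-walk Metropolis sketch for a one-dimensional target. It is not the authors' combined Metropolis and slice sampler: the standard normal target, the step size, and the function name `metropolis` are all assumptions chosen only for illustration.

```python
import math
import random


def metropolis(log_density, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler for a one-dimensional target.

    log_density may be unnormalized; only differences of log densities
    enter the acceptance rule.
    """
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples


def log_standard_normal(x):
    """Stand-in target (assumption): unnormalized standard normal."""
    return -0.5 * x * x


rng = random.Random(0)
draws = metropolis(log_standard_normal, start=0.0, n_steps=20000,
                   step_size=1.0, rng=rng)
mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, variance)  # should be close to 0 and 1 for this target
```

For a multimodal posterior such as the one the abstract describes, a plain random walk like this can get stuck in one mode, which is one motivation for combining it with other moves.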

Comparison of Latin Hypercube Sampling and Simple Random Sampling Applied to Neural Network Modeling of HfO2 Thin Film Fabrication

  • Lee, Jung-Hwan;Ko, Young-Don;Yun, Il-Gu;Han, Kyong-Hee
    • Transactions on Electrical and Electronic Materials
    • /
    • Vol. 7, No. 4
    • /
    • pp.210-214
    • /
    • 2006
  • In this paper, two sampling methods, Latin hypercube sampling (LHS) and simple random sampling, were compared to improve the modeling speed of a neural network model. The sampling methods were used to generate the initial weight and bias sets. Electrical characteristic data for $HfO_2$ thin film were used as modeling data. Ten initial parameter sets, each consisting of initial weights and biases, were generated using LHS and simple random sampling, respectively. Modeling was performed with the generated initial parameters, and the number of epochs was measured. The other network parameters were fixed. The 20 iteratively obtained minimum epoch numbers for LHS and simple random sampling were analyzed by a nonparametric method because of their nonnormality.
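
The difference between the two sampling methods compared above can be sketched as follows. This is an illustrative implementation on the unit hypercube, not the authors' code; the function names, the number of dimensions, and the seed are assumptions, and mapping the points to an actual range of initial weights and biases is left out.

```python
import random


def simple_random_sample(n_samples, n_dims, rng):
    """Draw every coordinate independently and uniformly from [0, 1)."""
    return [[rng.random() for _ in range(n_dims)] for _ in range(n_samples)]


def latin_hypercube_sample(n_samples, n_dims, rng):
    """Latin hypercube sample on [0, 1)^n_dims.

    Each dimension is cut into n_samples equal-width strata, one point is
    drawn in each stratum, and the strata are shuffled independently per
    dimension before the coordinates are combined into points.
    """
    columns = []
    for _ in range(n_dims):
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_samples)]


rng = random.Random(0)
lhs_points = latin_hypercube_sample(10, 3, rng)
srs_points = simple_random_sample(10, 3, rng)

# With LHS every stratum of every dimension holds exactly one point;
# simple random sampling gives no such guarantee.
for d in range(3):
    print(sorted(int(p[d] * 10) for p in lhs_points))
```

Each row of `lhs_points` could then be rescaled to the chosen initialization range and used as one initial parameter set.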

The Weighting Adjustment for Unit Nonresponse in the Stratified Sampling

  • 염준근;손창균
    • 품질경영학회지
    • /
    • Vol. 26, No. 3
    • /
    • pp.82-99
    • /
    • 1998
  • In sample surveys, nonresponse reduces the precision of the estimator because of the nonresponse bias of the estimator. Deville et al. (1993) considered the generalized raking procedure with auxiliary information under five distance measures for reducing the nonresponse bias of the estimator. This paper extends the classical weighting adjustment of Deville et al. (1993) to the stratified sampling case with three of the five measures.
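
As background for the abstract above, the sketch below shows the simplest form of weighting adjustment for unit nonresponse in a stratified sample: within each stratum, the design weight is inflated by the inverse of the stratum response rate. This is a plain weighting-class adjustment, not the generalized raking procedure of Deville et al. (1993), and the function name, data layout, and toy numbers are assumptions.

```python
def nonresponse_adjusted_mean(strata):
    """Estimate a population mean from a stratified sample with nonresponse.

    Each stratum is a dict with:
      "N"         population size of the stratum
      "n"         number of units sampled in the stratum
      "responses" observed values from the units that responded
    Assumption: within a stratum, respondents behave like a random
    subsample of the sampled units.
    """
    weighted_total = 0.0
    population = 0
    for stratum in strata:
        r = len(stratum["responses"])
        if r == 0:
            raise ValueError("every stratum needs at least one respondent")
        design_weight = stratum["N"] / stratum["n"]  # inverse sampling fraction
        adjustment = stratum["n"] / r                # inverse response rate
        weight = design_weight * adjustment          # equals N / r
        weighted_total += weight * sum(stratum["responses"])
        population += stratum["N"]
    return weighted_total / population


strata = [
    {"N": 100, "n": 10, "responses": [1.0, 3.0]},   # 2 of 10 responded
    {"N": 300, "n": 10, "responses": [5.0] * 5},    # 5 of 10 responded
]
print(nonresponse_adjusted_mean(strata))
```

Raking-type methods go further by calibrating the weights to known auxiliary totals, which this sketch does not attempt.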

Sampling Bias of Discontinuity Orientation Measurements for Rock Slope Design in Linear Sampling Technique: A Case Study of Rock Slopes in Western North Carolina

  • 박혁진
    • 한국지반공학회논문집
    • /
    • Vol. 16, No. 1
    • /
    • pp.145-155
    • /
    • 2000
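  • The orientation of discontinuities plays a very important role in evaluating rock slope stability because it affects excessive deformation and the stability of the rock mass. Linear sampling techniques, such as borehole measurements or scanline measurements on outcrops, are commonly used to measure discontinuity orientations, but data acquired with these techniques are easily biased depending on the orientation of the sampling line. Even when a weighting factor is applied to correct this bias, the bias is not easily corrected when data are acquired along a sampling line of a particular orientation. That is, when the linear sampling line is parallel to the discontinuities, most discontinuities parallel to the line are not included in the survey results, which can cause serious errors in identifying discontinuity orientations. In this study, orientation data collected along vertical sampling lines (boreholes) were compared with orientation data collected along horizontal sampling lines (scanlines). The orientation data collected by the two methods differ greatly, which hinders determination of the representative orientations of the discontinuities. An equal-area polar stereo net was used to compare the dip-angle distributions of the discontinuities and the data collected along horizontal and vertical sampling lines.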
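
The abstract does not say which weighting factor was used. A widely used choice for linear sampling is the Terzaghi-style correction, in which each measured discontinuity is weighted by 1/|cos δ|, where δ is the angle between the sampling line and the normal to the discontinuity. The sketch below assumes that form; the function names, the north-east-down axis convention, and the cap `max_weight=10.0` are illustrative assumptions.

```python
import math


def line_vector(trend_deg, plunge_deg):
    """Unit vector (north, east, down) of a line with the given trend and plunge."""
    t, p = math.radians(trend_deg), math.radians(plunge_deg)
    return (math.cos(p) * math.cos(t), math.cos(p) * math.sin(t), math.sin(p))


def plane_normal(dip_direction_deg, dip_deg):
    """Unit normal (north, east, down) of a plane given its dip direction and dip."""
    d, b = math.radians(dip_direction_deg), math.radians(dip_deg)
    return (-math.sin(b) * math.cos(d), -math.sin(b) * math.sin(d), math.cos(b))


def terzaghi_weight(line_trend, line_plunge, dip_direction, dip, max_weight=10.0):
    """Weight 1 / |cos(delta)| for one discontinuity, capped at max_weight.

    delta is the angle between the sampling line and the plane normal.
    The cap (an assumed value) keeps the weight finite near the blind
    zone, where the plane is almost parallel to the sampling line.
    """
    line = line_vector(line_trend, line_plunge)
    normal = plane_normal(dip_direction, dip)
    cos_delta = abs(sum(a * b for a, b in zip(line, normal)))
    if cos_delta * max_weight <= 1.0:
        return max_weight
    return 1.0 / cos_delta


# Vertical borehole (plunge 90 degrees) crossing joints of increasing dip.
for dip in (0, 60, 90):
    print(dip, terzaghi_weight(0, 90, 0, dip))
```

The capped weight for the vertical joint illustrates the point made in the abstract: discontinuities parallel to the sampling line are rarely intersected at all, so no finite weight can fully restore them.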

Estimation of P(X > Y) when X and Y are dependent random variables using different bivariate sampling schemes

  • Samawi, Hani M.;Helu, Amal;Rochani, Haresh D.;Yin, Jingjing;Linder, Daniel
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 23, No. 5
    • /
    • pp.385-397
    • /
    • 2016
  • Stress-strength models have been intensively investigated in the literature with regard to estimating the reliability $\theta$ = P(X > Y) using parametric and nonparametric approaches under different sampling schemes when X and Y are independent random variables. In this paper, we consider the problem of estimating $\theta$ when (X, Y) are dependent random variables with a bivariate underlying distribution. The empirical and kernel estimates of $\theta$ = P(X > Y), based on bivariate ranked set sampling (BVRSS), are considered when (X, Y) are paired dependent continuous random variables. The estimators obtained are compared to their counterparts under bivariate simple random sampling (BVSRS) via the bias and mean square error (MSE). We demonstrate that the suggested estimators based on BVRSS are more efficient than those based on BVSRS. A simulation study is conducted to gain insight into the performance of the proposed estimators. A real data example is provided to illustrate the process.
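
For paired observations drawn by simple random sampling, the empirical estimate of $\theta$ = P(X > Y) is simply the fraction of pairs in which X exceeds Y. The sketch below shows only that baseline; the ranked set sampling design and the kernel estimator from the abstract are not reproduced, and the function name and toy pairs are assumptions.

```python
def empirical_theta(pairs):
    """Empirical estimate of P(X > Y) from paired observations (x, y)."""
    if not pairs:
        raise ValueError("need at least one pair")
    return sum(1 for x, y in pairs if x > y) / len(pairs)


# Toy paired data: X exceeds Y in the first and third pairs only.
pairs = [(2.1, 1.0), (0.5, 0.7), (3.0, 2.2), (1.4, 1.9)]
print(empirical_theta(pairs))
```

Because the indicator is evaluated within each pair, the estimate does not require X and Y to be independent.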

A Novel Simulation Architecture of Configurational-Bias Gibbs Ensemble Monte Carlo for the Conformation of Polyelectrolytes Partitioned in Confined Spaces

  • Chun, Myung-Suk
    • Macromolecular Research
    • /
    • Vol. 11, No. 5
    • /
    • pp.393-397
    • /
    • 2003
  • By applying a configurational-bias Gibbs ensemble Monte Carlo algorithm, priority simulation results regarding the conformation of non-dilute polyelectrolytes in solvents are obtained. Solutions of freely-jointed chains are considered, and a new method termed strandwise configurational-bias sampling is developed to effectively overcome the difficulty of transferring polymer chains. The structure factors of polyelectrolytes in the bulk as well as in the confined space are estimated for varying polymer charge density.

BERT-Based Logits Ensemble Model for Gender Bias and Hate Speech Detection

  • Sanggeon Yun;Seungshik Kang;Hyeokman Kim
    • Journal of Information Processing Systems
    • /
    • Vol. 19, No. 5
    • /
    • pp.641-651
    • /
    • 2023
  • Malicious hate speech and gender bias comments are common in online communities, causing social problems in our society. Gender bias and hate speech detection has been investigated. However, it is difficult because there are diverse ways to express them in words. To solve this problem, we attempted to detect malicious comments in a Korean hate speech dataset constructed in 2020. We explored bidirectional encoder representations from transformers (BERT)-based deep learning models utilizing hyperparameter tuning, data sampling, and logits ensembles with a label distribution. We evaluated our model in Kaggle competitions for gender bias, general bias, and hate speech detection. For gender bias detection, an F1-score of 0.7711 was achieved using an ensemble of the Soongsil-BERT and KcELECTRA models. The general bias task included the gender bias task, and the ensemble model achieved the best F1-score of 0.7166.
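
A logits ensemble, in its simplest form, averages the per-class logits of several classifiers before taking the argmax. The sketch below shows that basic idea only; how the paper combines the logits with a label distribution is not described in the abstract and is not reproduced here. The function name and the logit values for the two placeholder models are invented for illustration.

```python
def ensemble_predict(logits_per_model):
    """Average class logits across models and return (mean_logits, predicted_class)."""
    if not logits_per_model:
        raise ValueError("need logits from at least one model")
    n_models = len(logits_per_model)
    n_classes = len(logits_per_model[0])
    mean_logits = [
        sum(model_logits[c] for model_logits in logits_per_model) / n_models
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: mean_logits[c])
    return mean_logits, predicted


# Placeholder logits for one comment from two hypothetical models
# (class 0 = not biased, class 1 = biased).
model_a = [2.0, 0.5]   # confident in class 0
model_b = [0.1, 1.0]   # leans toward class 1
mean_logits, predicted = ensemble_predict([model_a, model_b])
print(mean_logits, predicted)
```

Averaging logits rather than hard votes lets a confident model outweigh an uncertain one, which is why the ensemble here sides with the first model.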

Off-level Sampling Method for Bias Stabilization of an Electro-Optic Mach-Zehnder Modulator

  • 양충열;홍현하;김해근
    • 한국통신학회논문지
    • /
    • Vol. 25, No. 1B
    • /
    • pp.42-47
    • /
    • 2000
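  • We demonstrate a new method for bias stabilization of an electro-optic modulator that maximizes the switching extinction ratio under burst-mode packet traffic conditions. By sampling and minimizing the off-level output power of the modulator, it operates as an optical gate with a high extinction ratio regardless of changes in packet traffic density.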
