• Title/Summary/Keyword: Stratified two-stage cluster sampling


Unbiased Balanced Half-Sample Variance Estimation in Stratified Two-stage Sampling

  • Kim, Kyu-Seong
    • Journal of the Korean Statistical Society / v.27 no.4 / pp.459-469 / 1998
  • The balanced half-sample (BHS) method is a simple variance estimation method for complex sampling designs. Because it is simple and flexible, it has been widely used in large-scale sample surveys. However, the usual BHS method overestimates the true variance in without-replacement sampling and in two-stage cluster sampling. Focusing on this point, we propose an unbiased BHS variance estimator for stratified two-stage cluster sampling and describe how to implement it. Finally, a partially balanced half-sample design is explained as a tool for reducing the number of replications required by the proposed estimator.
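For orientation, the usual BHS estimator that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the unbiased estimator proposed in the paper: it assumes exactly two sampled PSUs per stratum, a linear estimator (a weighted mean of stratum means), and a Sylvester-type Hadamard matrix to balance the half-samples. The function names and the example data are hypothetical.

```python
def hadamard(order):
    """Sylvester construction of a +1/-1 Hadamard matrix; order must be a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def bhs_variance(psu_means, weights):
    """Usual BHS variance estimate for a design with two PSUs per stratum.

    psu_means: list of (y_h1, y_h2) PSU-level estimates, one pair per stratum
    weights:   stratum weights W_h (assumed to sum to 1)
    Returns (full-sample estimate, BHS variance estimate).
    """
    n_strata = len(psu_means)
    reps = 1
    while reps <= n_strata:  # need more replicates than strata
        reps *= 2
    signs = hadamard(reps)

    full = sum(w * (a + b) / 2 for w, (a, b) in zip(weights, psu_means))

    total = 0.0
    for r in range(reps):
        # Column 0 is all +1, so strata use columns 1..n_strata.
        est = sum(
            w * (pair[0] if signs[r][h + 1] == 1 else pair[1])
            for h, (w, pair) in enumerate(zip(weights, psu_means))
        )
        total += (est - full) ** 2
    return full, total / reps


# Hypothetical data: three strata, two PSU means each.
example_means = [(10.0, 14.0), (20.0, 18.0), (5.0, 9.0)]
example_weights = [0.5, 0.3, 0.2]
estimate, variance = bhs_variance(example_means, example_weights)
```

Because the replicate signs are orthogonal across strata, the result for a linear estimator equals the closed form sum of W_h^2 (y_h1 - y_h2)^2 / 4. That closed form carries no finite population correction and no second-stage component, which is the kind of gap the abstract says leads to overestimation.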


A Optimal Cluster Size in Stratified Two-Stage Cluster Sampling (층화 2-단 표본 추출시 최적 집락의 크기 결정)

  • 신민웅;신기일
    • The Korean Journal of Applied Statistics / v.13 no.2 / pp.207-224 / 2000
  • Generally, cluster size is predetermined in stratified two-stage cluster sampling. However, when cluster sizes vary greatly, one may want to make them approximately equal. In this paper we study the optimal cluster size in stratified two-stage cluster sampling. We also find the optimal primary sampling unit sizes and optimal secondary sampling unit sizes under a given cost restriction.


A composite estimator for stratified two stage cluster sampling

  • Lee, Sang Eun;Lee, Pu Reum;Shin, Key-Il
    • Communications for Statistical Applications and Methods / v.23 no.1 / pp.47-55 / 2016
  • Stratified cluster sampling has been widely used for effective parameter estimation because it reduces time and cost. The probability proportional to size (PPS) sampling method is used when the numbers of cluster elements differ significantly. However, simple random sampling (SRS) is commonly used for simplicity when the numbers of cluster elements are almost the same. It is also known that the ratio estimator performs well when the total number of population elements is known. However, the two-stage cluster estimator should be used if the total number of elements in the population is unknown or inaccurate. In this study we suggest a composite estimator that combines the ratio estimator and the two-stage cluster estimator to obtain a better estimate under certain population circumstances. Simulation studies are conducted to compare the suggested estimator with the two other estimators.

Two-stage Sampling for Estimation of Prevalence of Bovine Tuberculosis (이단계표본추출을 이용한 소결핵병 유병률 추정)

  • Pak, Son-Il
    • Journal of Veterinary Clinics / v.28 no.4 / pp.422-426 / 2011
  • For a national survey that targets a wide geographic region or an entire country, a multi-stage sampling approach is widely used to overcome the problems of simple random sampling, to consider both herd- and animal-level factors associated with disease occurrence, and to adjust for the clustering effect of disease in the population when calculating sample size. The aim of this study was to establish the sample size for estimating the prevalence of bovine tuberculosis (TB) in Korea using a stratified two-stage sampling design. The sample size was determined by taking into account the possible clustering of TB-infected animals within individual herds, to increase the reliability of survey results. In this study, the country was stratified into nine provinces (administrative units), and the herd, the primary sampling unit, was treated as a cluster. For all analyses, a design effect of 2, a between-cluster prevalence of 50% (to yield the maximum sample size), and a mean herd size of 65 were assumed because of the lack of available information. Using a two-stage sampling scheme, the number of cattle sampled per herd was 65, regardless of the confidence level, prevalence, and mean herd size examined. The number of clusters to be sampled at a 95% level of confidence was estimated to be 296, 74, 33, 19, 12, and 9 for a desired precision of 0.01, 0.02, 0.03, 0.04, 0.05, and 0.06, respectively. Therefore, the total sample size with a 95% confidence level was 172,872, 43,218, 19,224, 10,818, 6,930, and 4,806 for desired precision ranging from 0.01 to 0.06. The sample size increased with the desired precision and the design effect. When the number of cattle sampled per herd was fixed at values from 5 to 40 in 5-head intervals, the total sample size with a 95% confidence level was estimated to be 6,480, 10,080, 13,770, 17,280, 20,925, 24,570, 28,350, and 31,680, respectively. The percent increase in total sample size resulting from the use of an intra-cluster correlation coefficient of 0.3 was 22.2, 32.1, 36.3, 39.6, 41.9, 42.9, 42.2, and 44.3%, respectively, in comparison with a coefficient of 0.2.
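The cluster counts quoted in this abstract can be approximated with the textbook sample size formula for a proportion inflated by a design effect. The sketch below is an assumption about the calculation, since the abstract does not state its formula. It takes z = 1.96 for 95% confidence and uses the conventional relation deff = 1 + (m - 1) * ICC for the intra-cluster correlation. All function names are hypothetical.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% confidence level


def design_effect(cluster_size, icc):
    """Conventional design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


def clusters_needed(precision, prevalence=0.5, deff=2.0, per_cluster=65, z=Z_95):
    """Number of herds to sample per stratum under the assumed formula.

    n_srs = z^2 * p * (1 - p) / d^2 is the simple-random-sampling size;
    it is multiplied by the design effect and divided by animals per herd.
    """
    n_srs = z ** 2 * prevalence * (1 - prevalence) / precision ** 2
    return math.ceil(n_srs * deff / per_cluster)


precisions = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
cluster_counts = [clusters_needed(d) for d in precisions]
print(cluster_counts)  # [296, 74, 33, 19, 12, 9]

# Relative increase in design effect when the ICC moves from 0.2 to 0.3,
# for 5 animals sampled per herd.
increase_pct = (design_effect(5, 0.3) / design_effect(5, 0.2) - 1) * 100
```

With the abstract's inputs (prevalence 50%, design effect 2, 65 cattle per herd), this formula gives the same six cluster counts as listed. The ICC comparison for 5 animals per herd gives about 22.2%, in line with the first reported percentage; the later percentages may differ slightly depending on how the original study rounded.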

An Evaluation of Sampling Design for Estimating an Epidemiologic Volume of Diabetes and for Assessing Present Status of Its Control in Korea (우리나라 당뇨병의 역학적 규모와 당뇨병 관리현황 파악을 위한 표본설계의 평가)

  • Lee, Ji-Sung;Kim, Jai-Yong;Baik, Sei-Hyun;Park, Ie-Byung;Lee, June-Young
    • Journal of Preventive Medicine and Public Health / v.42 no.2 / pp.135-142 / 2009
  • Objectives: An appropriate sampling strategy for estimating the epidemiologic volume of diabetes was evaluated through a simulation. Methods: We analyzed about 250 million medical insurance claims submitted to the Health Insurance Review & Assessment Service in 2003 with diabetes as a principal or subsequent diagnosis at least once during the year. The database was restructured into a 'patient-hospital profile' of 3,676,164 cases, and then into a 'patient profile' of 2,412,082 observations. The patient profile data were then used to test the validity of a proposed sampling frame and of sampling methods for developing diabetes-related epidemiologic indices. Results: The simulation study showed that a stratified two-stage cluster sampling design with a total sample size of 4,000 will provide an estimate of 57.04% (95% prediction range, 49.83-64.24%) for the treatment prescription rate of diabetes. The proposed sampling design first stratifies the nation by area into "metropolitan/city/county" and by type of hospital into "tertiary/secondary/primary/clinic" with a proportion of 5:10:10:75. Hospitals are then randomly selected within the strata as primary sampling units, followed by a random selection of patients within the hospitals as secondary sampling units. The difference between the estimate and the parameter value was projected to be less than 0.3%. Conclusions: The proposed sampling scheme will be applied to a subsequent nationwide field survey, not only to estimate the epidemiologic volume of diabetes but also to assess the present status of nationwide diabetes control.

Empirical Analysis on Rao-Scott First Order Adjustment for Two Population Homogeneity test Based on Stratified Three-Stage Cluster Sampling with PPS

  • Heo, Sunyeong
    • Journal of Integrative Natural Science / v.7 no.3 / pp.208-213 / 2014
  • Nationwide and/or large-scale sample surveys generally use complex sample designs. The traditional Pearson chi-square test is not appropriate for categorical data from complex samples. Rao and Scott suggested an adjustment to the Pearson chi-square test that uses the average of the eigenvalues of the design matrix of cell probabilities. This study compares the efficiency of the Rao-Scott first-order adjusted test with that of the Wald test for homogeneity between two populations, using data from the 2009 Gyeongnam regional education offices' customer satisfaction survey (2009 GREOCSS). The 2009 GREOCSS data were collected by stratified three-stage cluster sampling with probability proportional to size. The empirical results show that, for the 2009 GREOCSS data, the Rao-Scott adjusted test statistic, which uses only the variances of cell probabilities, is very close to the Wald test statistic, which uses the covariance matrix of cell probabilities. However, caution is needed when using the Rao-Scott first-order adjusted test statistic in place of the Wald test, because its efficiency decreases as the relative variance of the eigenvalues of the design matrix of cell probabilities increases, especially when the number of degrees of freedom is small.

Variance estimation of a double expanded estimator for two-phase sampling

  • Mingue Park
    • Communications for Statistical Applications and Methods / v.30 no.4 / pp.403-410 / 2023
  • Two-phase sampling, which was first introduced by Neyman (1938), has various applications in different forms. Variance estimation for two-phase sampling has been an important research topic because the conventional variance estimators used in most software do not work for it. In this paper, we consider variance estimation for two-phase sampling in which stratified two-stage cluster sampling designs are used in both phases. By defining a conditionally unbiased estimator of an approximate variance estimator, which is calculable when all elements in the first-phase sample are observed, we propose an explicit form of the variance estimator of the double expanded estimator for a two-phase sample. A small simulation study shows that the proposed variance estimator has negligible bias and small variance. The suggested variance estimator is also applicable to other linear estimators of the population total or mean if appropriate residuals are defined.

An Empirical Study on Classification of the Housing Lifestyle in Urban (현대 도시의 주거생활양식 유형 분류에 관한 연구)

  • MockWhaChoi
    • Journal of the Korean housing association / v.2 no.1 / pp.1-12 / 1991
  • The purpose of this study was to classify the types of housing life style. Housing life style was measured using four variables: furniture usage pattern, space usage pattern, family living pattern, and heating system. A final instrument was developed through two-stage pilot surveys. The respondents were 1,292 home-makers of the middle and high economic classes in Seoul and Daejeon, selected through a stratified random sampling technique. Data were analyzed using SAS computer packages. The statistics used were frequency, percentage, Pearson's correlation coefficient, multiple linear regression, the chi-square (χ²) test, and cluster analysis. The major findings were as follows: five representative types of housing life style were found through cluster analysis. They were the conventional minimum-level life style, the conventional optimum family-centered life style, the eclectic family-centered life style, the contemporary optimum family-centered life style, and the contemporary social, leisure-oriented life style.


A study on the sample design of the fishery household economy survey (어가경제조사 표본설계에 관한 연구)

  • 김규성;전종우;박홍래
    • The Korean Journal of Applied Statistics / v.8 no.2 / pp.43-54 / 1995
  • The fishery household economy survey is a sample survey that produces estimates of the fishery household economy and fishery management in Korea. We propose a sample design for this survey. The design is based on the results of the 1990 fishery census, and each Shi-Do is treated as a subpopulation for Shi-Do estimates. Samples are selected by stratified two-stage cluster sampling within each Shi-Do, and an income function is found for stratification. Fishery household income is estimated by a linear estimator.


A Study of Attitudes to Changed Health Care Delivery System in a Community (보건의료제도 변화에 대한 지역주민의 수용태도 분석)

  • Yu, Seung-Hum;Sohn, Myong-Sei;Park, Jong-Yeon
    • Journal of Preventive Medicine and Public Health / v.22 no.1 s.25 / pp.162-168 / 1989
  • This study was conducted to analyse attitudes to a new health care system in a rural community. The specific purpose of this thesis was to classify attitudes to the patient referral system in Kangwha county and to identify the factors affecting those attitudes. Sampling from the population was done by a multi-stage stratified cluster sampling method. The data were collected in Kangwha county through a structured interview survey for two weeks in June, 1957. Attitudes to the patient referral system were classified into four types based on answers to questions about awareness of the system, recognition of the necessity of the system, and opinions on the improvement of the system. The four types of attitudes were active acceptance (10.2%), partial acceptance (27.2%), refusal (35.8%), and indifference (26.7%). The respondent's age, educational level, age of the head of household, medical insurance fee, the number of ill family members, and the percentage of medical utilization by the family were the variables that affected the attitudes. The medical insurance fee, respondent's age, age of the head of household, and the percentage of medical utilization by the family were the statistically significant discriminant factors among the four types of attitudes.
