Title/Summary/Keyword: Sample pooling

Bayesian pooling for contingency tables from small areas

  • Jo, Aejung;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society, v.27 no.6, pp.1621-1629, 2016
  • This paper studies Bayesian pooling for the analysis of categorical data from small areas. Many surveys consist of categorical data collected on a contingency table in each area. Statistical inference for small areas requires considerable care because the subpopulation sample sizes are usually very small. Typically, a hierarchical Bayesian model is used to pool subpopulation data; however, the customary hierarchical Bayesian models may specify more exchangeability than is warranted. We therefore investigate the effects of pooling in hierarchical Bayesian modeling of contingency tables from small areas. Specifically, this paper focuses on methods for direct or indirect pooling of categorical data collected on a contingency table in each area through Dirichlet priors. We compare the pooling effects of hierarchical Bayesian models by fitting simulated data. The analysis is carried out using Markov chain Monte Carlo methods.
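
The contrast between no pooling, complete pooling, and partial pooling through a shared Dirichlet prior can be illustrated in a few lines. The following is a minimal numpy sketch, not the paper's MCMC-fitted hierarchical model; the counts, the uniform prior, and the pseudo-sample weight m are all hypothetical:

```python
import numpy as np

# Hypothetical counts for 3 small areas over a 2x2 contingency table,
# flattened to 4 cells; sample sizes are deliberately small.
counts = np.array([[3, 1, 2, 0],
                   [1, 4, 0, 2],
                   [2, 2, 1, 1]])

alpha = np.ones(4)  # symmetric Dirichlet prior over the 4 cells

# No pooling: each area gets its own Dirichlet posterior mean.
no_pool = (counts + alpha) / (counts + alpha).sum(axis=1, keepdims=True)

# Complete (direct) pooling: all areas share one posterior.
pooled = (counts.sum(axis=0) + alpha) / (counts.sum() + alpha.sum())

# Partial pooling: shrink each area's estimate toward the pooled one,
# with weight set by the area's sample size n_i relative to a prior
# "pseudo-sample" m (a stand-in for what the hierarchical model learns).
m = 8.0
n = counts.sum(axis=1, keepdims=True)
partial = (n * no_pool + m * pooled) / (n + m)

print(np.round(no_pool, 3))
print(np.round(pooled, 3))
print(np.round(partial, 3))
```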

Pooling shrinkage estimator of reliability for exponential failure model using the sampling plan (n, C, T)

  • Al-Hemyari, Z.A.;Jehel, A.K.
    • International Journal of Reliability and Applications, v.12 no.1, pp.61-77, 2011
  • One of the most important problems in estimating the parameter of a failure model is the cost of experimental sampling units, which can be reduced by using any prior information available about $\theta$ and devising a two-stage pooling shrunken estimation procedure. We propose an estimator of the reliability function $R(t)$ of the exponential model using two-stage time-censored data when a prior value for the unknown parameter $\theta$ is available from the past. To compare the performance of the proposed estimator $\tilde{R}(t)$ with the classical estimator, computer-intensive calculations of the bias, mean squared error, relative efficiency, expected sample size, and percentage of the overall sample size saved were carried out for varying values of the constants involved in the proposed estimator.
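
The two-stage shrinkage idea can be sketched as follows, assuming exponential lifetimes with $R(t) = e^{-t/\theta}$. The acceptance region, shrinkage weight k, and all numbers below are illustrative stand-ins, not the estimator's actual constants:

```python
import numpy as np

rng = np.random.default_rng(1)

theta_true = 10.0   # true mean life (hypothetical)
theta0 = 9.0        # prior guess about theta from past experience
t = 5.0             # mission time for R(t) = exp(-t / theta)
k = 0.5             # shrinkage weight toward the prior guess (assumed)

# Stage 1: small initial sample of exponential lifetimes.
n1 = 8
x1 = rng.exponential(theta_true, n1)
theta_hat1 = x1.mean()

# Accept the prior guess if it falls inside a rough acceptance region
# around the stage-1 MLE (an illustrative region, not the paper's rule).
if 0.5 * theta_hat1 <= theta0 <= 2.0 * theta_hat1:
    theta_tilde = k * theta_hat1 + (1 - k) * theta0  # shrink toward theta0
else:
    # Stage 2: draw the remaining units and use the pooled MLE only.
    n2 = 12
    x2 = rng.exponential(theta_true, n2)
    theta_tilde = np.concatenate([x1, x2]).mean()

R_tilde = np.exp(-t / theta_tilde)   # shrunken reliability estimate
print(round(theta_tilde, 3), round(R_tilde, 3))
```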

Evaluation of a Sample-Pooling Technique in Estimating Bioavailability of a Compound for High-Throughput Lead Optimization (혈장 시료 풀링을 통한 신약 후보물질의 흡수율 고효율 검색기법의 평가)

  • Yi, In-Kyong;Kuh, Hyo-Jeong;Chung, Suk-Jae;Lee, Min-Haw;Shim, Chang-Koo
    • Journal of Pharmaceutical Investigation, v.30 no.3, pp.191-199, 2000
  • Genomics is providing targets faster than we can validate them, and combinatorial chemistry is providing new chemical entities faster than we can screen them. Historically, the drug discovery cascade was established as a sequential process initiated with a potency screen against a selected biological target. In this sequential process, pharmacokinetics was often regarded as a low-throughput activity. Typically, limited pharmacokinetic studies would be conducted prior to acceptance of a compound for safety evaluation, and, as a result, compounds often failed to reach clinical testing due to unfavorable pharmacokinetic characteristics. A new paradigm in drug discovery has emerged in which the entire sample collection is rapidly screened using robotized high-throughput assays at the outset of the program. Higher-throughput pharmacokinetics (HTPK) is being achieved through the introduction of new techniques, including automation for sample preparation and new experimental approaches. A number of in vitro and in vivo methods are being developed for HTPK. In vitro studies, in which many cell lines are used to screen absorption and metabolism, are generally faster than in vivo screening, and in this sense in vitro screening is often considered the real HTPK. Despite the elegance of the in vitro models, however, in vivo screening is always essential for final confirmation. Among the in vivo methods, the cassette dosing technique is believed to be applicable to screening the pharmacokinetics of many compounds at a time. The widespread use of liquid chromatography (LC) interfaced to mass spectrometry (MS) or tandem mass spectrometry (MS/MS) has made the cassette dosing technique feasible. Another approach to increasing the throughput of in vivo pharmacokinetic screening is to reduce the number of samples analyzed. Two common approaches are used for this purpose. First, samples from identical study designs that contain different drug candidates can be pooled to produce a single set of samples, reducing the number of samples to be analyzed. Second, for a single test compound, serial plasma samples can be pooled to produce a single composite sample for analysis. In this review, we examined whether the second method can be applied to practical screening of in vivo pharmacokinetics using data from seven of our previous bioequivalence studies. For a given drug, equally spaced serial plasma samples were pooled to obtain a 'pooled concentration' for the drug. An area under the plasma drug concentration-time curve (AUC) was then calculated from the pooled concentration, and the predicted AUC value was statistically compared with the traditionally calculated AUC value. The comparison revealed that the sample pooling method generated reasonably accurate AUC values compared with those obtained by the traditional approach. It is especially noteworthy that this accuracy was obtained from the analysis of only one sample instead of the many analyses that demand significant manpower and time. We therefore propose the sample pooling method as an alternative in vivo pharmacokinetic approach for selecting potential lead(s) from combinatorial libraries.
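
The pooling arithmetic rests on a simple identity: mixing equal aliquots of equally spaced samples yields the mean concentration, and the mean concentration times the sampling span approximates the trapezoidal AUC. A short sketch with a hypothetical one-compartment oral profile (all PK parameters assumed, units arbitrary):

```python
import numpy as np

# Hypothetical one-compartment oral PK profile.
ka, ke, dose = 1.2, 0.25, 100.0
t = np.arange(0.0, 12.5, 0.5)             # equally spaced sampling times
conc = dose * (np.exp(-ke * t) - np.exp(-ka * t)) * ka / (ka - ke)

# Reference AUC by the trapezoidal rule over all individual samples.
auc_trap = np.trapz(conc, t)

# Pooling: mixing an equal aliquot of every sample yields the mean
# concentration, so one assay times the sampling span approximates AUC.
c_pooled = conc.mean()
auc_pooled = c_pooled * (t[-1] - t[0])

print(round(auc_trap, 2), round(auc_pooled, 2))
```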

Revisiting Deep Learning Model for Image Quality Assessment: Is Strided Convolution Better than Pooling? (영상 화질 평가 딥러닝 모델 재검토: 스트라이드 컨볼루션이 풀링보다 좋은가?)

  • Uddin, AFM Shahab;Chung, TaeChoong;Bae, Sung-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference, 2020.11a, pp.29-32, 2020
  • Because the image acquisition process is never perfect, the introduction of noise is inevitable. As a result, objective image quality assessment (IQA) plays an important role in estimating the visual quality of noisy images. Plenty of IQA methods have been proposed, including traditional signal-processing-based methods as well as recent deep-learning-based methods, where the latter show promising performance due to their complex representation ability. The deep-learning-based methods consist of several convolution and downsampling layers for feature extraction and fully connected layers for regression. Usually, downsampling is performed by a max-pooling layer after each convolutional block. We show that this max-pooling causes information loss despite its acknowledged importance. Consequently, we propose a better IQA method that replaces the max-pooling layers with strided convolutions to downsample the feature space; since strided convolution layers have learnable parameters, they preserve informative features and discard redundant information, thereby improving prediction accuracy. The experimental results verify the effectiveness of the proposed method.
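
The proposed substitution is easy to state in code. A minimal PyTorch sketch, not the authors' exact IQA network, contrasting a conventional convolution + max-pooling block with a block that downsamples via a stride-2 convolution:

```python
import torch
import torch.nn as nn

# Conventional block: convolution followed by max-pooling.
pool_block = nn.Sequential(
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),       # fixed, parameter-free
)

# Alternative: a strided convolution does the downsampling with
# learnable weights, so the network can keep informative features.
stride_block = nn.Sequential(
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1),  # learnable
    nn.ReLU(inplace=True),
)

x = torch.randn(1, 32, 64, 64)
print(pool_block(x).shape, stride_block(x).shape)  # both halve H and W
```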

The Korea Cohort Consortium: The Future of Pooling Cohort Studies

  • Lee, Sangjun;Ko, Kwang-Pil;Lee, Jung Eun;Kim, Inah;Jee, Sun Ha;Shin, Aesun;Kweon, Sun-Seog;Shin, Min-Ho;Park, Sangmin;Ryu, Seungho;Yang, Sun Young;Choi, Seung Ho;Kim, Jeongseon;Yi, Sang-Wook;Kang, Daehee;Yoo, Keun-Young;Park, Sue K.
    • Journal of Preventive Medicine and Public Health, v.55 no.5, pp.464-474, 2022
  • Objectives: We introduced the cohort studies included in the Korea Cohort Consortium (KCC), focusing on large-scale cohort studies established in Korea with prolonged follow-up periods. We also provided projections of the follow-up period and estimates of the sample size that would be necessary for big-data analyses based on pooling established cohort studies, including population-based genomic studies. Methods: We mainly focused on the characteristics of the individual cohort studies in the KCC. We developed "PROFAN", a Shiny application for projecting the follow-up period needed to achieve a certain number of cases when pooling established cohort studies. As examples, we projected the follow-up periods required for 5000 cases of gastric cancer, 2500 cases of prostate and breast cancer, and 500 cases of non-Hodgkin lymphoma. The sample sizes for sequencing-based analyses based on a 1:1 case-control design were also calculated. Results: The KCC consisted of 8 individual cohort studies, of which 3 were community-based and 5 were health screening-based cohorts. The population-based cohort studies were mainly organized by Korean government agencies and research institutes. The projected follow-up period was at least 10 years to achieve 5000 cases from a cohort of 0.5 million participants. The minimum to maximum sample sizes for performing sequencing analyses averaged 5917 to 72,102. Conclusions: We propose an approach to establishing a large-scale consortium based on the standardization and harmonization of existing cohort studies to obtain adequate statistical power and a sufficient sample size to analyze high-risk groups or rare cancer subtypes.
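
The projection logic reduces to dividing the target case count by the expected cases per year. A back-of-the-envelope sketch in the spirit of PROFAN, not the tool itself; the constant incidence rate below is a hypothetical input:

```python
# Rough projection of the follow-up needed to accumulate a target
# number of cases from a pooled cohort of a given size.
def years_to_target(cases_needed, cohort_size, incidence_per_100k):
    cases_per_year = cohort_size * incidence_per_100k / 100_000
    return cases_needed / cases_per_year

# 5000 gastric cancer cases from a pooled cohort of 0.5 million,
# assuming a constant incidence of 100 per 100,000 person-years.
print(years_to_target(5000, 500_000, 100))  # -> 10.0 years
```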

Effective High-Throughput Blood Pooling Strategy before DNA Extraction for Detection of Malaria in Low-Transmission Settings

  • Nyunt, Myat Htut;Kyaw, Myat Phone;Thant, Kyaw Zin;Shein, Thinzer;Han, Soe Soe;Zaw, Ni Ni;Han, Jin-Hee;Lee, Seong-Kyun;Muh, Fauzi;Kim, Jung-Yeon;Cho, Shin-Hyeong;Lee, Sang-Eun;Yang, Eun-Jeong;Chang, Chulhun L.;Han, Eun-Taek
    • Parasites, Hosts and Diseases, v.54 no.3, pp.253-259, 2016
  • In the (pre-)elimination era, the prevalence of malaria has been decreasing in most previously endemic areas. An effective, validated, cost- and time-saving pooling strategy is therefore needed for the detection of malaria in low-transmission settings. In this study, the optimal pooling number and the lowest detection limit were assessed using systematically prepared samples of known density, followed by genomic DNA extraction and nested PCR. A pooling strategy of 10 samples per pool, 20 μl per sample, was optimal, and a parasite density as low as 2 parasites/μl was sufficient for detection of both falciparum and vivax infection. This pooling method proved effective for handling huge numbers of samples in low-transmission settings (<9% positive rate). The results indicate that pooling blood samples before DNA extraction, followed by the usual nested PCR, is useful and effective for detecting hidden malaria cases when screening in low-transmission settings.
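
Why pooling saves tests at low prevalence follows from the standard two-stage (Dorfman) argument. This is a textbook calculation rather than the paper's laboratory protocol:

```python
# Expected number of PCR tests per specimen under two-stage pooling:
# test each pool of k specimens, then retest the members of positive
# pools individually.
def tests_per_specimen(k, p):
    prob_pool_positive = 1 - (1 - p) ** k
    return 1 / k + prob_pool_positive

for p in (0.01, 0.05, 0.09):          # positive rates below ~9%
    print(p, round(tests_per_specimen(10, p), 3))  # all well under 1.0
```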

Species Diversity of a Stratified Hornbeam Community in Kwangneung Forest (광릉산림에 있어서 서나무군집의 층에 따른 종다양성에 관한 연구)

  • 이광석;장남기
    • Asian Journal of Turfgrass Science, v.9 no.2, pp.131-136, 1995
  • The herb, shrub, understory, and canopy strata, arbitrarily delineated by size classes, were sampled separately: the herb stratum by the pin-point quadrat method, and the remaining three by sized quadrats. The diversity ($H' = -\sum p_i \log p_i$) of each stratum was estimated for each set of census data. Species diversity within a stratum was independent of sample plot size above a minimum cumulative area. Diversity based on plotless and plot samples could be determined by the same equation and by pooling the data needed to estimate the diversity of each stratum.
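
A minimal implementation of the Shannon index used above; the species counts are hypothetical, and the log base (10 here) is an assumption, since the abstract does not state it:

```python
import numpy as np

def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * log p_i) from species counts."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]                      # empty categories contribute 0
    return -np.sum(p * np.log10(p))   # base-10 logs (assumed)

# Hypothetical counts for one stratum (e.g., five herb-layer species).
print(round(shannon_diversity([50, 30, 10, 7, 3]), 3))
```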

Sample Size Determination for the Estimation of Population Density of Marine Benthos on a Tidal Flat and a Subtidal Area, Korea

  • Koh, Chul-Hwan;Kang, Seong-Gil
    • Journal of the Korean Society of Oceanography, v.33 no.3, pp.113-122, 1998
  • The requisite numbers of sample replicates for population studies of soft-bottom benthos were estimated from survey data on the Songdo tidal flat and the subtidal zone of Youngil Bay, Korea. Large numbers of samples were taken: two hundred fifty 0.02 m² box-corer samples on the Songdo tidal flat and fifty 0.1 m² van Veen grab samples in Youngil Bay. The effect of sampler size on sampling effort was investigated by pooling the unit samples in pairs, fours, eights, etc. The requisite number of sample replicates ($n_r$) was determined from the sample variance ($s^2$) and mean ($m$) as $n_r = s^2/(P^2 m^2)$ at the $P = 0.2$ level, in which $s^2$ and $m$ were calculated from the counts of individuals collected. For example, seven replicates of the 0.02 m² corer for the intertidal fauna and two replicates of the 0.1 m² van Veen grab for the subtidal fauna were required to estimate the total density of the community. The smaller sampler was more efficient than the larger one when sampling costs were compared on the basis of total sampling area. The requisite number of sample replicates was also predicted ($\hat{n}_r$) by substituting $\hat{s}^2$ obtained from the regression of $s^2$ against $m$ using Taylor's power law ($\hat{s}^2 = am^b$). The regression line of the survey data on $s^2$ and $m$ plotted on a log scale was well fitted by Taylor's power law ($r^2 \geq 0.95$, $p < 0.001$) over the whole range of $m$. The exponent $b$, however, varied when estimated from $m$ categorized into classes by scale; the fitted exponent $b$ was large when both the density class and the sampler size were large. The number of sample replicates could therefore be estimated more reliably if the regression coefficients ($a$ and $b$) were calculated from sample variances and means categorized into density classes.
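
Both routes to the requisite replicate number, direct from replicate counts and via Taylor's power law, fit in a few lines. The counts and the coefficients a and b below are hypothetical, not the paper's fitted values:

```python
import numpy as np

def requisite_replicates(counts, P=0.2):
    """n_r = s^2 / (P^2 * m^2) from replicate counts of individuals."""
    m = np.mean(counts)
    s2 = np.var(counts, ddof=1)
    return s2 / (P ** 2 * m ** 2)

def predicted_replicates(m, a, b, P=0.2):
    """Same formula with s^2 replaced by Taylor's power law a * m^b."""
    return a * m ** b / (P ** 2 * m ** 2)

# Hypothetical box-corer counts from ten 0.02 m^2 replicates.
counts = [12, 7, 15, 9, 11, 6, 14, 8, 10, 13]
print(round(requisite_replicates(counts), 1))
print(round(predicted_replicates(np.mean(counts), a=1.5, b=1.4), 1))
```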

One-step deep learning-based method for pixel-level detection of fine cracks in steel girder images

  • Li, Zhihang;Huang, Mengqi;Ji, Pengxuan;Zhu, Huamei;Zhang, Qianbing
    • Smart Structures and Systems, v.29 no.1, pp.153-166, 2022
  • Identifying fine cracks in steel bridge facilities is a challenging task in structural health monitoring (SHM). This study proposed an end-to-end crack image segmentation framework based on a one-step Convolutional Neural Network (CNN) for pixel-level object recognition with high accuracy. To address the particular challenges of small-object detection against complex backgrounds, effort was put into loss function selection to handle sample imbalance and into module modification to improve generalization on complicated images. Specifically, loss functions were compared among alternatives including Binary Cross Entropy (BCE), Focal, Tversky, and Dice loss, the last three being specialized for biased sample distributions. Structural modifications with dilated convolution, Spatial Pyramid Pooling (SPP), and Feature Pyramid Network (FPN) were also performed to form a new backbone termed CrackDet. Models with various loss functions and feature extraction modules were trained on crack images and tested on full-scale images collected from steel box girders. The CNN model incorporating the classic U-Net as its backbone with Dice loss achieved the highest mean Intersection-over-Union (mIoU) of 0.7571 on full-scale pictures, while the best performance on cropped crack images was achieved by integrating CrackDet with Dice loss, at an mIoU of 0.7670.
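
For reference, a common soft Dice formulation for sparse binary masks such as crack pixels; this is a generic PyTorch sketch, and the paper's exact variant may differ:

```python
import torch

def dice_loss(logits, target, eps=1.0):
    """Soft Dice loss for binary segmentation; targets in {0, 1}."""
    prob = torch.sigmoid(logits).flatten(1)      # (batch, pixels)
    target = target.flatten(1).float()
    inter = (prob * target).sum(dim=1)
    denom = prob.sum(dim=1) + target.sum(dim=1)
    return (1 - (2 * inter + eps) / (denom + eps)).mean()

logits = torch.randn(2, 1, 64, 64)                # raw network output
target = (torch.rand(2, 1, 64, 64) > 0.99).long() # sparse "crack" pixels
print(dice_loss(logits, target).item())
```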

Cross platform classification of microarrays by rank comparison

  • Lee, Sunho
    • Journal of the Korean Data and Information Science Society, v.26 no.2, pp.475-486, 2015
  • Mining the microarray data accumulated in public data repositories can save experimental cost and time and provide valuable biomedical information. Big-data analysis pooling multiple data sets increases statistical power, improves the reliability of the results, and reduces the specific bias of individual studies. However, integrating data sets from different studies raises many problems that must be dealt with. In this study, I limit the focus to cross-platform classification, in which the platform of a test sample differs from the platform of the training set, and suggest a simple classification method based on ranks. This method is compared with diagonal linear discriminant analysis, the k-nearest-neighbor method, and the support vector machine using real cross-platform data sets for two cancers.
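
The rank idea in miniature: replace each sample's expression values by within-sample ranks so that platform-specific scales cancel, then assign a test sample the label of its most rank-correlated training sample. This 1-nearest-neighbor stand-in is an illustration, not the paper's exact rule; all data below are synthetic:

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

rng = np.random.default_rng(2)

train = rng.normal(size=(6, 100))      # 6 samples x 100 genes, platform A
labels = np.array([0, 0, 0, 1, 1, 1])
# Test sample: a rescaled, noisy copy of a class-1 sample, mimicking a
# measurement of the same biology on a different platform.
test = 2.0 * train[4] + rng.normal(scale=0.5, size=100)

# Within-sample ranks make the two platforms' scales comparable.
ranks_train = np.apply_along_axis(rankdata, 1, train)
ranks_test = rankdata(test)

corr = np.array([spearmanr(r, ranks_test)[0] for r in ranks_train])
print(labels[np.argmax(corr)])  # label of the most rank-correlated sample
```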