• Title/Summary/Keyword: Small data

Bayesian pooling for contingency tables from small areas

  • Jo, Aejung;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society / v.27 no.6 / pp.1621-1629 / 2016
  • This paper studies Bayesian pooling for the analysis of categorical data from small areas. Many surveys consist of categorical data collected on a contingency table in each area. Statistical inference for small areas requires considerable care because the subpopulation sample sizes are usually very small. Typically, a hierarchical Bayesian model is used to pool subpopulation data. However, the customary hierarchical Bayesian models may specify more exchangeability than is warranted. We therefore investigate the effects of pooling in hierarchical Bayesian modeling of contingency tables from small areas. Specifically, this paper focuses on methods for direct or indirect pooling, through Dirichlet priors, of categorical data collected on a contingency table in each area. We compare the pooling effects of the hierarchical Bayesian models by fitting them to simulated data. The analysis is carried out using Markov chain Monte Carlo methods.
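
The pooling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' hierarchical model: it assumes a single fixed prior strength `tau` (a hypothetical parameter) and a Dirichlet prior centred on the pooled cell proportions, and it uses the closed-form Dirichlet-multinomial posterior mean instead of MCMC. The function name and the example counts are illustrative only.

```python
from typing import List


def pooled_cell_estimates(counts: List[List[int]], tau: float) -> List[List[float]]:
    """Posterior-mean cell probabilities for each area.

    Assumption (not from the paper): every area shares the prior
    Dirichlet(tau * pooled), where `pooled` is the vector of cell
    proportions over all areas combined. With multinomial counts the
    posterior mean of cell k in area i is
    (n_ik + tau * pooled_k) / (n_i + tau).
    """
    n_cells = len(counts[0])
    grand_total = sum(sum(row) for row in counts)
    pooled = [sum(row[k] for row in counts) / grand_total for k in range(n_cells)]

    estimates = []
    for row in counts:
        n_i = sum(row)
        estimates.append(
            [(row[k] + tau * pooled[k]) / (n_i + tau) for k in range(n_cells)]
        )
    return estimates


# Two areas with a flattened 2x2 table each: one tiny sample, one larger.
areas = [[3, 1, 0, 2], [40, 25, 15, 20]]
est = pooled_cell_estimates(areas, tau=10.0)
```

With `tau = 0` the estimates reduce to the direct cell proportions of each area; larger values of `tau` pull the small area more strongly toward the pooled table, which is the exchangeability trade-off the abstract refers to.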

End-to-End Delay Analysis of a Dynamic Mobile Data Traffic Offload Scheme using Small-cells in HetNets

  • Kim, Se-Jin
    • Journal of Internet Computing and Services / v.22 no.5 / pp.9-16 / 2021
  • The traffic volume of mobile communications has recently been increasing rapidly, and small cells are one solution: they use two offload schemes, local IP access (LIPA) and selected IP traffic offload (SIPTO), to reduce the end-to-end delay and the amount of mobile data traffic in the core network (CN). However, 3GPP describes only the concepts of LIPA and SIPTO, and there is no algorithm for deciding the path from source nodes (SNs) to destination nodes (DNs). Therefore, this paper proposes a dynamic mobile data traffic offload scheme using small cells that decides the path based on the SN and DN (macro user equipment, small-cell user equipment (SUE), or multimedia server) and on the type of mobile data traffic (real-time or non-real-time). Analytical models show that the proposed offload scheme outperforms the conventional small-cell network in terms of the end-to-end delay of mobile data communications and the probability of mobile data traffic in the CN for heterogeneous networks.

Overview of Reliability Rank Measures for Small Sample (소표본인 경우 신뢰성 순위 척도의 고찰)

  • Choi, Sung-Woon
    • Journal of the Korea Safety Management & Science / v.9 no.2 / pp.161-169 / 2007
  • This paper presents three methods for expressing reliability measures for large and small data sets. The first method is parametric estimation of cardinal reliability measure data for large samples, which requires numerous samples. The second is nonparametric distribution classification of ordinal reliability measure data for small samples; however, this method is difficult for field users to understand. The last method is parametric estimation of ordinal reliability measure data for small samples. Because this method requires only a small sample and is comprehensive, we recommend it among the proposed methods. Various reliability rank measures are also presented.

Small Domain Estimation of the Proportion Using Survey Weights

  • Kim, Dal-Ho
    • Journal of the Korean Data and Information Science Society / v.18 no.4 / pp.1179-1189 / 2007
  • In this paper, we estimate the proportion of individuals having health insurance in a given year for several small domains cross-classified by age, sex, and other demographic characteristics, using data provided by the National Center for Health Statistics (NCHS). We employ Bayesian as well as frequentist methodology to obtain small domain estimates and the associated measures of precision. One of the new features of our study is that we use the survey weights along with the model to derive the small domain estimates.

Derivation and Validation of Aerodynamic Parameters of Small Airplanes Using Design Software and Subjective Tests (설계용 S/W를 활용한 소형비행기의 비행특성 매개변수 추출과 주관적 시험평가방식에 관한 연구)

  • 이숙경;공지영;최유환;윤석준
    • Proceedings of the Korea Society for Simulation Conference / 2004.05a / pp.142-147 / 2004
  • It is very difficult to acquire high-fidelity flight test data for small airplanes such as typical unmanned aerial vehicles, because the small MEMS-type sensors used in the tests generally do not provide reliable data. Moreover, it is not practical to conduct expensive flight tests for low-cost small airplanes in order to simulate their flight characteristics. This study proposes a practical approach to obtaining acceptable flight data, including stability and control derivatives and weight and balance data. Aircraft design software such as Darcorp's AAA is used to generate aerodynamic data for small airplanes, and moments of inertia are calculated using CATIA, a structural design software package. The flight data from the simulation software are evaluated subjectively and tailored through simulated flights by experienced pilots, based on the certified procedures in FAA AC 120-45A and 40B, which are used for manned airplane simulators.

Designing an Automated Production Information Platform for Small and Medium-sized Businesses (중소기업의 자동화 생산 정보 플랫폼 구축 모델 설계)

  • Jeong, Yoon-Su;Kim, Yong-Tae;Park, Gil-Cheol
    • Journal of Convergence for Information Technology / v.9 no.1 / pp.116-122 / 2019
  • In recent years, small and medium-sized businesses have been shifting rapidly to an industrial structure in which process, quality, and energy data can be aggregated automatically or in real time to achieve global competitiveness. In particular, real-time analysis of the information produced in the production processes of small businesses is evolving into a new process that analyzes, predicts, prescribes, and implements significant performance of small businesses. In this paper, we propose a platform-building model that can transform the automated production information system of small businesses into big data so that they can upgrade the data they generate. The proposed model can support the operational efficiency (consulting and training) and strategic decision making of small businesses by using a variety of data on the basic information of the products they produce, collected by smart SMEs. In addition, the proposed model is characterized by close cooperation between small and medium-sized businesses with different regional characteristics and areas of information sharing and system linkage.

A Bayesian model for two-way contingency tables with nonignorable nonresponse from small areas

  • Woo, Namkyo;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society / v.27 no.1 / pp.245-254 / 2016
  • Many surveys provide categorical data, and there may be one or more missing categories. We describe a nonignorable nonresponse model for the analysis of two-way contingency tables from small areas. There are both item and unit nonresponse. One approach to analyzing these data is to construct several tables corresponding to the missing categories. We describe a hierarchical Bayesian model to analyze two-way categorical data from different areas. This allows a "borrowing of strength" from the data of larger areas to improve the reliability of the estimates of the model parameters corresponding to the small areas. We also use a nonignorable nonresponse model with Bayesian uncertainty analysis, placing priors on the nonidentifiable parameters instead of performing a sensitivity analysis for them. We use the griddy Gibbs sampler to fit our models and compute DIC and BPP for model diagnostics. We illustrate our method using NHANES III data on thirteen states to obtain the finite population proportions.

Estimating small area proportions with kernel logistic regressions models

  • Shim, Jooyong;Hwang, Changha
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.941-949 / 2014
  • The unit-level logistic regression model with mixed effects has been used for estimating small area proportions; it treats the spatial effects as random effects and assumes linearity between the logistic link and the covariates. However, when the functional form of the relationship between the logistic link and the covariates is not linear, it may lead to biased estimators of the small area proportions. In this paper, we relax the linearity assumption and propose two types of kernel-based logistic regression models for estimating small area proportions. We also demonstrate the efficiency of our proposed models using simulated data and real data.

A Study on Korean Sentiment Analysis Rate Using Neural Network and Ensemble Combination

  • Sim, YuJeong;Moon, Seok-Jae;Lee, Jong-Youg
    • International Journal of Advanced Culture Technology / v.9 no.4 / pp.268-273 / 2021
  • In this paper, we propose a sentiment analysis model that improves performance on small-scale data and verify it through experiments. To this end, we propose Bagging-Bi-GRU, which combines two components: Bi-GRU, which trains a GRU (a variant of LSTM (Long Short-Term Memory) with excellent performance on sequential data) in both directions, and bagging, one of the ensemble learning methods. To verify the performance of the proposed model, it is applied to both small-scale and large-scale data. A comparison with the existing machine learning algorithm, Bi-GRU, shows that the performance of the proposed model is improved not only for small data but also for large data.

Small Area Estimation of Unemployment Rate for the Economically Active Population Survey

  • Kim, Young-Won;Jo, Ran
    • Journal of the Korean Data and Information Science Society / v.15 no.1 / pp.1-10 / 2004
  • In the Korean Economically Active Population Survey (EAPS), the sample sizes for small areas are typically too small to provide reliable estimators, because the EAPS was designed to produce unemployment statistics for large areas such as Metropolitan Cities and Provinces. In this study, we consider synthetic and composite estimators for the unemployment rate of small areas and apply them to real data on Choongbook province from the Korean EAPS of December 2000. The mean square errors of these estimators were estimated by the jackknife method, and the efficiencies of the small area estimators were evaluated in terms of relative standard errors and relative root mean square errors. The results show that the composite estimator is much more efficient than the other estimators and can produce reliable estimates of the unemployment rate of small areas under the current EAPS system.
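
The two estimators named in this abstract can be sketched in their textbook form. This is an illustration under stated assumptions, not the authors' exact specification: the synthetic estimator applies large-area rates by demographic group to the small area's group population counts, and the composite estimator is a weighted average of the direct and synthetic estimates. The function names, the weight, and all numbers below are hypothetical.

```python
from typing import Sequence


def synthetic_rate(group_pop: Sequence[float], group_rates: Sequence[float]) -> float:
    """Synthetic estimate: large-area group rates weighted by the
    small area's population in each group (assumed textbook form)."""
    total = sum(group_pop)
    return sum(n * r for n, r in zip(group_pop, group_rates)) / total


def composite_rate(direct: float, synthetic: float, weight: float) -> float:
    """Composite estimate: weight * direct + (1 - weight) * synthetic.

    How `weight` is chosen is left open here; it is an input,
    not something derived from the abstract.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * direct + (1.0 - weight) * synthetic


# Hypothetical small area with two demographic groups.
syn = synthetic_rate(group_pop=[200, 300], group_rates=[0.05, 0.03])
comp = composite_rate(direct=0.06, synthetic=syn, weight=0.25)
```

A weight near 1 trusts the noisy direct estimate, while a weight near 0 trusts the potentially biased synthetic estimate; the composite estimator trades one source of error against the other.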
