• Title/Summary/Keyword: Akaike's information criterion

Robust varying coefficient model using L1 regularization

  • Hwang, Changha;Bae, Jongsik;Shim, Jooyong
    • Journal of the Korean Data and Information Science Society / v.27 no.4 / pp.1059-1066 / 2016
  • In this paper we propose a robust version of the varying coefficient model, based on regularized regression with L1 regularization. We use the iteratively reweighted least squares procedure to minimize the L1-regularized objective function of the varying coefficient model in locally weighted regression form. This provides efficient computation of the coefficient function estimates and variable selection for a given value of the smoothing variable. We present the generalized cross validation function and an Akaike information type criterion for model selection. Applications of the proposed model are illustrated through artificial examples and a real example of predicting the effect of the input variables and the smoothing variable on the output.
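
The iteratively reweighted least squares idea mentioned in the abstract can be sketched for a plain linear model. This is a minimal illustration and not the authors' algorithm: it omits the locally weighted (varying coefficient) part, and the function name `irls_l1`, the penalty weight `lam`, the smoothing constant `eps`, and the simulated data are all assumptions made for the example. Each pass replaces the penalty |b_j| with the quadratic surrogate b_j^2 / (2|b_j_old|), so every step is a ridge-type solve.

```python
import numpy as np

def irls_l1(X, y, lam, n_iter=50, eps=1e-8):
    """Approximately minimize 0.5*||y - X b||^2 + lam*sum(|b_j|) by IRLS.

    Hypothetical helper: each iteration solves a weighted ridge problem whose
    weights lam / (|b_j| + eps) come from the previous estimate.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary least squares
    for _ in range(n_iter):
        weights = lam / (np.abs(beta) + eps)     # large weight for small coefficients
        beta = np.linalg.solve(X.T @ X + np.diag(weights), X.T @ y)
    return beta

# Simulated data (an assumption for illustration): two of four inputs are irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
beta_true = np.array([2.0, 0.0, -1.5, 0.0])
y = X @ beta_true + rng.normal(scale=0.1, size=100)

beta_hat = irls_l1(X, y, lam=5.0)
```

In this sketch the coefficients of the irrelevant inputs are driven toward zero, which is the variable selection effect the abstract refers to; the relevant coefficients are shrunk only slightly.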

Minimum Message Length and Classical Methods for Model Selection in Univariate Polynomial Regression

  • Viswanathan, Murlikrishna;Yang, Young-Kyu;WhangBo, Taeg-Keun
    • ETRI Journal / v.27 no.6 / pp.747-758 / 2005
  • The problem of selection among competing models has been a fundamental issue in statistical data analysis. Good fits to data can be misleading, since they can result from properties of the model that have nothing to do with its being a close approximation to the source distribution of interest (for example, overfitting). In this study we focus on the preference among models from a family of polynomial regressors. Three decades of research have spawned a number of plausible techniques for model selection, namely Akaike's Final Prediction Error (FPE) and Information Criterion (AIC), Schwarz's criterion (SCH), Generalized Cross Validation (GCV), Wallace's Minimum Message Length (MML), Minimum Description Length (MDL), and Vapnik's Structural Risk Minimization (SRM). The fundamental similarity between all these principles is their attempt to define an appropriate balance between the complexity of models and their ability to explain the data. This paper presents an empirical study of the above principles in the context of model selection, where the models under consideration are univariate polynomials. The paper includes a detailed empirical evaluation of the model selection methods on six target functions, with varying sample sizes and added Gaussian noise. The results of the study appear to provide strong evidence in support of the MML- and SRM-based methods over the other standard approaches (FPE, AIC, SCH, and GCV).
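
As a concrete illustration of the complexity-versus-fit trade-off described above, the following sketch scores univariate polynomial fits with AIC under a Gaussian error assumption. It is not the experimental setup of the paper: the target function, noise level, sample size, and the helper names `gaussian_aic` and `select_degree` are assumptions chosen for the example, and the AIC is computed only up to an additive constant.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def gaussian_aic(y, y_hat, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    n = len(y)
    rss = float(np.sum((y - y_hat) ** 2))
    return n * np.log(rss / n) + 2 * n_params

def select_degree(x, y, max_degree):
    """Fit polynomials of degree 0..max_degree and return (best degree, all scores)."""
    scores = {}
    for degree in range(max_degree + 1):
        coeffs = P.polyfit(x, y, degree)   # coefficients ordered low to high
        y_hat = P.polyval(x, coeffs)
        # degree + 1 coefficients plus one noise-variance parameter
        scores[degree] = gaussian_aic(y, y_hat, degree + 2)
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical target: a quadratic with added Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(0.0, 0.1, x.size)

best_degree, aic_scores = select_degree(x, y, max_degree=8)
```

Underfitted degrees (0 and 1) receive much larger AIC values here because their residual sum of squares is large; among degrees 2 and above, the 2-per-parameter penalty is what discourages higher orders.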

Random Regression Models Are Suitable to Substitute the Traditional 305-Day Lactation Model in Genetic Evaluations of Holstein Cattle in Brazil

  • Padilha, Alessandro Haiduck;Cobuci, Jaime Araujo;Costa, Claudio Napolis;Neto, Jose Braccini
    • Asian-Australasian Journal of Animal Sciences / v.29 no.6 / pp.759-767 / 2016
  • The aim of this study was to compare two random regression models (RRM) fitted with fourth-order ($RRM_4$) and fifth-order ($RRM_5$) Legendre polynomials with a lactation model (LM) for evaluating Holstein cattle in Brazil. Two datasets with the same animals were prepared for this study. For the test-day RRMs and the LM, 262,426 test-day records and 30,228 lactation records covering 305 days were prepared, respectively. The lowest values of Akaike's information criterion, the Bayesian information criterion, and the estimates of the maximum of the likelihood function (-2LogL) were for $RRM_4$. Heritability for 305-day milk yield (305MY) was 0.23 ($RRM_4$), 0.24 ($RRM_5$), and 0.21 (LM). Across days in milk, test-day heritability, additive genetic variance, and permanent environmental variance ranged from 0.16 to 0.27, from 3.76 to 6.88, and from 11.12 to 20.21, respectively. Additive genetic correlations between test days ranged from 0.20 to 0.99. Permanent environmental correlations between test days were between 0.07 and 0.99. Standard deviations of average estimated breeding values (EBVs) for 305MY from $RRM_4$ and $RRM_5$ were 11% to 30% higher for bulls and around 28% higher for cows than those from the LM. Rank correlations between RRM EBVs and LM EBVs were between 0.86 and 0.96 for bulls and between 0.80 and 0.87 for cows. The average gain in reliability of EBVs for 305-day yield was 4% to 17% for bulls and 23% to 24% for cows when the reliability of EBVs from the RRMs was compared with that from the LM. The random regression model fitted with fourth-order Legendre polynomials is recommended for genetic evaluations of Brazilian Holstein cattle because of its higher reliability in the estimation of breeding values.

Novel nomogram-based integrated gonadotropin therapy individualization in in vitro fertilization/intracytoplasmic sperm injection: A modeling approach

  • Ebid, Abdel Hameed IM;Motaleb, Sara M Abdel;Mostafa, Mahmoud I;Soliman, Mahmoud MA
    • Clinical and Experimental Reproductive Medicine / v.48 no.2 / pp.163-173 / 2021
  • Objective: This study aimed to characterize a validated model for predicting oocyte retrieval in controlled ovarian stimulation (COS) and to construct model-based nomograms for assistance in clinical decision-making regarding the gonadotropin protocol and dose. Methods: This observational, retrospective cohort study included 636 women with primary unexplained infertility and a normal menstrual cycle who were attempting assisted reproductive therapy for the first time. The enrolled women were split into an index group (n=497) for model building and a validation group (n=139). The primary outcome was absolute oocyte count. The dose-response relationship was tested using modified Poisson, negative binomial, hybrid Poisson-Emax, and linear models. The validation group was similarly analyzed, and its results were compared to those of the index group. Results: The Poisson model with the log-link function demonstrated superior predictive performance and precision (Akaike information criterion, 2,704; λ=8.27; relative standard error (λ)=2.02%). The covariate analysis included women's age (p<0.001), antral follicle count (p<0.001), basal follicle-stimulating hormone level (p<0.001), gonadotropin dose (p=0.042), and protocol type (p=0.002 and p<0.001 for short and antagonist protocols, respectively). The estimates from 500 bootstrap samples were close to those of the original model. The validation group showed model assessment metrics comparable to those of the index model. Based on the fitted model, a static nomogram was built to improve visualization. In addition, a dynamic electronic tool was created for convenience of use. Conclusion: Based on our validated model, nomograms were constructed to help clinicians individualize the stimulation protocol and gonadotropin doses in COS cycles.

Ecological Evaluation on the Biomass of Macrobenthic Communities Observed from a Planned Offshore Wind Farm Area, West Coast of Korea (서해 해상풍력단지 조성 예정해역의 대형저서동물 군집 생체량에 대한 생태학적 평가)

  • Jeong, Su-Young;Lee, Chae-Lin;Gim, Seong-Hyun;Kim, Sungtae;Myoung, Jung-Goo;Oh, Sung-Yong;Park, Jin Woo;Jin, Sung-Joo;Yoo, Jae-Won
    • Ocean and Polar Research / v.41 no.4 / pp.311-318 / 2019
  • We analyzed preliminary survey data (2014-2016) on macrobenthic community biomass (n = 112) from the wind farm area located in the southern part of the west coast of Korea and compared them with data from the entire west coast (n = 369; 2006-2008). The modal class of the frequency distribution was 6 times higher in the latter (5 vs. 32 g/m²). The mean and median values of the latter were 1.3 and 1.7 times higher (mean, 20.7 vs. 27.8 g/m²; median, 17.1 vs. 29.5 g/m²), and the maximum value was 3.4 times higher. Mood's median test showed a significant difference at p-value = 0.01. We estimated the biomass-to-depth relationship for each data set using the Akaike Information Criterion and regarded non-overlap of the 95% confidence intervals as indicating a significant difference. The biomass differed from a depth of 10 m downward, and at around 20 m it was 3 times higher on the west coast than at the maximum depth of the wind farm area. A local event of catastrophic sedimentation ranging from 1 to 2 m was observed in the wind farm area during winter surveys. This could be a probable source of the lower biomass, but information on biomass seasonality and a natural experimental approach seem to be needed in further studies. This study is meaningful in that it provides the background for assessing future changes by establishing the lower level of benthic productivity in the area. We expect this study to contribute to the preparation of measures that can remove or mitigate the source of the lower biomass and improve the productivity of fishery resources in the area.