Genomic Selection for Adjacent Genetic Markers of Yorkshire Pigs Using Regularized Regression Approaches

  • Park, Minsu (Department of Statistics, Seoul National University)
  • Kim, Tae-Hun (Animal Genomics and Bioinformatics Division, National Institute of Animal Science, RDA)
  • Cho, Eun-Seok (Animal Genomics and Bioinformatics Division, National Institute of Animal Science, RDA)
  • Kim, Heebal (Department of Agricultural Biotechnology and Research Institute for Agriculture and Life Sciences, Seoul National University)
  • Oh, Hee-Seok (Department of Statistics, Seoul National University)
  • Received : 2014.04.02
  • Accepted : 2014.07.21
  • Published : 2014.12.01


This study considers the problem of genomic selection (GS) for adjacent genetic markers of Yorkshire pigs, which are typically correlated. GS has been widely used to efficiently estimate target variables such as molecular breeding values using markers across the entire genome. Recently, GS has been applied to animals as well as plants, and especially to pigs. For efficient selection of variables associated with specific traits in pig breeding, a variable selection method should have the following properties: i) it produces a simple model by identifying insignificant variables; ii) it improves the accuracy of prediction for future data; and iii) it can handle high-dimensional data in which the number of variables is larger than the number of observations. In this paper, we applied several variable selection methods, including the least absolute shrinkage and selection operator (LASSO), fused LASSO, and elastic net, to data consisting of 47K single nucleotide polymorphisms and litter size for 519 observed sows. In our experiments, the fused LASSO outperformed the other approaches.
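The three regularized regression methods named in the abstract differ mainly in the penalty they add to the least-squares loss. The following is a minimal sketch of those penalty terms in their standard textbook forms, not the authors' implementation. The function names, the tuning parameters `lam1` and `lam2`, and the toy coefficient vectors are all assumptions made for illustration.

```python
# Illustrative penalty terms only; lam1 and lam2 are hypothetical tuning
# parameters, not values taken from the study.

def lasso_penalty(beta, lam1):
    """L1 penalty: lam1 * sum_j |beta_j|."""
    return lam1 * sum(abs(b) for b in beta)


def elastic_net_penalty(beta, lam1, lam2):
    """L1 plus squared L2 penalty: lam1 * sum|beta_j| + lam2 * sum beta_j^2."""
    return lam1 * sum(abs(b) for b in beta) + lam2 * sum(b * b for b in beta)


def fused_lasso_penalty(beta, lam1, lam2):
    """L1 penalty plus lam2 * sum_j |beta_j - beta_{j-1}| over adjacent markers."""
    l1 = sum(abs(b) for b in beta)
    fusion = sum(abs(b - a) for a, b in zip(beta, beta[1:]))
    return lam1 * l1 + lam2 * fusion


# Two made-up coefficient vectors with the same L1 norm and the same L2 norm.
smooth = [0.0, 0.5, 0.5, 0.5, 0.0]  # neighbouring markers share an effect
jagged = [0.5, 0.0, 0.5, 0.0, 0.5]  # effects alternate between neighbours

for name, beta in [("smooth", smooth), ("jagged", jagged)]:
    print(
        name,
        lasso_penalty(beta, 1.0),
        elastic_net_penalty(beta, 1.0, 1.0),
        fused_lasso_penalty(beta, 1.0, 1.0),
    )
```

In this toy case the LASSO and elastic net penalties cannot tell the two vectors apart, while the fused LASSO penalty is smaller for the vector whose neighbouring coefficients are equal. That ordering of coefficients along the genome is what makes the fused penalty a natural candidate when adjacent markers are correlated.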


Genomic Selection;Pig;Litter Size;Single Nucleotide Polymorphism;Regularized Regression


Supported by : National Research Foundation of Korea (NRF), Rural Development Administration


