- Volume 27 Issue 2
Ranking subjects based on paired compositional data with application to age-related hearing loss subtyping
- Nam, Jin Hyun (Department of Public Health Sciences, Medical University of South Carolina)
- Khatiwada, Aastha (Department of Public Health Sciences, Medical University of South Carolina)
- Matthews, Lois J. (Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina)
- Schulte, Bradley A. (Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina)
- Dubno, Judy R. (Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina)
- Chung, Dongjun (Department of Public Health Sciences, Medical University of South Carolina)
- Received : 2019.10.19
- Accepted : 2019.12.20
- Published : 2020.03.31
Analysis approaches for single compositional data are well established; however, effective analysis strategies for paired compositional data remain to be investigated. The current project was motivated by studies of age-related hearing loss (presbyacusis), in which subjects are classified into four audiometric phenotypes and need to be ranked within each phenotype based on their paired compositional data. We address this challenge by formulating it as a classification problem and integrating a penalized multinomial logistic regression model with compositional data analysis approaches. We use the Elastic Net as the penalty function and consider the average, absolute difference, and perturbation operators for compositional data. We applied the proposed approach to a presbyacusis study of 532 subjects, in which the data are the probabilities that each ear of a subject belongs to each of the four presbyacusis subtypes. We further examined the ranking of presbyacusis subjects produced by the proposed approach in light of previous literature. The data analysis results indicate that the proposed approach is effective for ranking subjects based on paired compositional data.
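To make the three operators and the penalty concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not give exact definitions, so the code assumes commonly used ones: a component-wise mean for the average, a component-wise absolute difference (left unnormalized here), and the Aitchison perturbation (component-wise product rescaled to sum to 1). The Elastic Net penalty uses a glmnet-style parameterization with mixing weight `alpha`, which is also an assumption. All function names and the example ear probabilities are hypothetical.

```python
from math import isclose


def closure(parts):
    """Rescale non-negative parts so that they sum to 1."""
    total = sum(parts)
    if total <= 0:
        raise ValueError("composition must have a positive total")
    return [p / total for p in parts]


def average(left, right):
    """Component-wise mean of two compositions (still sums to 1)."""
    return [(l + r) / 2 for l, r in zip(left, right)]


def absolute_difference(left, right):
    """Component-wise absolute difference (assumed unnormalized)."""
    return [abs(l - r) for l, r in zip(left, right)]


def perturbation(left, right):
    """Aitchison perturbation: closed component-wise product."""
    return closure([l * r for l, r in zip(left, right)])


def elastic_net_penalty(coefs, lam, alpha):
    """Assumed glmnet-style penalty: lam * (alpha * L1 + (1 - alpha) / 2 * L2^2)."""
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1 - alpha) / 2 * l2_squared)


# Hypothetical subtype probabilities for the two ears of one subject
# (four presbyacusis subtypes, each vector sums to 1).
left_ear = [0.70, 0.10, 0.10, 0.10]
right_ear = [0.40, 0.30, 0.20, 0.10]

# One possible feature vector for a classifier: concatenate the three summaries.
features = (
    average(left_ear, right_ear)
    + absolute_difference(left_ear, right_ear)
    + perturbation(left_ear, right_ear)
)
```

In a full analysis, a feature vector of this kind (or one operator at a time) would be passed to a penalized multinomial logistic regression, and the fitted class probabilities could then be used to rank subjects within a phenotype. How the paper combines or selects among the operators is not stated in the abstract.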
Supported by : National Institute on Deafness and Other Communication Disorders, National Institute of General Medical Sciences, National Cancer Institute, National Institute on Drug Abuse, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Center for Advancing Translational Sciences