Analysis of Missing Data Using an Empirical Bayesian Method
 Title & Authors
Analysis of Missing Data Using an Empirical Bayesian Method
Yoon, Yong Hwa; Choi, Boseung
 Abstract
Proper imputation of missing data is an important step toward obtaining better results from analyses based on survey data. This paper deals with both a model-based imputation method and a model estimation method. We used a Bayesian method to resolve the boundary solution problem that arises when maximum likelihood estimation is applied. We also address the problem of selecting a missing-data mechanism model by comparing forecasting results and model accuracies. We used the MWPE (modified within precinct error) (Bautista et al., 2007) to measure prediction accuracy. We applied the proposed ML and Bayesian methods to exit poll data from the 2012 Korean presidential election. In this analysis, predictions under the missing at random mechanism were more accurate than those under the missing not at random mechanism.
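As a rough illustration of model-based imputation under the missing at random mechanism, the sketch below runs the EM algorithm on a two-way contingency table in which the column variable is sometimes missing. It is not the model used in the paper: the function name `em_mar_two_way`, the table layout, and the counts are hypothetical, and missingness is assumed to depend only on the fully observed row variable.

```python
def em_mar_two_way(observed, missing_y, n_iter=200, tol=1e-10):
    """EM estimate of cell probabilities for a two-way table.

    observed[i][j]: count with X=i and Y=j both observed.
    missing_y[i]:   count with X=i observed but Y missing.
    Assumes missingness depends only on X (missing at random).
    """
    rows, cols = len(observed), len(observed[0])
    total = sum(map(sum, observed)) + sum(missing_y)
    # Start from a uniform table.
    pi = [[1.0 / (rows * cols)] * cols for _ in range(rows)]
    for _ in range(n_iter):
        new_pi = []
        for i in range(rows):
            row_prob = sum(pi[i])
            # E-step: split the Y-missing count across columns by P(Y=j | X=i).
            filled = [observed[i][j] + missing_y[i] * pi[i][j] / row_prob
                      for j in range(cols)]
            # M-step: completed counts divided by the total sample size.
            new_pi.append([c / total for c in filled])
        change = max(abs(new_pi[i][j] - pi[i][j])
                     for i in range(rows) for j in range(cols))
        pi = new_pi
        if change < tol:
            break
    return pi


# Made-up counts: rows are respondent groups, columns are two candidates.
exit_poll = [[30, 10], [20, 20]]
refusals = [10, 20]  # respondents in each group who did not report a vote
pi_hat = em_mar_two_way(exit_poll, refusals)
```

Under this assumption the estimates approach the row margin from all respondents multiplied by the conditional vote share among those who answered. A missing not at random model would let missingness depend on the unobserved vote itself, which is the setting in which boundary solutions can arise.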
 Keywords
Missing data; non-response; empirical Bayesian; EM algorithm
 Language
Korean
 Cited by
1.
An estimation method for non-response model using Monte-Carlo expectation-maximization algorithm, Journal of the Korean Data and Information Science Society, 2016, 27, 3, 587
 References
1.
Agresti, A. (2002). Categorical Data Analysis, Second edition, John Wiley & Sons Inc., New Jersey.

2.
Baker, S. G. and Laird, N. M. (1988). Regression analysis for categorical variables with outcome subject to nonignorable nonresponse, Journal of the American Statistical Association, 83, 62-69.

3.
Baker, S. G., Rosenberger, W. F. and DerSimonian, R. (1992). Closed-form estimates for missing counts in two-way contingency tables, Statistics in Medicine, 11, 643-657.

4.
Bautista, R., Callegaro, M., Vera, J. A. and Abundis, F. (2007). Studying nonresponse in Mexican exit polls, International Journal of Public Opinion Research, 19, 492-503.

5.
Chib, S. (1995). Marginal likelihood from the Gibbs output, Journal of the American Statistical Association, 90, 1313-1321.

6.
Chib, S. and Jeliazkov, I. (2001). Marginal likelihood from the Metropolis-Hastings output, Journal of the American Statistical Association, 96, 270-281.

7.
Choi, B., Choi, J. W. and Park, Y. S. (2009). Bayesian methods for an incomplete two-way contingency table with application to the Ohio (Buckeye State) Polls, Survey Methodology, 35, 37-51.

8.
Choi, B. and Kim, G. M. (2012). A model selection method using EM algorithm for missing data, Journal of the Korean Data Analysis Society, 14, 767-779.

9.
Choi, B., Kim, D. Y., Kim, K. W. and Park, Y. S. (2008). Nonignorable nonresponse imputation and rotation group bias estimation on the rotation sample survey, The Korean Journal of Applied Statistics, 21, 361-375.

10.
Choi, B., Park, Y. S. and Lee, D. H. (2007). Election forecasting using pre-election survey data with nonignorable nonresponse, Journal of the Korean Data Analysis Society, 9, 2321-2333.

11.
Clarke, P. S. (2002). On boundary solutions and identifiability in categorical regression with non-ignorable non-response, Biometrical Journal, 44, 701-717.

12.
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, 39, 1-38.

13.
Forster, J. J. and Smith, P. W. (1998). Model-based inference for categorical survey data subject to nonignorable non-response, Journal of the Royal Statistical Society, Series B, 60, 57-70.

14.
Green, P. E. and Park, T. (2003). A Bayesian hierarchical model for categorical data with nonignorable nonresponse, Biometrics, 59, 886-896.

15.
Ibrahim, J. G., Zhu, H. and Tang, N. (2008). Model selection criteria for missing-data problems using the EM algorithm, Journal of the American Statistical Association, 103, 1648-1658.

16.
Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data, Second edition, Wiley, New York.

17.
Kim, S. Y. and Kwon, S. P. (2009). The effect of survey refusal and noncontact on nonresponse error: For economically active population survey, The Korean Journal of Applied Statistics, 22, 667-676.

18.
Kim, Y. W. and Nam, S. J. (2009). Forming weighting adjustment cells for unit-nonresponse in sample surveys, Communications for Statistical Applications and Methods, 16, 103-113.

19.
Kwak, J. and Choi, B. (2014). A comparison study for accuracy of exit poll based on nonresponse model, Journal of the Korean Data & Information Science Society, 25, 53-64.

20.
Pak, G. D. and Shin, K. I. (2010). Non-response imputation for panel data, Communications for Statistical Applications and Methods, 17, 899-907.

21.
Park, J. S., Kang, C. and Kim, K. K. (2013). A simulation study of imputation methods for transportation corporation's survey data, Journal of the Korean Data Analysis Society, 15, 1903-1912.

22.
Park, T. and Brown, M. B. (1994). Models for categorical data with nonignorable nonresponse, Journal of the American Statistical Association, 89, 44-52.

23.
Park, T. (1998). An approach to categorical data with nonignorable nonresponse, Biometrics, 54, 1579-1590.

24.
Park, T. S. and Lee, S. Y. (1998). Analysis of categorical data with nonresponses, The Korean Journal of Applied Statistics, 11, 83-95.

25.
Park, Y. S., Kim, K. H. and Choi, B. (2013). Dynamic Bayesian analysis for irregularly and incompletely observed contingency tables, Journal of the Korean Statistical Society, 42, 277-289.

26.
Park, Y. S. and Choi, B. (2010). Bayesian analysis for incomplete multi-way contingency tables with nonignorable nonresponse, Journal of Applied Statistics, 37, 1439-1453.

27.
Rubin, D. B., Stern, H. S. and Vehovar, V. (1995). Handling "Don't know" survey responses: The case of the Slovenian Plebiscite, Journal of the American Statistical Association, 90, 822-828.

28.
Song, J. (2011). Selection of variables to form imputation classes in Hotdeck imputation, Journal of the Korean Data Analysis Society, 13, 1321-1329.

29.
Song, J. (2014). A comparison of imputation methods for multiple response questions, Journal of the Korean Data Analysis Society, 16, 691-701.

30.
Yoon, Y. H. and Choi, B. (2012). Model selection method for categorical data with non-response, Journal of the Korean Data & Information Science Society, 23, 627-641.