A comparison study for accuracy of exit poll based on nonresponse model

Korean title: 무응답모형에 기반한 출구조사의 예측 정확성 비교 연구

  • Kwak, Jeongae (곽정애; Department of Statistics, Graduate School, Daegu University) ;
  • Choi, Boseung (최보승; Department of Statistics and Computer Science, Daegu University)
  • Received : 2013.11.27
  • Accepted : 2013.12.23
  • Published : 2014.01.31

Abstract

Nonresponse is one of the major problems in forecasting elections from survey data, and forecasts can differ substantially depending on the imputation method used. Handling nonresponse matters even more in surveys on sensitive subjects such as a presidential election. In this research, we consider model-based imputation of nonresponse. A model-based imputation method is constructed under an assumed nonresponse mechanism, and it may produce different results depending on which mechanism is assumed. An appropriate assumption about the nonresponse mechanism is therefore an important precondition for accurate model estimation and forecasting. However, there is no established way to verify such an assumption. In this paper, we fit nonresponse models to real data, the exit poll conducted for the 18th presidential election in 2012, and compare the assumptions about the nonresponse mechanism in terms of prediction accuracy. We use maximum likelihood estimation based on the EM algorithm to estimate the nonresponse models and to impute nonresponse, and we use the modified within precinct error (MWPE) proposed by Bautista et al. (2007) to compare the predicted results.
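To make the EM-based estimation concrete, the following is a minimal sketch for the simplest ignorable case only, not the models fitted in the paper. It assumes a two-way table of voter group by candidate choice, where the group is always observed and nonresponse depends only on the group. The function name `em_ignorable`, the variable names, and all counts are hypothetical and chosen for illustration; nonignorable mechanisms would need additional model structure that is not shown here.

```python
import numpy as np

def em_ignorable(observed, nonresp, tol=1e-12, max_iter=1000):
    """EM for an I x J table whose column variable is missing for some units.

    observed[i, j]: respondents in group i who reported choice j
    nonresp[i]:     nonrespondents in group i (choice unknown)
    Assumption: nonresponse depends only on the row variable (ignorable).
    """
    observed = np.asarray(observed, dtype=float)
    nonresp = np.asarray(nonresp, dtype=float)
    n_total = observed.sum() + nonresp.sum()
    # Start from a uniform table of cell probabilities.
    pi = np.full(observed.shape, 1.0 / observed.size)
    for _ in range(max_iter):
        # E-step: split each group's nonrespondents across choices
        # in proportion to the current conditional P(choice | group).
        cond = pi / pi.sum(axis=1, keepdims=True)
        expected = observed + nonresp[:, None] * cond
        # M-step: re-estimate cell probabilities from the completed counts.
        pi_new = expected / n_total
        if np.max(np.abs(pi_new - pi)) < tol:
            return pi_new
        pi = pi_new
    return pi

# Hypothetical counts: 2 voter groups x 2 candidates (not data from the paper).
observed = np.array([[120, 80], [90, 110]])
nonresp = np.array([50, 30])

pi_hat = em_ignorable(observed, nonresp)
vote_share = pi_hat.sum(axis=0)  # estimated share for each candidate
print(vote_share)
```

Under this ignorable assumption the iterations converge to the observed within-group proportions scaled by each group's total size, so the EM result can be checked against that closed form. Comparing mechanisms, as the abstract describes, would mean repeating the fit under each assumed mechanism and comparing the resulting forecasts with the actual election result.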

References

  1. Agresti, A. (2002). Categorical data analysis, second edition, John Wiley & Sons Inc., New Jersey.
  2. Baek, J. E., Kang, W. C., Lee, Y. J. and Park, B. J. (2002). An approach to survey data with nonresponse: evaluation of KEPEC data with BMI. Journal of Preventive Medicine and Public Health, 35, 136-140.
  3. Baker, S. G. and Laird, N. M. (1988). Regression analysis for categorical variables with outcome subject to nonignorable nonresponse. Journal of the American Statistical Association, 83, 62-69. https://doi.org/10.1080/01621459.1988.10478565
  4. Bautista, R., Callegaro, M., Vera, J. A. and Abundis, F. (2007). Studying nonresponse in Mexican exit polls. International Journal of Public Opinion Research, 19, 492-503. https://doi.org/10.1093/ijpor/edm013
  5. Chambers, R. L. and Welsh, A. H. (1993). Log-linear models for survey data with non-ignorable nonresponse. Journal of the Royal Statistical Society B, 55, 157-170.
  6. Cho, Y. S., Chun, Y. M. and Hwang, D. Y. (2008). An imputation for nonresponses in the survey on the rural living indicators. The Korean Journal of Applied Statistics, 21, 95-107. https://doi.org/10.5351/KJAS.2008.21.1.095
  7. Choi, B., Choi, J. W. and Park, Y. S. (2009). Bayesian methods for an incomplete two-way contingency table with application to the Ohio (Buckeye) State Polls. Survey Methodology, 35, 37-51.
  8. Choi, B. and Kim, G. M. (2012). A model selection method using EM algorithm for missing data. Journal of the Korean Data Analysis Society, 14, 767-779.
  9. Choi, B., Kim, D. Y., Kim, K. W. and Park, Y. S. (2008). Nonignorable nonresponse imputation and rotation group bias estimation on the rotation sample survey. The Korean Journal of Applied Statistics, 21, 361-375. https://doi.org/10.5351/KJAS.2008.21.3.361
  10. Choi, B., Park, Y. S. and Lee, D. H. (2007). Election forecasting using pre-election survey data with nonignorable nonresponse. Journal of the Korean Data Analysis Society, 9, 2321-2333.
  11. Crespi, I. (1988). Pre-election polling: Sources of accuracy and error, Russell Sage, New York.
  12. Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 39, 1-38.
  13. Fay, R. E. (1986). Causal models for patterns of nonresponse. Journal of the American Statistical Association, 81, 354-365. https://doi.org/10.1080/01621459.1986.10478279
  14. Hong, N. R. and Huh, M. H. (2001). A post-examination of forecasting survey for the 16th general election. The Korean Association for Survey Research, 2, 1-35.
  15. Hyun, K. B. (2005). A study on the election poll and its accuracy in the 17th general election. Korea Regional Communication Research Association, 5, 301-336.
  16. Ibrahim, J. G., Zhu, H. and Tang, N. (2008). Model selection criteria for missing-data problems using the EM algorithm. Journal of the American Statistical Association, 103, 1648-1658. https://doi.org/10.1198/016214508000001057
  17. Kim, K. S. (2000). Imputation methods for nonresponse and their effect. The Korean Association for Survey Research, 1, 1-14.
  18. Kim, Y. W. and Choi, Y. J. (2011). Systematic forecasting bias of exit poll: Analysis of exit poll for 2010 local elections. The Korean Association for Survey Research, 12, 25-48.
  19. Kim, Y. W. and Kim, J. H. (2007). An overview of exit polls for the 2006 local elections. The Korean Association for Survey Research, 8, 55-79.
  20. Kim, Y. W. and Kwak, E. S. (2010). A total survey error analysis of the exit polling for general election 2008 in Korea. The Korean Association for Survey Research, 11, 33-55.
  21. Lee, H. J. and Kang, S. B. (2012). Handling the nonresponse in sample survey. Journal of the Korean Data & Information Science Society, 23, 1183-1194. https://doi.org/10.7465/jkdi.2012.23.6.1183
  22. Lee, J. H., Kim, J. and Lee, K. J. (2006). Missing imputation methods using the spatial variable in sample survey. The Korean Journal of Applied Statistics, 19, 57-67. https://doi.org/10.5351/KJAS.2006.19.1.057
  23. Little, R. J. A. and Rubin, D. B. (2002). Statistical analysis with missing data, second edition, Wiley, New York.
  24. Park, T. and Brown, M. B. (1994). Models for categorical data with nonignorable nonresponse. Journal of the American Statistical Association, 89, 44-52. https://doi.org/10.1080/01621459.1994.10476444
  25. Park, T. S. and Lee, S. Y. (1998). Analysis of categorical data with nonresponses. The Korean Journal of Applied Statistics, 11, 83-95.
  26. Park, Y. S. and Choi, B. (2010). Bayesian analysis for incomplete multi-way contingency tables with nonignorable nonresponse. Journal of Applied Statistics, 37, 1439-1453. https://doi.org/10.1080/02664760903046078
  27. Rhee, J. W. (2004). Problems of the election forecasting in the 2004 Korean general election. Journal of Communication Research, 41, 110-135.
  28. Ryu, J. B. (2000). A plan of improving the reliability of the election forecasting survey - A case of the 16th general election. The Korean Association for Survey Research, 1, 15-34.
  29. Ryu, J. B. (2003). A history and the improvable direction of exit poll. The Korean Association for Survey Research, 4, 31-48.
  30. Yoon, Y. H. and Choi, B. (2012). Model selection method for categorical data with non-response. Journal of the Korean Data & Information Science Society, 23, 627-641. https://doi.org/10.7465/jkdi.2012.23.4.627

Cited by

  1. Bias caused by nonresponses and suggestion for increasing response rate in the telephone survey on election vol.27, pp.2, 2016, https://doi.org/10.7465/jkdi.2016.27.2.315
  2. An estimation method for non-response model using Monte-Carlo expectation-maximization algorithm vol.27, pp.3, 2016, https://doi.org/10.7465/jkdi.2016.27.3.587
  3. Analysis of Missing Data Using an Empirical Bayesian Method vol.27, pp.6, 2014, https://doi.org/10.5351/KJAS.2014.27.6.1003
  4. Interval prediction on the sum of binary random variables indexed by a graph vol.26, pp.3, 2019, https://doi.org/10.29220/csam.2019.26.3.261