Permutation test for a post selection inference of the FLSA

  • Choi, Jieun (최지은) (Department of Applied Statistics, Graduate School, Dankook University) ;
  • Son, Won (손원) (Department of Applied Statistics, Graduate School, Dankook University)
  • Received : 2021.06.01
  • Accepted : 2021.08.27
  • Published : 2021.12.31

Abstract

In this paper, we propose a post-selection inference procedure for the fused lasso signal approximator (FLSA). The FLSA recovers an underlying sparse, piecewise constant mean structure by using the total variation (TV) semi-norm as a penalty term. However, it is well known that this convex relaxation can lead to asymptotic inconsistency in change point detection. As a result, false change points can remain even after a tuning procedure is used to select the best subset of change points. To remove these false change points, we propose a post-selection inference procedure for the FLSA that applies a permutation test based on the CUSUM statistic. The procedure extends the permutation test of Antoch and Hušková (2001), which addresses single change point problems, to multiple change point detection in combination with the FLSA. Numerical studies show that the proposed procedure performs better than naïve z-tests and tests based on the limiting distribution of CUSUM statistics.
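For orientation, the FLSA criterion with a total variation penalty is usually written as below. This follows the standard formulation in Friedman et al. (2007). The symbols $y_i$, $\mu_i$, $\lambda_1$, and $\lambda_2$ are assumed notation and may differ from the notation used in the paper itself.

```latex
\hat{\mu} \;=\; \operatorname*{arg\,min}_{\mu \in \mathbb{R}^{n}}
  \; \frac{1}{2} \sum_{i=1}^{n} (y_i - \mu_i)^2
  \;+\; \lambda_1 \sum_{i=1}^{n} \lvert \mu_i \rvert
  \;+\; \lambda_2 \sum_{i=1}^{n-1} \lvert \mu_{i+1} - \mu_i \rvert ,
\qquad
\hat{\mathcal{C}} \;=\; \{\, i : \hat{\mu}_{i+1} \neq \hat{\mu}_i \,\}
```

The last sum is the TV semi-norm of $\mu$, and $\hat{\mathcal{C}}$ is the set of estimated change points. A post-selection procedure of the kind described in the abstract would then test each element of $\hat{\mathcal{C}}$.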

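The FLSA is a penalized model that produces a piecewise constant mean structure by means of a total variation penalty, and it is used for multiple change point detection. However, because the FLSA does not satisfy asymptotic consistency in change point detection, it has the drawback that many false change points can be identified even when the noise level converges to nearly zero. In this study, we propose a permutation test as a post-selection inference method to address this problem of the FLSA. A permutation test for the single change point model was proposed by Antoch and Hušková (2001). We extend the test procedure of Antoch and Hušková (2001) and combine it with the FLSA, which is used to identify multiple change points, to obtain a permutation test procedure applicable to multiple change point models. In simulations, the proposed method was superior overall to the z-test and to tests based on the limiting distribution of the CUSUM statistic, and it proved useful for identifying false change points.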
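As an illustration of the single change point building block mentioned above, the following is a minimal sketch of a permutation test based on a maximal standardized CUSUM statistic. It is not the authors' procedure: the function names `max_cusum` and `cusum_permutation_test`, the choice of standardization, and the default of 999 permutations are all assumptions made for this example, and the extension to multiple change points selected by the FLSA is not shown.

```python
import numpy as np


def max_cusum(y):
    """Return (statistic, split) for the maximal standardized CUSUM of one mean change.

    The split k means the first segment is y[:k] and the second is y[k:].
    Illustrative helper; the standardization used in the paper may differ.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    k = np.arange(1, n)                # candidate splits 1, ..., n-1
    partial = np.cumsum(y)[:-1]        # S_k = y_1 + ... + y_k
    # |S_k - (k/n) S_n| weighted by sqrt(n / (k (n - k))).
    # The overall scale of y is unchanged by permutation, so it is not divided out.
    stat = np.abs(partial - k * y.mean()) * np.sqrt(n / (k * (n - k)))
    j = int(np.argmax(stat))
    return float(stat[j]), int(k[j])


def cusum_permutation_test(y, n_perm=999, seed=0):
    """Permutation p-value for H0: no change in mean (assumed exchangeable noise)."""
    rng = np.random.default_rng(seed)
    observed, split = max_cusum(y)
    exceed = 0
    for _ in range(n_perm):
        perm_stat = max_cusum(rng.permutation(y))[0]
        exceed += int(perm_stat >= observed)
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, split, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = np.concatenate([np.zeros(50), 2.0 * np.ones(50)]) + rng.normal(size=100)
    stat, split, p = cusum_permutation_test(y)
    print(f"statistic={stat:.3f}, split={split}, p-value={p:.3f}")
```

Permuting the observations destroys any ordering information, so the permuted statistics approximate the null distribution without relying on the limiting distribution of the CUSUM statistic. In a multiple change point setting, one plausible design is to run such a test on the segment between the two neighbours of each candidate change point, but that detail is an assumption and is not stated in the abstract.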

Acknowledgement

This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (Ministry of Science and ICT) (No. 2020R1F1A1A01051039).

References

  1. Antoch J and Hušková M (2001). Permutation tests in change point analysis, Statistics and Probability Letters, 53, 37-46. https://doi.org/10.1016/S0167-7152(01)00009-8
  2. Cho H (2016). Change-point detection in panel data via double CUSUM statistic, Electronic Journal of Statistics, 10, 2000-2038. https://doi.org/10.1214/16-EJS1155
  3. Friedman J, Hastie T, Höfling H, and Tibshirani R (2007). Pathwise coordinate optimization, The Annals of Applied Statistics, 1, 302-332. https://doi.org/10.1214/07-AOAS131
  4. Fryzlewicz P (2014). Wild binary segmentation for multiple change-point detection, The Annals of Statistics, 42, 2243-2281. https://doi.org/10.1214/14-AOS1245
  5. Hoefling H (2010). A path algorithm for the fused lasso signal approximator, Journal of Computational and Graphical Statistics, 19, 984-1006. https://doi.org/10.1198/jcgs.2010.09208
  6. Hyun S, G'Sell M, and Tibshirani RJ (2018). Exact post-selection inference for the generalized lasso path, Electronic Journal of Statistics, 12, 1053-1097. https://doi.org/10.1214/17-EJS1363
  7. Lee JD, Sun DL, Sun Y, and Taylor JE (2016). Exact post-selection inference, with application to the lasso, The Annals of Statistics, 44, 907-927.
  8. Leeb H and Pötscher BM (2003). The finite-sample distribution of post-model-selection estimators and uniform versus nonuniform approximations, Econometric Theory, 19, 100-142. https://doi.org/10.1017/S0266466603191050
  9. Leeb H and Pötscher BM (2006). Can one estimate the conditional distribution of post-model-selection estimators?, The Annals of Statistics, 34, 2554-2591.
  10. Niu YS, Hao N, and Zhang H (2016). Multiple change-point detection: A selective overview, Statistical Science, 31, 611-623.
  11. Olshen AB, Venkatraman E, Lucito R, and Wigler M (2004). Circular binary segmentation for the analysis of array-based DNA copy number data, Biostatistics, 5, 557-572. https://doi.org/10.1093/biostatistics/kxh008
  12. Pettitt AN (1979). A non-parametric approach to the change-point problem, Journal of the Royal Statistical Society: Series C (Applied Statistics), 28, 126-135. https://doi.org/10.2307/2346729
  13. Rinaldo A (2014). Corrections to properties and refinements of the fused lasso, Retrieved December 8th from https://www.stat.cmu.edu/arinaldo/Fused_Correction.pdf
  14. Rojas CR and Wahlberg B (2014). On change point detection using the fused lasso method
  15. Sen A and Srivastava MS (1975). On tests for detecting change in mean, The Annals of Statistics, 3, 98-108. https://doi.org/10.1214/aos/1176343001
  16. Son W and Lim J (2019). Modified path algorithm of fused lasso signal approximator for consistent recovery of change points, Journal of Statistical Planning and Inference, 200, 223-238. https://doi.org/10.1016/j.jspi.2018.10.003
  17. Son W, Lim J, and Yu D (2021). Tuning parameter selection for the fused lasso signal approximator. Manuscript submitted for publication.
  18. Tibshirani R, Saunders M, Rosset S, Zhu J, and Knight K (2005). Sparsity and smoothness via the fused lasso, Journal of the Royal Statistical Society: Series B, 67, 91-108. https://doi.org/10.1111/j.1467-9868.2005.00490.x