PhysioCover: Recovering the Missing Values in Physiological Data of Intensive Care Units

Kim, Sun-Hee; Yang, Hyung-Jeong; Kim, Soo-Hyung; Lee, Guee-Sang

  • Received: 2013.12.23
  • Reviewed: 2014.05.02
  • Published: 2014.06.28


Physiological signals provide important clues for diagnosing and predicting disease, so their analysis matters in health and medicine. Data preprocessing is a vital part of that analysis because missing values, noise, and outliers can degrade its performance. In this paper, we propose PhysioCover, a system that recovers missing values in physiological signals monitored in real time. PhysioCover integrates a gradual method with EM-based Principal Component Analysis (PCA). This approach can (1) recover both long- and short-term missing data more readily than existing methods such as traditional EM-based PCA, linear interpolation, 5-average, and Missing Value Singular Value Decomposition (MSVD), (2) detect hidden variables more effectively than PCA and Independent Component Analysis (ICA), and (3) offer fast computation through real-time processing. Experimental results on physiological data from an intensive care unit show that the proposed method estimates missing values more accurately than previous methods.
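To make the idea of PCA-based recovery concrete, the following is a minimal sketch of iterative low-rank imputation in the spirit of EM-based PCA. It is not the PhysioCover implementation: the gradual method is not specified in the abstract and is omitted here, and the function name `em_pca_impute` and its parameters (`n_components`, `n_iter`, `tol`) are hypothetical names chosen for illustration. The sketch assumes the data form a matrix of time steps by signals, that missing samples are marked as NaN, and that every signal has at least one observed sample.

```python
import numpy as np

def em_pca_impute(X, n_components=1, n_iter=200, tol=1e-8):
    """Fill NaN entries of X (time steps x signals) with a low-rank PCA estimate.

    Illustrative sketch only; assumes each column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Initial guess: replace each missing entry with the mean of its column.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    for _ in range(n_iter):
        # Fit a PCA model (via SVD) to the currently completed data.
        mean = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        k = n_components
        recon = (U[:, :k] * s[:k]) @ Vt[:k] + mean

        # Re-estimate only the missing entries; observed samples stay untouched.
        updated = np.where(missing, recon, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break

    return filled
```

Calling `em_pca_impute(X, n_components=1)` on a matrix whose columns are correlated vital-sign channels returns a copy in which only the NaN entries have been replaced. The number of components controls how many hidden variables the reconstruction may use, and it would need to be chosen for the data at hand.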


Intensive Care Unit; Missing Values; Hidden Variable; Real-Time Processing; EM-Principal Component Analysis



Cited by

  1. Machine Learning and Decision Support in Critical Care vol.104, pp.2, 2016