Ensemble of Degraded Artificial Intelligence Modules Against Adversarial Attacks on Neural Networks

  • Received : 2018.03.30
  • Accepted : 2018.06.20
  • Published : 2018.09.30

Abstract

Adversarial attacks on artificial intelligence (AI) systems use adversarial examples to achieve the attack objective. Adversarial examples are test inputs that have been slightly perturbed so that an AI system makes a wrong decision on them; when used as a tool for attacking deployed AI systems, they can lead to disastrous results. In this paper, we propose an ensemble of degraded convolutional neural network (CNN) modules that is more robust to adversarial attacks than conventional CNNs. Each module is trained on degraded images. During testing, the input image is degraded using various degradation methods, and the final decision is made from the sum of the modules' one-hot output vectors. Experimental results show that the proposed ensemble network is more resilient to adversarial attacks than conventional networks, while its accuracy on normal images is similar.
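To illustrate the decision rule described above, the following is a minimal sketch (not the authors' code) of an ensemble that feeds differently degraded copies of an input image to separate modules, converts each module's output to a one-hot vector, sums the one-hot vectors, and picks the class with the most votes. The degradation functions and the placeholder modules are hypothetical stand-ins for the trained CNN modules.

```python
# Sketch of the abstract's ensemble decision rule. The degradations and the
# "modules" below are illustrative assumptions, not the paper's exact setup.
import numpy as np

def add_gaussian_noise(img, sigma=0.05):
    """Hypothetical degradation: additive Gaussian noise, clipped to [0, 1]."""
    return np.clip(img + np.random.normal(0.0, sigma, img.shape), 0.0, 1.0)

def box_blur(img, k=3):
    """Hypothetical degradation: simple k x k box blur."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def quantize(img, levels=16):
    """Hypothetical degradation: coarse intensity quantization."""
    return np.round(img * (levels - 1)) / (levels - 1)

def ensemble_predict(image, modules, degradations):
    """Sum the modules' one-hot output vectors and return the winning class."""
    votes = None
    for module, degrade in zip(modules, degradations):
        scores = module(degrade(image))                    # class scores from one CNN module
        one_hot = np.eye(len(scores))[np.argmax(scores)]   # hard one-hot decision of that module
        votes = one_hot if votes is None else votes + one_hot
    return int(np.argmax(votes))                           # class with the most votes wins

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dummy_image = rng.random((28, 28))
    # Placeholder "modules": in the paper these would be CNNs trained on degraded images.
    dummy_modules = [lambda x: rng.random(10) for _ in range(3)]
    degradations = [add_gaussian_noise, box_blur, quantize]
    print(ensemble_predict(dummy_image, dummy_modules, degradations))
```

The key design point, as described in the abstract, is that the attacker's perturbation is crafted for the clean image, so degrading the input before each module sees it tends to weaken the adversarial pattern while the majority vote preserves accuracy on normal images.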
