A Manually Captured and Modified Phone Screen Image Dataset for Widget Classification on CNNs

  • Received : 2021.06.02
  • Accepted : 2021.10.10
  • Published : 2022.04.30

Abstract

The applications and user interfaces (UIs) of smart mobile devices are constantly diversifying. Deep learning offers one way to make such devices more convenient, for example by classifying the widgets that appear in screen images. To this end, the present research uses manually captured images and the ReDraw dataset to construct deep learning datasets for image classification. First, validation experiments with ResNet50 and EfficientNet show that the dataset composed in this study is helpful for classifying widgets according to their functionality. Widget detection and classification are then implemented with RetinaNet and EfficientNet. Finally, the research proposes the Widg-C and Widg-D datasets, deep learning datasets for identifying the widgets of smart devices, and implements them with representative convolutional neural network models.
