Development of a New TensorFlow Tutorial Model for Machine Learning: Focusing on the Kaggle Titanic Dataset

  • Received : 2019.05.07
  • Accepted : 2019.07.02
  • Published : 2019.08.31

Abstract

The purpose of this study is to develop a tutorial model with which the whole machine learning process can be studied systematically. Because the existing tutorial models describe the learning process with minimal coding, the new model lets learners follow the steps of machine learning sequentially and visualize each step using TensorFlow. The new model uses all of the algorithms found in the existing models and confirms the importance of the variables that affect the target variable, survival. The training data were split into training and validation sets, and the performance of the model was evaluated on test data. In the final analysis, ensemble techniques showed high performance across all of the tutorial models, and the maximum performance of the new model improved by up to 5.2% compared with the existing models. Future research needs to construct an environment in which machine learning can be learned regardless of the data preprocessing method and OS, and in which a model that outperforms the existing ones can be trained.
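The workflow summarized above (split the training data into training and validation sets, fit several models, combine them as an ensemble, then score on held-out rows) can be illustrated with a minimal sketch. It is not the authors' code and does not use TensorFlow or the real Kaggle data. The toy rows, the helper names (train_validation_split, fit_stump, majority_vote, accuracy) and the choice of hard majority voting as the ensemble technique are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy rows, NOT taken from the Kaggle Titanic data:
# ((sex, passenger_class, age), survived)
ROWS = [
    (("female", 1, 29), 1), (("male", 3, 22), 0), (("female", 3, 4), 1),
    (("male", 1, 54), 0), (("female", 2, 35), 1), (("male", 2, 8), 1),
    (("male", 3, 40), 0), (("female", 3, 30), 0), (("male", 1, 36), 1),
    (("female", 1, 58), 1),
]

def train_validation_split(rows, validation_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, validation)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]

def fit_stump(train, key):
    """Learn the majority label for each value of key(features).

    Values never seen in training fall back to the overall majority label.
    """
    overall = Counter(label for _, label in train).most_common(1)[0][0]
    by_value = defaultdict(Counter)
    for features, label in train:
        by_value[key(features)][label] += 1
    table = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
    return lambda features: table.get(key(features), overall)

def majority_vote(models):
    """Hard-voting ensemble over models that each predict 0 or 1."""
    def predict(features):
        votes = sum(m(features) for m in models)
        return 1 if 2 * votes > len(models) else 0
    return predict

def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(f) == y for f, y in rows) / len(rows)

train, validation = train_validation_split(ROWS)
models = [
    fit_stump(train, key=lambda f: f[0]),       # sex
    fit_stump(train, key=lambda f: f[1]),       # passenger class
    fit_stump(train, key=lambda f: f[2] < 10),  # child or not
]
ensemble = majority_vote(models)
print("validation accuracy:", accuracy(ensemble, validation))
```

With only ten invented rows the printed accuracy carries no meaning. The point is the order of operations: the models are fitted on the training split only, and the validation split is touched only for scoring.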
