Network Traffic Measurement Analysis using Machine Learning

Hae-Duck Joshua Jeong

doi:10.24225/kjai.2023.11.2.19

Korean Journal of Artificial Intelligence (한국인공지능학회지)

Volume 11, Issue 2, Pages 19-27, 2023, eISSN 2508-7894

Korea Artificial Intelligence Association (한국인공지능학회)

Hae-Duck Joshua Jeong (Dept. of Computer Software, Korean Bible University)

Received : 2023.05.06
Accepted : 2023.06.02
Published : 2023.06.30

https://doi.org/10.24225/kjai.2023.11.2.19

Abstract

In recent years, Internet traffic has increased exponentially with the development of the Internet of Things, mobile networks with sensors, and communication functions built into a wide range of devices. The COVID-19 pandemic has further led to an explosion of social network traffic. In this context, research on machine-learning-based network traffic analysis has attracted considerable attention. In this paper, we design and develop a new machine learning framework for network traffic analysis that distinguishes normal traffic from abnormal traffic. To achieve this, we combine well-known machine learning algorithms with network traffic analysis techniques. Using KDD CUP'99, one of the most widely used datasets, in the Weka and Apache Spark environments, we compare and investigate results obtained from time-series-type analysis of various aspects, including malicious code, feature extraction, data formalization, and the implementation of a network traffic measurement tool. Experimental analysis showed that, although both the logistic regression and support vector machine algorithms performed excellently in the evaluation, logistic regression performed the better of the two. The quantitative results for the proposed framework indicate that the approach is reliable and practical, and we also compare the performance of the proposed system with that reported in another paper. In addition, we found that the framework processes data much faster in the Apache Spark environment than in Weka as more data is used to create the machine learning models and perform classification.
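As a rough illustration of the kind of comparison the abstract describes, the sketch below trains a logistic regression classifier and a linear support vector machine on the same labelled records and reports the accuracy of each. It is not the paper's framework, which runs on KDD CUP'99 in Weka and Apache Spark. Everything here is an assumption made for illustration: the synthetic two-feature data, the function names, and the hyperparameters.

```python
import math
import random


def make_data(n, seed=0):
    """Synthetic stand-in for traffic records: (features, label), 1 = abnormal."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 2.0 if label else -2.0
        # Two hypothetical numeric features (not real KDD CUP'99 attributes).
        data.append(([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)], label))
    return data


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(data, epochs=50, lr=0.1):
    """Logistic regression fitted by stochastic gradient descent on log loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            err = sigmoid(dot(w, x) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def train_linear_svm(data, epochs=50, lr=0.01, reg=0.01):
    """Linear SVM fitted by stochastic subgradient descent on hinge loss."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            t = 1 if y == 1 else -1           # SVM uses labels in {-1, +1}
            margin = t * (dot(w, x) + b)      # computed before the update
            w = [wi * (1.0 - lr * reg) for wi in w]  # L2 regularisation step
            if margin < 1:                    # hinge loss is active
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
    return w, b


def accuracy(model, data):
    """Fraction of records whose predicted class matches the label."""
    w, b = model
    hits = sum((dot(w, x) + b > 0) == (y == 1) for x, y in data)
    return hits / len(data)


if __name__ == "__main__":
    train, test = make_data(400, seed=0), make_data(200, seed=1)
    print("logistic regression:", accuracy(train_logistic(train), test))
    print("linear SVM:", accuracy(train_linear_svm(train), test))
```

On data this easy to separate, the two classifiers should score almost identically; any ranking between them, such as the one the abstract reports, depends on the real dataset and feature preparation.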
