Improved Tweet Bot Detection Using Spatio-Temporal Information

  • Kim, Hyo-Sang (Department of Mobile Systems Engineering, Dankook University)
  • Shin, Won-Yong (Department of Computer Science and Engineering, Dankook University)
  • Kim, Donggeon (Department of Statistics and Information Science, Dongduk Women's University)
  • Cho, Jaehee (Department of Management, Kwangwoon University)
  • Received : 2015.08.10
  • Accepted : 2015.09.18
  • Published : 2015.12.31

Abstract

Twitter, an online social network service, is one of the most popular microblogging platforms. Because of its open structure, a large number of automated programs, known as tweet bots, operate on it. Tweet bots can be either legitimate or malicious, and detecting them is important because malicious bots spread spam and harmful content to human users. Previous work used temporal information to classify accounts as human or bot. In this paper, we use geo-tagged tweets, which provide high-precision user location information, to first identify each Twitter user's exact location and the corresponding timestamp. We then propose an improved two-stage tweet bot detection algorithm that computes an entropy based on this spatio-temporal information. Our main result is that, at each confidence level, the proposed algorithm achieves better bot detection and false alarm probabilities than the conventional approach, which uses only temporal information.
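
The abstract does not define the entropy or the two stages, so the following Python sketch only illustrates the general idea and should not be read as the authors' algorithm. It assumes that each tweet is a (timestamp in seconds, latitude, longitude) tuple, that inter-tweet gaps and locations are discretized into bins, and that accounts with low Shannon entropy in both timing and location are flagged as bots. The function names, bin sizes, thresholds, and the order of the two stages are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def temporal_symbols(tweets, time_bin_s=60):
    """Binned gaps between consecutive tweets (assumed discretization)."""
    return [int((t1 - t0) // time_bin_s)
            for (t0, _, _), (t1, _, _) in zip(tweets, tweets[1:])]


def spatial_symbols(tweets, cell_deg=0.01):
    """Grid cell of each geo-tag (assumed discretization)."""
    return [(math.floor(lat / cell_deg), math.floor(lon / cell_deg))
            for _, lat, lon in tweets]


def classify(tweets, temporal_threshold=1.0, spatial_threshold=0.5):
    """Hypothetical two-stage rule; thresholds are illustrative only."""
    tweets = sorted(tweets)  # order by timestamp
    # Stage 1 (assumed): irregular posting times suggest a human.
    if shannon_entropy(temporal_symbols(tweets)) >= temporal_threshold:
        return "human"
    # Stage 2 (assumed): regular timing from varied locations is still
    # treated as human; regular timing from a fixed location as a bot.
    if shannon_entropy(spatial_symbols(tweets)) >= spatial_threshold:
        return "human"
    return "bot"


# Synthetic example data (not from the paper).
periodic_fixed = [(600 * i, 37.32, 127.12) for i in range(10)]
irregular_mobile = [(0, 37.32, 127.12), (130, 37.32, 127.12),
                    (4000, 37.50, 127.03), (4100, 37.50, 127.03),
                    (90000, 37.56, 126.97)]
periodic_label = classify(periodic_fixed)
irregular_label = classify(irregular_mobile)
```

In this sketch an account that posts every ten minutes from a single location has zero entropy in both dimensions and is flagged, while an account with irregular gaps is accepted as human in the first stage. A real system would have to choose the bin sizes and thresholds from labeled data.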

