A Study on Method for User Gender Prediction Using Multi-Modal Smart Device Log Data

  • Kim, Yoonjung (Department of Industrial Engineering, Seoul National University)
  • Choi, Yerim (Department of Industrial Engineering, Seoul National University)
  • Kim, Solee (Department of Industrial Engineering, Seoul National University)
  • Park, Kyuyon (Department of Industrial Engineering, Seoul National University)
  • Park, Jonghun (Department of Industrial Engineering, Seoul National University)
  • Received: 2016.01.12
  • Accepted: 2016.02.19
  • Published: 2016.02.28

Abstract

Gender information about a smart device user is essential for providing personalized services, and the multi-modal data obtained from the device is useful for predicting it. However, the appropriate way to use each modality for gender prediction differs according to the characteristics of the data. This study therefore proposes an ensemble method that predicts the gender of a smart device user by combining three classifiers that take text, application, and acceleration data as inputs, respectively. To alleviate the privacy issues that arise when text data generated on a smart device is sent off the device, the text-based classifier scans the text only on the device and classifies the user's gender by matching it against predefined sets of words. The application-based classifier assigns gender labels to executed applications and predicts the user's gender by comparing the ratio of the labels. Acceleration data is classified with a Support Vector Machine (SVM). The proposed method was evaluated on actual smart device log data collected from an Android application, and the experimental results showed that it outperformed the compared methods.
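
As a rough illustration of the pipeline described in the abstract, the following Python sketch shows a word-set text classifier, a label-ratio application classifier, and a step that combines their outputs with an acceleration-based prediction. It is only a minimal sketch under stated assumptions: the word sets, the application labels, the abstain-on-tie behavior, and the majority-vote combination rule are all hypothetical, since the abstract does not specify them, and the SVM output is represented by a placeholder value.

```python
from collections import Counter

# Hypothetical word sets; the actual predefined sets are not given in the abstract.
FEMALE_WORDS = {"lipstick", "skirt"}
MALE_WORDS = {"razor", "football"}

# Hypothetical gender labels assigned to applications.
APP_LABELS = {"makeup_cam": "F", "nail_art": "F", "soccer_news": "M"}


def classify_text(tokens):
    """Match on-device text tokens against the two word sets."""
    female_hits = sum(1 for t in tokens if t in FEMALE_WORDS)
    male_hits = sum(1 for t in tokens if t in MALE_WORDS)
    if female_hits == male_hits:
        return None  # abstain when evidence is tied or absent (assumption)
    return "F" if female_hits > male_hits else "M"


def classify_apps(executed_apps):
    """Compare the ratio of female- and male-labeled executed applications."""
    labels = [APP_LABELS[a] for a in executed_apps if a in APP_LABELS]
    if not labels:
        return None
    female = labels.count("F")
    male = len(labels) - female
    if female == male:
        return None  # abstain on an even split (assumption)
    return "F" if female > male else "M"


def ensemble(votes):
    """Majority vote over non-abstaining classifiers (assumed combination rule)."""
    counts = Counter(v for v in votes if v is not None)
    ranked = counts.most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no majority
    return ranked[0][0]


if __name__ == "__main__":
    text_vote = classify_text(["new", "lipstick", "today"])
    app_vote = classify_apps(["soccer_news", "makeup_cam", "nail_art"])
    accel_vote = "M"  # placeholder for the output of an SVM trained on acceleration data
    print(ensemble([text_vote, app_vote, accel_vote]))
```

Because the text classifier needs only the token stream and two small word sets, it can run entirely on the device, which matches the privacy motivation stated in the abstract; only the resulting label would need to leave the device.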

Acknowledgement

Supported by: National Research Foundation of Korea
