On-Device Gender Prediction Framework Based on the Development of Discriminative Word and Emoticon Sets
 Title & Authors
Kim, Solee; Choi, Yerim; Kim, Yoonjung; Park, Kyuyon; Park, Jonghun
 
 Abstract
User demographic information is needed to improve the quality of personalized services such as recommendation systems. Mobile data, especially text data, is known to be effective for predicting user demographics. However, mobile text data raises privacy concerns, which limits its use. To address this, we introduce an on-device gender prediction framework that uses mobile text data while minimizing privacy exposure. Discriminative word and emoticon sets for each gender are constructed from web documents written by authors of that gender. Gender is then predicted by comparing a user's mobile text data with the discriminative word sets and with the discriminative emoticon sets, and an ensemble method combines the two prediction results into a final result. In experiments on real-world mobile text data, the proposed on-device framework shows promising results for gender prediction.
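The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify how discriminative sets are selected, how text is matched against them, or how the ensemble is weighted, so everything below is an assumption made for illustration: the smoothed log-frequency-ratio ranking, the overlap-based score, the weighted average, and all function and parameter names (`discriminative_set`, `female_score`, `predict_gender`, `top_k`, `word_weight`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def discriminative_set(own_docs, other_docs, top_k=3):
    """Return up to top_k tokens that are relatively more frequent in own_docs.

    Assumption: tokens are ranked by a smoothed log-frequency ratio.
    The paper's actual selection criterion is not given in the abstract.
    """
    own = Counter(tok for doc in own_docs for tok in doc)
    other = Counter(tok for doc in other_docs for tok in doc)
    vocab_size = len(set(own) | set(other))
    own_total = sum(own.values())
    other_total = sum(other.values())

    def score(tok):
        # Add-one smoothing keeps the ratio finite for unseen tokens.
        p_own = (own[tok] + 1) / (own_total + vocab_size)
        p_other = (other[tok] + 1) / (other_total + vocab_size)
        return math.log(p_own / p_other)

    # Keep only tokens that lean toward own_docs; break ties alphabetically.
    candidates = [tok for tok in own if score(tok) > 0]
    ranked = sorted(candidates, key=lambda tok: (-score(tok), tok))
    return set(ranked[:top_k])


def female_score(tokens, female_set, male_set):
    """Share of matched tokens that fall in the female set (0.5 if none match)."""
    f = sum(1 for tok in tokens if tok in female_set)
    m = sum(1 for tok in tokens if tok in male_set)
    return 0.5 if f + m == 0 else f / (f + m)


def predict_gender(words, emoticons, word_sets, emoticon_sets, word_weight=0.5):
    """Combine the word-based and emoticon-based scores.

    Assumption: the ensemble is a simple weighted average of the two scores.
    word_sets and emoticon_sets are (female_set, male_set) pairs.
    """
    w = female_score(words, *word_sets)
    e = female_score(emoticons, *emoticon_sets)
    combined = word_weight * w + (1 - word_weight) * e
    if combined > 0.5:
        return "female"
    if combined < 0.5:
        return "male"
    return "unknown"


# Toy, made-up data standing in for tokenized web documents by gender.
female_docs = [["shopping", "lovely", "cafe"], ["lovely", "cafe", "today"]]
male_docs = [["soccer", "game", "today"], ["game", "beer", "soccer"]]
female_emos = [["^^", "T_T"], ["^^"]]
male_emos = [[":)"], [":)", "-_-"]]

word_sets = (discriminative_set(female_docs, male_docs),
             discriminative_set(male_docs, female_docs))
emoticon_sets = (discriminative_set(female_emos, male_emos),
                 discriminative_set(male_emos, female_emos))

print(predict_gender(["lovely", "cafe", "game"], ["^^"], word_sets, emoticon_sets))
```

In this sketch only the small precomputed sets need to be shipped to the device, and the user's text is scored locally, which mirrors the privacy motivation stated in the abstract.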
 Keywords
gender prediction; text data; ensemble method; mobile user; on-device prediction
 Language
Korean
 Cited by
1.
Choi, Yerim; Park, Kyuyon; Kim, Solee; Park, Jonghun, "A Two-Phase On-Device Analysis for Gender Prediction of Mobile Users Using Discriminative and Popular Wordsets," The Journal of Society for e-Business Studies, Vol. 21, No. 1, pp. 65-77, 2016.

2.
Kim, Yoonjung; Choi, Yerim; Kim, Solee; Park, Kyuyon; Park, Jonghun, "A Study on Method for User Gender Prediction Using Multi-Modal Smart Device Log Data," The Journal of Society for e-Business Studies, Vol. 21, No. 1, pp. 147-163, 2016.