PLS Path Modeling to Investigate the Relations between Competencies of Data Scientist and Big Data Analysis Performance : Focused on Kaggle Platform
Han, Gyeong Jin; Cho, Keuntae
This paper examines how the competencies and behavioral intention of data scientists affect big data analysis performance. The study considers nine core factors required of data scientists. To investigate them, we surveyed 103 data scientists who had participated in big data competitions on the Kaggle platform and analyzed the responses using factor analysis and PLS-SEM. The results show that some key competency factors influence big data analysis performance. The study aims to provide a new theoretical basis for related research by analyzing the structural relationship between individual competencies and performance and, in practical terms, to identify the priorities among the core competencies that data scientists must have.
Keywords: Big Data; Data Scientist; Behavioral Intention; PLS-SEM; SmartPLS