Detection of Phantom Transaction using Data Mining: The Case of Agricultural Product Wholesale Market

데이터마이닝을 이용한 허위거래 예측 모형: 농산물 도매시장 사례

  • Lee, Seon Ah (Technology Division, WAVUS Co., Ltd.) ;
  • Chang, Namsik (College of Business Administration, University of Seoul)
  • 이선아 (주식회사 웨이버스 기술본부) ;
  • 장남식 (서울시립대학교 경영대학)
  • Received : 2014.11.25
  • Accepted : 2015.01.14
  • Published : 2015.03.31


With the rapid evolution of technology, the size, number, and variety of databases have increased, and data mining now faces many challenging applications. One such application is the discovery of fraud patterns in agricultural product wholesale transactions. The agricultural product wholesale market in Korea is huge, and a vast number of transactions take place every day. Demand for agricultural products continues to grow, and electronic auction systems have made wholesale market operations more efficient. The number of unusual transactions is presumed to increase in proportion to trading volume, and an unusual transaction is often the first sign of fraud. However, such transactions, and the fraud behind them, are very difficult to identify because fraud schemes have become more sophisticated than ever. Fraud can be detected by manually verifying every transaction record, but this requires substantial human resources and is ultimately impractical. Fraud can also be revealed by a victim's report or complaint, but agricultural product wholesale fraud usually has no victim, because it is committed through collusion between an auction company and an intermediary wholesaler. Nevertheless, transaction records must be monitored continuously and efforts made to prevent fraud, because fraud not only disturbs fair trade in the market but also rapidly erodes the market's credibility. Data mining is well suited to this environment because it can discover previously unknown fraud patterns or features in a large volume of transaction data.
The objective of this research is to empirically investigate the factors needed to detect fraudulent transactions in an agricultural product wholesale market by developing a data mining based fraud detection model. One major type of fraud is the phantom transaction, in which the seller (an auction company or forwarder) and the buyer (an intermediary wholesaler) collude. They pretend to complete a transaction by recording false data in the online transaction processing system without actually selling products, and the seller receives money from the buyer. This leads to overstated sales performance and illegal money transfers, which reduce the credibility of the market. This paper reviews the wholesale market environment, including the types of transactions, the roles of market participants, and the various types and characteristics of fraud, and presents the whole process of developing the phantom transaction detection model. The process consists of four modules: (1) data cleaning and standardization; (2) statistical data analysis, such as distribution and correlation analysis; (3) construction of a classification model using a decision-tree induction approach; and (4) verification of the model in terms of hit ratio. We collected real data from 6 associations of agricultural producers in metropolitan markets. The final decision-tree model revealed that the monthly average trading price of items offered by forwarders is a key variable in detecting phantom transactions. The verification procedure also confirmed the suitability of the results. However, although the performance of the model is satisfactory, issues remain in improving classification accuracy and the conciseness of the rules. One such issue is the robustness of the data mining model. Data mining is highly data-oriented, so data mining models tend to be very sensitive to changes in data or circumstances. This lack of robustness means that the model must be rebuilt continuously as data or circumstances change. We hope that this paper offers a valuable guideline to organizations and companies considering the introduction or construction of a fraud detection model.
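
The four modules described above can be illustrated with a minimal sketch in Python. This is not the authors' implementation: the records are invented toy values, the single feature (a monthly average trading price) and the direction of its effect are assumptions made only for illustration, and the classifier is a one-level decision tree (a decision stump) chosen by Gini impurity rather than the full decision-tree induction used in the paper. All function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical toy records: (monthly average trading price, label),
# label 1 = phantom transaction, 0 = normal. All values are invented.
TRAIN = [(1200.0, 0), (1350.0, 0), (None, 0), (3900.0, 1),
         (1280.0, 0), (4200.0, 1), (1410.0, 0)]
TEST = [(1500.0, 0), (4050.0, 1), (1320.0, 0), (3800.0, 1)]

def clean(records):
    """Module 1a: drop records whose price is missing."""
    return [(p, y) for p, y in records if p is not None]

def make_standardizer(prices):
    """Module 1b: z-score scaler fitted on the training prices only."""
    mu, sigma = mean(prices), pstdev(prices)
    return lambda p: (p - mu) / sigma

def pearson(xs, ys):
    """Module 2: Pearson correlation between a feature and the label."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 2.0 * p1 * (1.0 - p1)

def majority(labels):
    """Most frequent label (ties go to 1)."""
    return 1 if 2 * sum(labels) >= len(labels) else 0

def fit_stump(xs, ys):
    """Module 3: choose the threshold with the lowest weighted Gini impurity."""
    values = sorted(set(xs))
    best = None
    best_score = float("inf")
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_score = score
            best = (t, majority(left), majority(right))
    return best  # (threshold, label if x <= threshold, label if x > threshold)

def predict(model, x):
    threshold, left_label, right_label = model
    return left_label if x <= threshold else right_label

def hit_ratio(model, xs, ys):
    """Module 4: share of records classified correctly."""
    hits = sum(1 for x, y in zip(xs, ys) if predict(model, x) == y)
    return hits / len(ys)

train = clean(TRAIN)
test = clean(TEST)
scale = make_standardizer([p for p, _ in train])
train_x = [scale(p) for p, _ in train]
train_y = [y for _, y in train]
test_x = [scale(p) for p, _ in test]
test_y = [y for _, y in test]

r = pearson([p for p, _ in train], train_y)
model = fit_stump(train_x, train_y)
ratio = hit_ratio(model, test_x, test_y)
print(f"correlation={r:.3f} threshold(z)={model[0]:.3f} hit ratio={ratio:.2f}")
```

Because the toy data are perfectly separable, the stump classifies every held-out record correctly; real transaction data would need more variables, a deeper tree, and a much larger hold-out sample. The scaler is fitted on the training records only so that the hold-out set does not leak into the model.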


Supported by: University of Seoul

