• Title/Summary/Keyword: default prediction

Default Prediction for Real Estate Companies with Imbalanced Dataset

  • Dong, Yuan-Xiang;Xiao, Zhi;Xiao, Xue
    • Journal of Information Processing Systems / v.10 no.2 / pp.314-333 / 2014
  • In default prediction for real estate companies, non-defaulted cases greatly outnumber defaulted ones, which creates a two-class imbalance problem. This lowers the ability of prediction models to identify the defaulted samples. To avoid this sample selection bias and improve the prediction model, this paper applies a minority sample generation approach to create new minority samples. Logistic regression, support vector machine (SVM) classification, and neural network (NN) classification trained on the imbalanced dataset serve as benchmarks for single prediction models trained on a dataset balanced by the minority sample generation approach. Instead of prediction-oriented tests and overall accuracy, the true positive rate (TPR), the true negative rate (TNR), G-mean, and F-score are used to measure the performance of default prediction models on the imbalanced dataset. An empirical experiment on a sample of 14 defaulted and 315 non-defaulted listed real estate companies in China shows that most single prediction models trained on the balanced dataset produced better results than those trained on the imbalanced dataset.
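
The four measures named in this abstract can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes label 1 means "default", and it assumes the F-score is the balanced F1 variant, which the abstract does not specify. The function name `imbalance_metrics` and the toy predictions are made up for the example; only the 14/315 class ratio comes from the abstract.

```python
import math

def imbalance_metrics(y_true, y_pred):
    """Return accuracy, TPR, TNR, G-mean and F1, treating label 1 (default) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tpr = tp / (tp + fn) if tp + fn else 0.0        # share of defaults caught
    tnr = tn / (tn + fp) if tn + fp else 0.0        # share of non-defaults kept
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_score = (2 * precision * tpr / (precision + tpr)) if precision + tpr else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "tpr": tpr,
        "tnr": tnr,
        "g_mean": math.sqrt(tpr * tnr),
        "f_score": f_score,
    }

# Toy labels with the class ratio reported in the abstract.
y_true = [1] * 14 + [0] * 315
# A useless classifier that never predicts default.
always_negative = [0] * len(y_true)

for name, value in imbalance_metrics(y_true, always_negative).items():
    print(f"{name}: {value:.3f}")
```

On this toy input the always-negative classifier keeps a high overall accuracy while its TPR and G-mean are zero, which is the reason the abstract gives for preferring these measures on imbalanced data.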

Default Prediction of Automobile Credit Based on Support Vector Machine

  • Chen, Ying;Zhang, Ruirui
    • Journal of Information Processing Systems / v.17 no.1 / pp.75-88 / 2021
  • The automobile credit business has developed rapidly in recent years, and defaults occur frequently. Credit default brings great losses to automobile financial institutions, so successfully predicting automobile credit default is of great significance. First, the missing values are deleted; next, random forest is used for feature selection; then the sample data are randomly grouped. Finally, six prediction models are constructed: support vector machine (SVM), random forest, k-nearest neighbor (KNN), logistic regression, decision tree, and artificial neural network (ANN). The results show that all six machine learning models can be used to predict automobile credit default. Among them, the decision tree has the highest accuracy (0.79), but SVM has the best overall performance. Random grouping also improves the efficiency of model operation to some extent, especially for SVM.

Stress Test on a Shipping Company's Financial Stability (스트레스 테스트를 활용한 해운기업 안정성 연구)

  • Park, Sunghwa;Kwon, Janghan
    • Journal of Korea Port Economic Association / v.39 no.2 / pp.97-110 / 2023
  • This study examines the effect of macroeconomic shocks on the financial stability of the Korean shipping industry. Using a Firth logistic regression model, it estimates the default probability of a shipping company. The results of the default prediction model suggest that total assets are negatively correlated with default probability, while total debt is positively correlated with it. Based on these results, the study investigates the effect of macroeconomic shocks, namely total assets, sales, and total debt shocks, on a shipping company's default probability. The stress test results indicate that a decrease in sales or total assets significantly deteriorates the financial stability of a shipping company.
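
The stress-test idea in this abstract can be illustrated with a small sketch. Firth's method changes how the coefficients are estimated (it penalizes the likelihood), but the fitted model is still an ordinary logistic function, so a shock can be applied by changing an input and recomputing the probability. Everything numeric below is hypothetical: the coefficient values, the log transformation of the inputs, and the example firm are assumptions chosen only so that the signs agree with the abstract.

```python
import math

# Hypothetical coefficients: only their signs follow the abstract
# (assets and sales lower default risk, debt raises it). The values are made up.
COEFFICIENTS = {"intercept": -1.0, "log_assets": -0.8, "log_sales": -0.5, "log_debt": 0.9}

def default_probability(firm, coef=COEFFICIENTS):
    """Logistic default probability from log-transformed financials (assumed form)."""
    z = (coef["intercept"]
         + coef["log_assets"] * math.log(firm["assets"])
         + coef["log_sales"] * math.log(firm["sales"])
         + coef["log_debt"] * math.log(firm["debt"]))
    return 1.0 / (1.0 + math.exp(-z))

def apply_shock(firm, variable, pct_change):
    """Return a copy of the firm with one variable scaled by (1 + pct_change)."""
    shocked = dict(firm)
    shocked[variable] = firm[variable] * (1.0 + pct_change)
    return shocked

firm = {"assets": 500.0, "sales": 200.0, "debt": 300.0}  # made-up balance sheet
baseline = default_probability(firm)
scenarios = {
    "assets -30%": apply_shock(firm, "assets", -0.30),
    "sales -30%": apply_shock(firm, "sales", -0.30),
    "debt +30%": apply_shock(firm, "debt", 0.30),
}
for name, shocked in scenarios.items():
    print(f"{name}: {default_probability(shocked):.4f} (baseline {baseline:.4f})")
```

With these assumed signs, each scenario raises the estimated default probability relative to the baseline; the sizes of the changes carry no meaning because the coefficients are invented.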

TeGCN: Transformer-embedded Graph Neural Network for Thin-filer default prediction (TeGCN: 씬파일러 신용평가를 위한 트랜스포머 임베딩 기반 그래프 신경망 구조 개발)

  • Seongsu Kim;Junho Bae;Juhyeon Lee;Heejoo Jung;Hee-Woong Kim
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.419-437 / 2023
  • As the number of thin filers in Korea surpasses 12 million, there is growing interest in assessing their credit default risk more accurately to generate additional revenue. Specifically, researchers are actively developing default prediction models using machine learning and deep learning algorithms, in contrast to traditional statistical default prediction methods, which struggle to capture nonlinearity. Among these efforts, the Graph Neural Network (GNN) architecture is noteworthy for predicting default when data on thin filers are limited, because it can incorporate network information between borrowers alongside conventional credit-related data. However, prior research employing graph neural networks has been limited in handling the diverse categorical variables present in credit information. In this study, we introduce the Transformer embedded Graph Convolutional Network (TeGCN), which aims to address these limitations and enable effective default prediction for thin filers. TeGCN combines the TabTransformer, capable of extracting contextual information from categorical variables, with the Graph Convolutional Network, which captures network information between borrowers. Our TeGCN model surpasses the baseline model's performance on both the general borrower dataset and the thin filer dataset. In particular, it achieves outstanding results in thin filer default prediction. This study achieves high default prediction accuracy through a model structure tailored to the characteristics of credit information containing numerous categorical variables, especially for thin filers with limited data. Our study can contribute to resolving the financial exclusion faced by thin filers and facilitate additional revenue within the financial industry.

Performance Evaluation and Forecasting Model for Retail Institutions (유통업체의 부실예측모형 개선에 관한 연구)

  • Kim, Jung-Uk
    • Journal of Distribution Science / v.12 no.11 / pp.77-83 / 2014
  • Purpose - The National Agricultural Cooperative Federation of Korea and the National Fisheries Cooperative Federation of Korea operate both financial and retail businesses. As cooperatives are public institutions and receive government support, the Financial Supervisory Service in Korea requires them to be soundly managed. This is mainly assessed with CAEL, which is derived from CAMEL. However, because the NFFC's finance and retail businesses are evaluated together as a single unit, the CAEL model's classification is insufficient for evaluating the retail business. First, CAEL has a problem of discrimination power: although a retail-sector union can receive a high rating under the CAEL model, defaults have often been reported. Therefore, a default prediction model is needed to support the CAEL model. A default prediction model built on subdivided indexes and statistical methods can serve a preventive function by estimating the retail sector's default probability. Second, the finance and retail business sectors need to be evaluated separately because they have different characteristics. Based on various management indexes that have been systematically managed by the National Fisheries Cooperative Federation of Korea, our model predicts retail default and outperforms the CAEL model in failure prediction because it uses various discriminative financial ratios that reflect the situation of the retail industry. Research design, data, and methodology - The retail default prediction model is built using logistic analysis on the retail financial statements of the NFCF. We consider 93 unions in each year from 2006 to 2012 to select reliable management indexes. We also apply statistical power analyses: a t-test, logit analysis, AR (accuracy ratio), and AUROC (area under the receiver operating characteristic curve) analysis. Finally, the multivariate logistic model shows excellent discrimination power and a high hit ratio for default prediction. We also evaluate its usefulness. Results - Statistical power analysis using the AR (AUROC) method on the short-term model shows that the logistic model has excellent discrimination power, at 84.6%. Its hit ratio for failure prediction in the total model is also high, at 94%, indicating that it is temporally stable and useful for evaluating the management status of retail institutions. Conclusions - This model is useful for evaluating the management status of retail union institutions. First, the CAEL evaluation needs to be subdivided; the existing CAEL evaluation is underdeveloped and its discrimination power is low. Second, continued efforts are required to develop varied and rational management indexes, including an index that reflects the characteristics of the retail industry. Extending this study will require two things. First, a complementary default model that reflects size differences. Second, non-financial information for small and medium retail unions. The result would be a hybrid default model reflecting both financial and non-financial information.
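
The two discrimination measures named in this abstract, AUROC and AR, are linked by the standard identity AR = 2 × AUROC − 1. The sketch below computes both from model scores by pairwise comparison. It is an illustration under assumptions, not the study's procedure: the function names and the four toy scores are made up, and label 1 is assumed to mean "default".

```python
def auroc(scores, labels):
    """Probability that a random defaulter scores higher than a random non-defaulter.

    Ties count as half a win. Label 1 is assumed to mean default.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one defaulted and one non-defaulted case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def accuracy_ratio(scores, labels):
    """Accuracy ratio (Gini coefficient) derived from AUROC."""
    return 2.0 * auroc(scores, labels) - 1.0

# Made-up scores for four unions: two defaulted (1), two did not (0).
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
print(auroc(toy_scores, toy_labels), accuracy_ratio(toy_scores, toy_labels))
```

The pairwise loop is quadratic in the sample size, which is acceptable for a panel of the size described here; a rank-based formula would be the usual choice for larger data.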

Semi-Supervised Learning to Predict Default Risk for P2P Lending (준지도학습 기반의 P2P 대출 부도 위험 예측에 대한 연구)

  • Kim, Hyun-jung
    • Journal of Digital Convergence / v.20 no.4 / pp.185-192 / 2022
  • This study investigates the effect of the semi-supervised learning (SSL) method on predicting the default risk of peer-to-peer (P2P) loans. Despite its proven performance, the supervised learning (SL) method requires labeled data, which can take considerable effort and resources to collect. With the rapid growth of P2P platforms, the number of loans issued annually that have no clear final resolution keeps increasing, leading to an abundance of unlabeled data. An SSL model is therefore needed to predict default risk using information not only from labeled loans (fully paid or defaulted) but also from unlabeled loans. The P2P loan data used in this study were collected from the LendingClub platform. The results showed that, despite using a small number of labeled data, the SSL method achieved much better default risk prediction performance than the SL method trained on a much larger set of labeled data.
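
One common way to use unlabeled loans alongside labeled ones is self-training (pseudo-labeling), sketched below. The abstract does not say which SSL algorithm the study used, so this is an assumed stand-in, and the nearest-centroid base classifier, the `min_margin` confidence rule, and the two-feature toy loans are all hypothetical. Labels are assumed to be 0 for fully paid and 1 for defaulted, and the sketch assumes exactly two classes.

```python
import math

def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def fit_centroids(X, y):
    """Base classifier: one centroid per class label."""
    return {label: centroid([x for x, t in zip(X, y) if t == label]) for label in set(y)}

def predict_with_margin(centroids, x):
    """Return (label, margin): nearest-centroid label and the distance gap to the other class."""
    dists = sorted((math.dist(x, c), label) for label, c in centroids.items())
    return dists[0][1], dists[1][0] - dists[0][0]

def self_train(X_lab, y_lab, X_unlab, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled loans and refit on the enlarged set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        centroids = fit_centroids(X_lab, y_lab)
        scored = [(x, *predict_with_margin(centroids, x)) for x in pool]
        confident = [(x, label) for x, label, margin in scored if margin >= min_margin]
        if not confident:
            break  # nothing left that the current model is sure about
        for x, label in confident:
            X_lab.append(x)
            y_lab.append(label)
        pool = [x for x, _, margin in scored if margin < min_margin]
    return fit_centroids(X_lab, y_lab)

# Made-up two-feature loans: two labeled, four without a final resolution.
X_labeled = [[0.0, 0.0], [4.0, 4.0]]
y_labeled = [0, 1]
X_unlabeled = [[0.5, 0.2], [3.8, 4.1], [1.0, 0.8], [3.0, 3.5]]
model = self_train(X_labeled, y_labeled, X_unlabeled)
print(predict_with_margin(model, [0.4, 0.4])[0], predict_with_margin(model, [3.5, 3.5])[0])
```

The margin threshold controls the trade-off: a low value absorbs more unlabeled loans but risks reinforcing early mistakes, which is the usual caveat with self-training.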

The Credit Information Feature Selection Method in Default Rate Prediction Model for Individual Businesses (개인사업자 부도율 예측 모델에서 신용정보 특성 선택 방법)

  • Hong, Dongsuk;Baek, Hanjong;Shin, Hyunjoon
    • Journal of the Korea Society for Simulation / v.30 no.1 / pp.75-85 / 2021
  • In this paper, we present a deep neural network-based prediction model that processes and analyzes the corporate and personal credit information of individual business owners, as a new method for predicting the default rate of individual businesses more accurately. In modeling research across various fields, feature selection techniques have been actively studied as a way to improve performance, especially in predictive models with many features. We first statistically verify the macroeconomic indicators (macro variables) and credit information (micro variables) used as inputs to the default rate prediction model, and then apply the credit information feature selection method to identify the final feature set that improves prediction performance. The proposed method is an iterative hybrid method that combines filter-based and wrapper-based approaches: it builds submodels, constructs subsets by extracting the important variables of the best-performing submodels, and determines the final feature set by analyzing the prediction performance of the subsets and of their combined set.

Financial Distress Prediction Using Adaboost and Bagging in Pakistan Stock Exchange

  • TUNIO, Fayaz Hussain;DING, Yi;AGHA, Amad Nabi;AGHA, Kinza;PANHWAR, Hafeez Ur Rehman Zubair
    • The Journal of Asian Finance, Economics and Business / v.8 no.1 / pp.665-673 / 2021
  • Default has become a serious concern worldwide because of the financial crisis. Early prediction of corporate bankruptcy supports decision-making by financial and regulatory bodies. Despite numerous advanced approaches, this area of study is not exhausted and requires additional research. The purpose of this research is to find the best classifier for detecting a company's default risk and bankruptcy. This study uses secondary time-series data from the Pakistan Stock Exchange (PSX) to examine the impact of the determinants. It examines several classifiers according to their ability to correctly categorize default and non-default Pakistani companies listed on the PSX. Additionally, the PSX has remained consistent for some years in terms of growth and has provided benefits to its stockholders. This paper uses machine learning techniques to predict financial distress in companies listed on the PSX. Our results indicate that most multi-stage mixtures of classifiers provided notable improvements over the individual classifiers. This means that firms will have to work on financial variables such as liquidity and profitability to avoid falling into liquidation. Moreover, Adaptive Boosting (Adaboost) provides a significant boost in the performance of each classifier.
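
To show how boosting can lift a weak classifier, here is a minimal sketch of textbook discrete AdaBoost over one-feature decision stumps. It is not the study's pipeline: the abstract does not describe its base classifiers or data in enough detail to reproduce, so the stump learner, the function names, and the one-feature toy data are assumptions. Labels are assumed to be +1 for default and -1 for non-default.

```python
import math

def stump_predict(stump, x):
    """A stump votes `polarity` when the chosen feature exceeds the threshold."""
    feature, threshold, polarity = stump
    return polarity if x[feature] > threshold else -polarity

def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def adaboost(X, y, rounds=5):
    """Train a weighted ensemble of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the weighted vote."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score > 0 else -1

# Made-up data where no single threshold separates the classes.
X_toy = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y_toy = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X_toy, y_toy, rounds=3)
print([predict(ensemble, x) for x in X_toy])
```

On this toy set a single stump always misclassifies at least two points, while the three-round ensemble fits all six, which is the kind of gain over an individual classifier that the abstract attributes to AdaBoost.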

Study on Default Prediction Model of Policy Fund (정책자금지원 부실예측 모형 연구)

  • Lim, Sangseop
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.713-714 / 2021
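  • Small business owners form an economic foundation that plays an important role in the Korean economy, but they are relatively small in scale and their business conditions are unstable. Government policy funding is needed, but because funds are limited, capital must be allocated efficiently. This paper therefore uses a random forest model to develop a default prediction model for policy fund loans to small business owners, aiming to identify and prevent signs of default in advance and thereby reduce social costs and contribute to the efficient allocation of resources.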

Artificial Intelligence Techniques for Predicting Online Peer-to-Peer (P2P) Loan Default (인공지능기법을 이용한 온라인 P2P 대출거래의 채무불이행 예측에 관한 실증연구)

  • Bae, Jae Kwon;Lee, Seung Yeon;Seo, Hee Jin
    • The Journal of Society for e-Business Studies / v.23 no.3 / pp.207-224 / 2018
  • In this article, an empirical study was conducted using a public dataset from Lending Club Corporation, the largest online peer-to-peer (P2P) lending platform in the world. We explore significant predictor variables related to P2P lending default and find that housing situation, length of employment, average current balance, debt-to-income ratio, loan amount, loan purpose, interest rate, public records, number of finance trades, total credit/credit limit, number of delinquent accounts, number of mortgage accounts, and number of bank card accounts are significant factors for loans funded successfully on the Lending Club platform. We developed online P2P lending default prediction models using discriminant analysis, logistic regression, neural networks, and decision trees (i.e., CART and C5.0). To verify the feasibility and effectiveness of the models, borrower loan data and credit data were used. Empirical results indicated that neural networks consistently outperform the other classifiers (discriminant analysis, logistic regression, CART, and C5.0) in P2P loan default prediction.