Title :
Neighborhood Triangular Synthetic Minority Over-sampling Technique for Imbalanced Prediction on Small Samples of Chinese Tourism and Hospitality Firms
Author :
Yu-Hui Xu ; Hui Li ; Lu-Ping Le ; Xiao-Yun Tian
Author_Institution :
Sch. of Econ. & Manage., Zhejiang Normal Univ., Jinhua, China
Abstract :
To address the poor results of imbalanced risk prediction on minority-class samples, we propose a modified up-sampling approach, the neighborhood triangular synthetic minority over-sampling technique (NT-SMOTE). The approach adds a nearest-neighbor idea and a triangular-area sampling idea to SMOTE, and it handles minority-class samples better by turning imbalanced problems into balanced ones. As a result, the performance of single classifiers in predicting risk on small, imbalanced datasets improved. Data on listed companies in the Chinese tourism and hospitality industry were processed following data mining (data excavation) principles: samples with missing data and financial indicators with missing values were removed, and significant financial indicators were selected with a significance test. NT-SMOTE was then used to over-sample the minority class. Finally, we applied several single classifiers that are popular in financial risk prediction, namely MDA, DT, LSVM, Logit, and Probit. Combined with NT-SMOTE, these single classifiers can reasonably and effectively handle firm risk prediction on small, imbalanced samples.
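The abstract names the two ingredients of NT-SMOTE (nearest neighbors and triangular-area sampling) but does not give the algorithm. The following is only a minimal sketch of one plausible reading, not the authors' implementation: each synthetic point is drawn uniformly from the triangle formed by a minority sample and two of its k nearest minority neighbors. The function name nt_smote_sketch, the parameters k and seed, the Euclidean distance, and the uniform (Dirichlet) weighting are all assumptions made for illustration.

```python
import numpy as np


def nt_smote_sketch(X_min, n_new, k=5, seed=0):
    """Illustrative triangle-based over-sampling (assumed reading of NT-SMOTE).

    X_min : array of shape (n_samples, n_features), minority-class samples only.
    n_new : number of synthetic samples to generate.
    k     : neighborhood size (hypothetical parameter, not from the source).
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 3:
        raise ValueError("need at least 3 minority samples to form a triangle")
    # A triangle needs two distinct neighbors, so clamp k to [2, n - 1].
    k = max(2, min(k, n - 1))

    # Pairwise Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]

    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        a = rng.integers(n)  # seed sample
        b, c = rng.choice(neighbors[a], size=2, replace=False)
        # Dirichlet(1, 1, 1) weights are uniform over the triangle (a, b, c).
        w = rng.dirichlet(np.ones(3))
        synthetic[i] = w[0] * X_min[a] + w[1] * X_min[b] + w[2] * X_min[c]
    return synthetic
```

In a prediction pipeline, a function like this would be applied only to the minority class of the training data, with the synthetic rows appended before fitting a classifier such as Logit or a linear SVM, so that evaluation data stay free of synthetic samples.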
Keywords :
financial management; pattern classification; risk management; sampling methods; travel industry; Chinese tourism; DT; LSVM; MDA; NT-SMOTE; data excavation principles; financial data filtering; financial risk prediction; firm risk prediction; hospitality firms; hospitality industry; imbalanced risk prediction; logit; minority class samples; missing financial indicators; nearest neighbor idea; neighborhood triangular synthetic minority over-sampling technique; probit; single classifiers; triangular area sampling idea; up-sampling approach; Joints; Optimization; imbalanced datasets; neighborhood triangular; random sampling
Conference_Title :
Computational Sciences and Optimization (CSO), 2014 Seventh International Joint Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-5371-4
DOI :
10.1109/CSO.2014.104