DocumentCode :
174269
Title :
How to protect investors? A GA-based DWD approach for financial statement fraud detection
Author :
Xinyang Li ; Wei Xu ; Xuesong Tian
Author_Institution :
Sch. of Inf., Renmin Univ. of China, Beijing, China
fYear :
2014
fDate :
5-8 Oct. 2014
Firstpage :
3548
Lastpage :
3554
Abstract :
As one type of financial fraud, financial statement fraud has not only caused huge losses for individual investors and financial institutions, but has also undermined the overall stability of the whole industry. This paper used financial and textual features extracted from annually submitted 10-K filings and combined data mining and text mining techniques to detect financial statement fraud. When the dimension of the samples is larger than the sample size, a setting known as high dimension low sample size (HDLSS), the distance weighted discrimination (DWD) model, which generalizes well in HDLSS contexts, is used to detect financial statement fraud. We also adopted a genetic algorithm for feature selection and parameter optimization to improve the performance of the classifiers, including DWD, Support Vector Machine, Back Propagation Neural Networks and Decision Tree. Compared with other GA-based classification models, the proposed GA-based DWD model achieved relatively high classification accuracy with fewer input features, which suggests that this model is a promising tool for detecting fraudulent financial statements.
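The general idea of GA-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, a simple nearest-centroid classifier stands in for DWD, and every name here (`make_data`, `centroid_accuracy`, `fitness`, `ga_select`, the penalty weight, the GA settings) is a hypothetical choice made only for illustration. Each chromosome is a bit mask over features, and the fitness rewards leave-one-out accuracy while penalizing the number of selected features.

```python
import random

def make_data(n=30, d=10, informative=(0, 1, 2), seed=0):
    """Synthetic two-class data; only the `informative` columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for j in informative:
            row[j] += 3.0 * label  # shift class 1 on informative features
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier (a stand-in
    for DWD) using only the features whose mask bit is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dist = {}
        for label in (0, 1):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            cent = [sum(r[j] for r in rows) / len(rows) for j in cols]
            dist[label] = sum((X[i][j] - c) ** 2 for j, c in zip(cols, cent))
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small cost per selected feature (assumed weighting)."""
    return centroid_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=20, generations=10, mut_rate=0.1, seed=1):
    """Evolve feature masks with elitism, one-point crossover and bit-flip mutation."""
    rng = random.Random(seed)
    d = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(d)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        elite = ranked[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, d)
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mut_rate else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(X, y, m))

if __name__ == "__main__":
    X, y = make_data()
    best = ga_select(X, y)
    print("selected features:", [j for j, bit in enumerate(best) if bit])
    print("leave-one-out accuracy:", centroid_accuracy(X, y, best))
```

The per-feature penalty is one simple way to express the trade-off the abstract describes between classification accuracy and the number of input features; the paper's actual fitness function and parameter encoding are not given in this record.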
Keywords :
backpropagation; data mining; decision trees; feature extraction; feature selection; financial data processing; fraud; genetic algorithms; investment; neural nets; pattern classification; support vector machines; text analysis; DWD model; GA-based DWD approach; GA-based DWD model; GA-based classification models; HDLSS model; backpropagation neural networks; data mining techniques; decision tree; distance weighted discrimination model; feature selection; financial feature extraction; financial institutions; financial statement fraud detection; fraudulent financial statements; genetic algorithm; high dimension low sample size; investor protection; parameter optimization; support vector machine; text mining techniques; textual feature extraction; Accuracy; Companies; Feature extraction; Genetic algorithms; Optimization; Support vector machines; Testing; DWD model; Feature selection; Genetic algorithm; Text mining; financial statement fraud;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man and Cybernetics (SMC), 2014 IEEE International Conference on
Conference_Location :
San Diego, CA
Type :
conf
DOI :
10.1109/SMC.2014.6974480
Filename :
6974480