• DocumentCode
    174269
  • Title

    How to protect investors? A GA-based DWD approach for financial statement fraud detection

  • Author

    Xinyang Li ; Wei Xu ; Xuesong Tian

  • Author_Institution
    Sch. of Inf., Renmin Univ. of China, Beijing, China
  • fYear
    2014
  • fDate
    5-8 Oct. 2014
  • Firstpage
    3548
  • Lastpage
    3554
  • Abstract
    As one type of financial fraud, financial statement fraud has not only caused huge losses for individual investors and financial institutions but has also undermined the stability of the whole industry. This paper uses financial and textual features extracted from annually submitted 10-K filings and combines data mining and text mining techniques to detect financial statement fraud. When the dimension of the samples is larger than the sample size, a setting known as high dimension low sample size (HDLSS), the distance weighted discrimination (DWD) model, which generalizes well in HDLSS contexts, is used to detect financial statement fraud. We also adopt a genetic algorithm (GA) for feature selection and parameter optimization to improve the performance of the classifiers, including DWD, Support Vector Machine, Back Propagation Neural Network, and Decision Tree. Compared with other GA-based classification models, the proposed GA-based DWD model achieved relatively high classification accuracy with fewer input features, indicating that this model is a promising tool for detecting fraudulent financial statements.
  • Keywords
    backpropagation; data mining; decision trees; feature extraction; feature selection; financial data processing; fraud; genetic algorithms; investment; neural nets; pattern classification; support vector machines; text analysis; DWD model; GA-based DWD approach; GA-based DWD model; GA-based classification models; HDLSS model; backpropagation neural networks; data mining techniques; decision tree; distance weighted discrimination model; financial feature extraction; financial institutions; financial statement fraud detection; fraudulent financial statements; genetic algorithm; high dimension low sample size; investor protection; parameter optimization; support vector machine; text mining techniques; textual feature extraction; Accuracy; Companies; Optimization; Testing; Text mining; financial statement fraud
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics (SMC), 2014 IEEE International Conference on
  • Conference_Location
    San Diego, CA
  • Type

    conf

  • DOI
    10.1109/SMC.2014.6974480
  • Filename
    6974480