• DocumentCode
    3144873
  • Title

    An integrated data mining system to automate discovery of measures of association

  • Author

    Chua, Cecil ; Chiang, Roger H L ; Lim, Ee-Peng

  • Author_Institution
    Sch. of Accountancy & Bus., Nanyang Technol. Univ., Singapore
  • fYear
    2000
  • fDate
    4-7 Jan. 2000
  • Abstract
    Many data analysts need tools that integrate their database management packages (e.g., Microsoft Access) with their data analysis packages (e.g., SAS, SPSS) and that provide guidance on selecting appropriate mining algorithms. Analysts also need to extract and validate statistical results to facilitate data mining. In this paper, we describe an integrated data mining system, the Linear Correlation Discovery System (LCDS), that meets these requirements. LCDS consists of four major sub-components; this paper discusses two of them, the selection assistant and the statistics coupler. The former examines the schema and instances to determine appropriate association measurement functions (e.g., chi-square, linear regression, ANOVA). The latter invokes the appropriate statistical function on a sample data set and extracts relevant statistical output, such as η² and R², for effective mining of data. We also describe a new validation algorithm based on measuring the consistency of mining results applied to multiple test sets.
  • Keywords
    data analysis; data mining; Linear Correlation Discovery System; integrated data mining; measures of association; validation algorithm; Analysis of variance; Data analysis; Data mining; Databases; Linear regression; Liquid crystal displays; Packaging; Statistics; Synthetic aperture sonar; Time of arrival estimation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    System Sciences, 2000. Proceedings of the 33rd Annual Hawaii International Conference on
  • Print_ISBN
    0-7695-0493-0
  • Type

    conf

  • DOI
    10.1109/HICSS.2000.926650
  • Filename
    926650
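
The abstract says the selection assistant picks an association measure from the schema and instances, and that the statistics coupler extracts outputs such as R². The record gives no implementation details, so what follows is only a minimal sketch of that idea under textbook assumptions: two categorical variables suggest chi-square, two numeric variables suggest linear regression, and a mixed pair suggests ANOVA. The function names `infer_kind`, `select_measure`, and `r_squared` are hypothetical and are not taken from LCDS.

```python
from typing import Sequence


def infer_kind(values: Sequence[object]) -> str:
    """Classify a column as 'numeric' or 'categorical' from its instances.

    Assumption: a column is numeric only if every value is an int or float
    (booleans are treated as categorical).
    """
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "numeric" if is_numeric else "categorical"


def select_measure(x: Sequence[object], y: Sequence[object]) -> str:
    """Suggest an association measure for a pair of columns (hypothetical rule)."""
    kinds = {infer_kind(x), infer_kind(y)}
    if kinds == {"categorical"}:
        return "chi-square"
    if kinds == {"numeric"}:
        return "linear regression"
    return "ANOVA"  # one categorical and one numeric variable


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)
```

For example, `select_measure(["a", "b"], [1, 2])` suggests ANOVA under these rules. A real system would also need to handle missing values, ordinal data, and constant columns, which this sketch ignores.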