• DocumentCode
    2415662
  • Title

    Automating Data Preprocessing with DMPML and KDDML

  • Author

    Goncalves Jr., Paulo M. ; Barros, Roberto S. M.

  • fYear
    2011
  • fDate
    16-18 May 2011
  • Firstpage
    97
  • Lastpage
    103
  • Abstract
    This paper presents a graphical application for the Data Mining Preparation Markup Language (DMPML), an XML application designed to represent the data preparation phase of the KDD process. DMPML supports the reuse of data preprocessing directives, using XSLT to map raw data into data ready to be used by many data mining algorithms. The application presented here, DMPML-TS, automates the data preparation phase, speeding up the codification and transformation of data and making it easier to apply different data mining algorithms to the same or similar data, based on their codification stored in separate XML documents. This paper also presents improvements made to DMPML, such as the adoption of XRFF for input and output data and the use of a single XSLT file for data transformation. We also present the integration of DMPML-TS with KDDML, an XML language used to represent data, mining models, and queries.
  • Keywords
    Data mining; Data models; Data preprocessing; Databases; Machine learning algorithms; XML; DMPML; Data Preparation; KDDML; XSLT;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Science (ICIS), 2011 IEEE/ACIS 10th International Conference on
  • Conference_Location
    Sanya, China
  • Print_ISBN
    978-1-4577-0141-2
  • Type

    conf

  • DOI
    10.1109/ICIS.2011.23
  • Filename
    6086455