• DocumentCode
    480771
  • Title
    Learning Classifiers from Large Databases Using Statistical Queries
  • Author
    Koul, Neeraj; Caragea, Cornelia; Honavar, Vasant; Bahirwani, Vikas; Caragea, Doina
  • Author_Institution
    Iowa State Univ., Ames, IA
  • Volume
    1
  • fYear
    2008
  • fDate
    9-12 Dec. 2008
  • Firstpage
    923
  • Lastpage
    926
  • Abstract
    We describe an approach to learning predictive models from large databases in settings where direct access to the data is not available because of the massive size of the data, access restrictions, or bandwidth requirements. We outline some techniques for minimizing the number of statistical queries needed and for efficiently coping with missing values in the data. We provide an open source implementation of the decision tree and naive Bayes algorithms to demonstrate the feasibility of the proposed approach.
  • Keywords
    Bayes methods; decision trees; learning (artificial intelligence); pattern classification; query processing; very large databases; access restriction; bandwidth requirement; decision tree; large database; learning classifier predictive model; naive Bayes algorithm; statistical queries minimization; Bandwidth; Costs; Decision trees; Deductive databases; Humans; Intelligent agent; Predictive models; Relational databases; Statistics; Virtual colonoscopy; Decision Trees; INDUS; Machine Learning; Missing Values; Naive Bayes; Sufficient Statistics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-0-7695-3496-1
  • Type
    conf
  • DOI
    10.1109/WIIAT.2008.366
  • Filename
    4740577