• DocumentCode
    658608
  • Title
    DM-Midware: A Middleware to Enable High Performance Data Mining in Heterogeneous Cloud
  • Author
    Guoyu Ou ; Ying Liu ; Xinyu Ma ; Cheng Wang
  • Author_Institution
    Sch. of Comput. & Control, Univ. of Chinese Acad. of Sci., Beijing, China
  • Volume
    3
  • fYear
    2013
  • fDate
    17-20 Nov. 2013
  • Firstpage
    70
  • Lastpage
    73
  • Abstract
    Cloud computing has become a popular high performance computing model in which resources are provided as services over the Web. Users are starting to adopt the cloud model in data mining applications. However, due to the complexity of parallel/cloud computing, it is difficult for average users to express a parallel computing paradigm for their applications in the cloud. To isolate users from the complexity of parallel/cloud programming, a middleware to enable high performance data mining, called DM-Midware, is proposed. It hides the details of MapReduce programming from users by automatically launching mappers through a set of user programming APIs. A directive-based parallelization scheme automatically "translates" a serial program into an SMP or multi-core based parallel program. Heterogeneous computing resources can be invoked to conduct parallel execution through an API-based scheme. A two-step scheduling scheme is proposed to maximize the throughput of the cloud system. We evaluate DM-Midware by executing a representative data mining algorithm in a private cloud. Experimental results demonstrate good scalability and adaptability.
  • Keywords
    Web services; cloud computing; data mining; middleware; parallel programming; scheduling; DM-Midware; MapReduce programming; SMP; Web service; directive-based parallelization scheme; heterogeneous cloud; heterogeneous computing resources; high performance computing model; multicore based parallel program; parallel computing; parallel execution; private cloud; representative data mining algorithm; serial program; two-step scheduling scheme; user programming APIs; Algorithm design and analysis; Graphics processing units; Parallel processing; Programming
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2013 IEEE/WIC/ACM International Joint Conferences on
  • Conference_Location
    Atlanta, GA
  • Print_ISBN
    978-1-4799-2902-3
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2013.152
  • Filename
    6690697