• DocumentCode
    688264
  • Title
    Using Traditional Data Analysis Algorithms to Detect Access Patterns for Massive Data Processing
  • Author
    Jiaqi Zhao; Jie Tao; Lizhe Wang; Rajiv Ranjan; Joanna Kolodziej
  • Author_Institution
    Sch. of Basic Sci., Changchun Univ. of Technol., Changchun, China
  • fYear
    2013
  • fDate
    13-15 Nov. 2013
  • Firstpage
    1097
  • Lastpage
    1104
  • Abstract
    The data sets produced in our daily life are getting larger and larger. How to manage and analyze such big data is currently a grand challenge for scientists in various research fields. MapReduce is regarded as an appropriate programming model for processing such big data. However, users or developers still need to efficiently program appropriate data processing actions related to their analytics requirements. In other words, analytics actions in MapReduce are not portable across different big data types. In this paper, we propose to adopt traditional data clustering algorithms to automatically analyze large data sets. We applied this approach to process performance data on distributed shared memory machines for detecting the application access patterns. The advantage is that application developers need not write code to understand the runtime access behavior of their applications. We optimized several benchmark applications based on the analysis results, and the experiments show a considerable improvement in terms of execution time and speedup.
  • Keywords
    Big Data; data analysis; pattern clustering; shared memory systems; Big Data analysis; Big Data management; MapReduce programming model; application access pattern detection; data clustering algorithms; distributed shared memory machines; large data set analysis; massive data processing; runtime access behavior; traditional data analysis algorithms; Clustering algorithms; Decision trees; Distributed databases; Monitoring; Optimization; Runtime; Code Optimization; Data Analysis; Data Locality; Distributed Shared Memory; Memory Performance;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), 2013 IEEE 10th International Conference on
  • Conference_Location
    Zhangjiajie
  • Type
    conf
  • DOI
    10.1109/HPCC.and.EUC.2013.155
  • Filename
    6832037