• DocumentCode
    2210857
  • Title
    Frequent Instruction Sequential Pattern Mining in Hardware Sample Data
  • Author
    Zou, Jia ; Xiao, Jing ; Hou, Rui ; Wang, Yanqi
  • Author_Institution
    Res. Lab., IBM China, Beijing, China
  • fYear
    2010
  • fDate
    13-17 Dec. 2010
  • Firstpage
    1205
  • Lastpage
    1210
  • Abstract
    As parallelism and heterogeneity have become the trend in computer system design, both the size and the complexity of the hardware sample data generated by the Performance Monitoring Unit (PMU) are increasing rapidly. Automatic analysis methods, i.e. data mining methods, are therefore urgently needed to accelerate hardware sample data analysis. We are the first to study instruction sequential pattern mining for hardware sample data. It is a challenging task because of the implicit sequential relationships contained in the data and the importance of low-frequency patterns. As a solution, we i) provide a novel algorithm, ProfSpan, and ii) adapt two existing algorithms, which are based on candidate generation and projected database generation, to hardware sample data. Our evaluation results show that ProfSpan can reduce execution time by up to 75% and 80% compared with the other two algorithms. In particular, up to 50% of the frequent patterns mined by ProfSpan in hardware sample data cross basic block boundaries and cannot be found by existing methods for source code or disassembly code. We also analyze three example patterns identified by ProfSpan (consecutive loads, JIT entry sequence, and conditional code dependency sequence) to illustrate how ProfSpan can benefit performance analysis. Finally, we apply the patterns to module classification and obtain promising results.
  • Keywords
    data analysis; data mining; database management systems; parallel processing; source coding; JIT entry sequence; ProfSpan; automatic analysis methods; candidate generation; computer system design; conditional code dependency sequence; database generation; disassembly code; hardware sample data analysis; instruction sequential pattern mining; pattern classification; performance monitoring unit; source code; hardware sample data; performance analysis; sequential pattern mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2010 IEEE 10th International Conference on
  • Conference_Location
    Sydney, NSW
  • ISSN
    1550-4786
  • Print_ISBN
    978-1-4244-9131-5
  • Electronic_ISBN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2010.123
  • Filename
    5694109