• DocumentCode
    945705
  • Title

    Mining Impact-Targeted Activity Patterns in Imbalanced Data

  • Author

    Cao, Longbing ; Zhao, Yanchang ; Zhang, Chengqi

  • Author_Institution
    Dept. of Software Eng., Univ. of Technol., Sydney, NSW
  • Volume
    20
  • Issue
    8
  • fYear
    2008
  • Firstpage
    1053
  • Lastpage
    1066
  • Abstract
    Impact-targeted activities are rare but lead to significant impact on the society, e.g., isolated terrorism activities may lead to a disastrous event threatening national security. Similar issues can also be seen in many other areas. Therefore, it is important to identify such particular activities before they lead to significant impact to the world. However, it is challenging to mine impact-targeted activity patterns due to its imbalanced structure. This paper develops techniques for discovering such activity patterns. First, the complexities of mining imbalanced impact-targeted activities are analyzed. We then discuss strategies for constructing impact-targeted activity sequences. Algorithms are developed to mine frequent positive-impact (P → T) and negative-impact (P → T̄) oriented activity patterns, sequential impact-contrasted activity patterns (P is frequently associated with both pattern P → T and P → T̄ in separated data sets), and sequential impact-reversed activity patterns (both P → T and PQ → T̄ are frequent). Activity impact modelling is also studied to quantify pattern impact on business outcomes. Social security debt-related activity data is used to test the proposed approaches. The outcomes show that they are promising for ISI applications to identify impact-targeted activity patterns in imbalanced data.
  • Keywords
    data mining; national security; security of data; imbalanced data; impact-targeted activity pattern mining; information security; pattern discovery; social security; clustering, classification, and association rules
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2007.190635
  • Filename
    4358938