• DocumentCode
    1946894
  • Title

    AK-Modes: A weighted clustering algorithm for finding similar case subsets

  • Author

    Ma, Lianhang ; Chen, Yefang ; Huang, Hao

  • Author_Institution
    Coll. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
  • fYear
    2010
  • fDate
    15-16 Nov. 2010
  • Firstpage
    218
  • Lastpage
    223
  • Abstract
    Finding subsets of similar crime cases is an important task for intelligence analysts in crime investigation. It can provide multiple clues for solving crimes and also make catching criminals more efficient. However, the conventional approach of querying specific attributes in relational databases has two defects: first, it is relatively inefficient when many incidents have to be handled; second, the querying process cannot reflect the importance of attributes in different case categories. In this paper, we propose a two-phase clustering algorithm called AK-Modes to automatically find similar case subsets in large datasets. In the attribute-weighting phase, we compute the weight of each attribute related to an offender's behavior trait using the concept of Information Gain Ratio (IGR) from the classification domain. The result of the attribute-weighting phase is then used in the clustering process to find the similar case subsets. Experiments show that AK-Modes is effective and can find significant results.
  • Keywords
    pattern clustering; police data processing; relational databases; AK-Modes; crime investigation; information gain ratio; intelligence analysts; similar case subset finding; two-phase clustering algorithm; weighted clustering algorithm; Accuracy; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Data mining; Databases; Training data; K-Modes clustering; crime data mining; weighted attributes
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems and Knowledge Engineering (ISKE), 2010 International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4244-6791-4
  • Type

    conf

  • DOI
    10.1109/ISKE.2010.5680876
  • Filename
    5680876
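
  • Illustrative_Sketch
    The two phases described in the abstract can be sketched roughly as follows. This is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that each attribute's Information Gain Ratio is normalized into a weight and that clustering uses a K-Modes loop with a weighted mismatch count as the dissimilarity. All function names, the toy attributes (entry method, time of day), and the case categories are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(column, labels):
    """Information gain ratio of one categorical attribute w.r.t. class labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(column)
    if split_info == 0:  # attribute has a single value, so it carries no information
        return 0.0
    return (entropy(labels) - conditional) / split_info


def attribute_weights(rows, labels):
    """Phase 1 (assumed form): gain ratios normalized so the weights sum to 1."""
    ratios = [gain_ratio([r[j] for r in rows], labels) for j in range(len(rows[0]))]
    total = sum(ratios)
    if total == 0:  # fall back to equal weights when no attribute is informative
        return [1 / len(ratios)] * len(ratios)
    return [x / total for x in ratios]


def weighted_mismatch(a, b, weights):
    """Sum of the weights of the attributes on which two cases differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def weighted_k_modes(rows, weights, initial_modes, max_iter=20):
    """Phase 2 (assumed form): K-Modes using the weighted mismatch dissimilarity."""
    modes = [tuple(m) for m in initial_modes]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(modes)), key=lambda k: weighted_mismatch(r, modes[k], weights))
            for r in rows
        ]
        if new_assignment == assignment:  # converged: no case changed cluster
            break
        assignment = new_assignment
        for k in range(len(modes)):
            members = [r for r, a in zip(rows, assignment) if a == k]
            if members:  # new mode = most frequent value of each attribute
                modes[k] = tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
    return assignment, modes


# Toy data: (entry method, time of day) with a hypothetical case category per row.
cases = [("window", "night"), ("window", "day"), ("door", "night"), ("door", "day")]
categories = ["burglary", "burglary", "robbery", "robbery"]

weights = attribute_weights(cases, categories)
clusters, modes = weighted_k_modes(cases, weights, initial_modes=[cases[0], cases[2]])
print(weights)   # [1.0, 0.0]: only the entry method separates the categories
print(clusters)  # [0, 0, 1, 1]
```

    In this toy run the time of day receives zero weight, so cases are grouped by entry method alone. An unweighted K-Modes would treat both attributes equally, which is the limitation the abstract attributes to the conventional approach.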