• DocumentCode
    151466
  • Title

    A real time clustering method using document index graph

  • Author

    Akthar, Nadeem ; Ahamad, Mohd Vasim ; Khan, Azeem Ush Shan

  • Author_Institution
    Dept. of Comput. Eng., Aligarh Muslim Univ., Aligarh, India
  • fYear
    2014
  • fDate
    5-6 Sept. 2014
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    According to a previous survey, 45% of users did not find what they were actually looking for on the web using any search engine. Suppose you have a million text files on your server or computer; there is then a need to categorize them efficiently on the basis of their content. As a result, IR (Information Retrieval) tools have been developed, which provide more effective ways for users to categorize relevant data. Most clustering algorithms, like the Vector Space Model, consider only single words and are not incremental, so they can't be applied online. Another algorithm, STC, uses the 'trie' concept to identify shared phrases and is suitable for online use, but its main problem is that it doesn't work for large data sets. In this paper, we introduce the DIGE clustering algorithm, which generates clusters based on common phrases as well as single terms. The DIGE clustering algorithm is based on the DIG model for the representation of documents. The construction of the DIG model is incremental, so DIGE is also capable of producing clusters from online documents; it also doesn't occupy much memory, so it is applicable offline as well.
  • Keywords
    Internet; document handling; graph theory; information retrieval; pattern clustering; DIG model; DIGE clustering algorithm; IR tool; document index graph; document representation; information retrieval tool; online document; real time clustering method; search engine; Algorithm design and analysis; Clustering algorithms; Indexes; Merging; Rivers; Search engines; Vectors; Clustering; Document Index Graph; Incremental Algorithm; Phrase Cluster; Suffix Tree Clustering; Web-Snippets;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining and Intelligent Computing (ICDMIC), 2014 International Conference on
  • Conference_Location
    New Delhi
  • Print_ISBN
    978-1-4799-4675-4
  • Type

    conf

  • DOI
    10.1109/ICDMIC.2014.6954222
  • Filename
    6954222
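
The abstract describes a document index graph that is built incrementally and used to find phrases shared between documents. The following is a minimal sketch of that general idea, not the paper's DIGE algorithm: it assumes words are graph nodes, consecutive word pairs are edges, and each edge records where it occurred. The class name `DocumentIndexGraph`, the method `add_document`, and the sample sentences are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class DocumentIndexGraph:
    """Toy word-adjacency graph (illustrative assumption, not the paper's code)."""

    def __init__(self):
        # edge (word, next_word) -> {doc_id: [(sentence_no, position), ...]}
        self.edges = defaultdict(lambda: defaultdict(list))

    def add_document(self, doc_id, sentences):
        """Insert one document; return {earlier_doc_id: set of shared phrases}."""
        shared = defaultdict(set)
        for s_no, sentence in enumerate(sentences):
            words = sentence.lower().split()
            # Matches still being extended, keyed by the occurrence of the
            # last matched edge in an earlier document.
            open_matches = {}
            for pos in range(len(words) - 1):
                edge = (words[pos], words[pos + 1])
                next_open, extended = {}, set()
                for other, hits in list(self.edges[edge].items()):
                    if other == doc_id:
                        continue  # ignore matches inside the same document
                    for o_s, o_pos in hits:
                        prev_key = (other, o_s, o_pos - 1)
                        if prev_key in open_matches:
                            # The earlier document continues the same phrase.
                            phrase = open_matches[prev_key] + (words[pos + 1],)
                            extended.add(prev_key)
                        else:
                            phrase = edge
                        next_open[(other, o_s, o_pos)] = phrase
                # Matches that could not be extended are maximal: report them.
                for key, phrase in open_matches.items():
                    if key not in extended:
                        shared[key[0]].add(" ".join(phrase))
                open_matches = next_open
                # Record this occurrence so later documents can match it.
                self.edges[edge][doc_id].append((s_no, pos))
            for key, phrase in open_matches.items():
                shared[key[0]].add(" ".join(phrase))
        return dict(shared)


dig = DocumentIndexGraph()
first = dig.add_document("d1", ["river rafting is fun", "wild river adventures"])
second = dig.add_document("d2", ["mild river rafting", "river rafting is popular"])
print(first)
print(second)
```

Because each document is matched against the graph as it is inserted, no earlier document has to be reprocessed, which is the property the abstract calls incremental. A real clustering step would then score document similarity from the shared phrases and single terms; that part is not sketched here.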