• DocumentCode
    121773
  • Title
    Analysis for classification of similar documents among various websites using rapid miner
  • Author
    Kaur, Prabhdeep ; Khurm, Sawtantar Singh ; Josan, Gurpreet Singh
  • Author_Institution
    Dept. of CSE, ACE, Ambala, India
  • fYear
    2014
  • fDate
    7-8 Feb. 2014
  • Firstpage
    465
  • Lastpage
    470
  • Abstract
    The Web was originally intended to improve the management of general information about accelerators and experiments. It is now also considered one of the most valuable resources for information retrieval and knowledge discovery. When a user submits a query, a search engine returns a large, unmanageable collection of documents. Several web mining tools are used to classify, analyse and order these documents so that users can navigate the search results easily and find the documents they want. A more efficient way to organize the documents is to combine similarity and ranking: similarity groups the documents by content or distance, and ranking orders the pages within each cluster or set. Based on this approach, this paper presents an analysis that produces ordered results in the form of similar documents across several sets of websites that are of interest to the user, using an open source web mining tool called RapidMiner. This approach lets users restrict their search and navigate a smaller number of pages of interest instead of a huge collection of documents.
  • Keywords
    Web sites; data mining; document handling; pattern classification; public domain software; query processing; search engines; Websites; document organization; information retrieval; knowledge discovery; open source Web mining tool; query; ranking; rapid miner; search engine; similar document classification analysis; similarity; Navigation; World Wide Web; Document; Page Rank;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Issues and Challenges in Intelligent Computing Techniques (ICICT), 2014 International Conference on
  • Conference_Location
    Ghaziabad
  • Type
    conf
  • DOI
    10.1109/ICICICT.2014.6781327
  • Filename
    6781327