Title :
Analysis for classification of similar documents among various websites using rapid miner
Author :
Kaur, Prabhdeep ; Khurm, Sawtantar Singh ; Josan, Gurpreet Singh
Author_Institution :
Dept. of CSE, ACE, Ambala, India
Abstract :
The Web was originally intended to improve the management of general information about accelerators and experiments. It is now also considered the most valuable place for information retrieval and knowledge discovery. In response to a user's query, a search engine returns a large and unmanageable collection of documents. Several web mining tools are used to classify, analyse, and order these documents so that users can easily navigate the search results and find the desired documents. A more efficient way to organize the documents is to combine similarity and ranking: similarity groups the documents by content or distance, and ranking orders the pages within each cluster or set. Based on this approach, this paper presents an analysis that produces ordered results in the form of similar documents across several sets of websites that are of interest to the user, using an open source web mining tool called RapidMiner. This approach helps users restrict their search to a smaller number of pages of interest instead of navigating a huge collection of documents.
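The combination of similarity and ranking described in the abstract can be illustrated with a minimal sketch. This is not the authors' RapidMiner process: it is a plain Python stand-in that assumes bag-of-words cosine similarity for the grouping step, a greedy single-pass grouping with an arbitrary threshold, and a precomputed per-page rank score. The function names, the `(url, text, rank_score)` tuple layout, the threshold value, and the sample pages are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for a document."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_and_rank(pages, threshold=0.5):
    """Group pages by content similarity, then order each group by rank score.

    `pages` is a list of (url, text, rank_score) tuples. The tuple layout and
    the threshold are illustrative assumptions, not taken from the paper.
    """
    groups = []  # each group: {"seed": vector, "members": [(url, score), ...]}
    for url, text, score in pages:
        vec = term_vector(text)
        for group in groups:
            # Similarity step: join the first group whose seed is close enough.
            if cosine_similarity(vec, group["seed"]) >= threshold:
                group["members"].append((url, score))
                break
        else:
            groups.append({"seed": vec, "members": [(url, score)]})
    # Ranking step: highest score first within each group.
    return [
        [url for url, _ in sorted(g["members"], key=lambda m: m[1], reverse=True)]
        for g in groups
    ]


# Hypothetical sample data: two pages on web mining, one on accelerators.
pages = [
    ("a.example/1", "web mining tools classify documents", 0.2),
    ("b.example/7", "particle accelerators and experiments", 0.5),
    ("c.example/3", "tools for web mining classify web documents", 0.9),
]
result = group_and_rank(pages)
```

Here the two web mining pages fall into one group and are ordered by their scores, while the unrelated page forms its own group. A real setup would likely replace the greedy grouping with a proper clustering method and derive the scores from a link-based ranking.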
Keywords :
Web sites; data mining; document handling; pattern classification; public domain software; query processing; search engines; Websites; document organization; information retrieval; knowledge discovery; open source Web mining tool; query; ranking; rapid miner; search engine; similar document classification analysis; similarity; Navigation; World Wide Web; Document; Page Rank;
Conference_Title :
Issues and Challenges in Intelligent Computing Techniques (ICICT), 2014 International Conference on
Conference_Location :
Ghaziabad
DOI :
10.1109/ICICICT.2014.6781327