Title :
Browser with Clustering of Web Documents
Author :
Tetali, Ravitheja ; Bose, Joy ; Arif, Tasleem
Author_Institution :
WMG Group, Samsung Res. Inst. India Bangalore, Bangalore, India
Abstract :
Accessing relevant information quickly, given limited time and space, is a major issue in Web browsers, especially those on mobile devices. In this paper we propose a framework for grouping similar Web documents in a browser based on the content similarity of the browsed pages. This grouping can help reduce clutter and enable the user to access relevant Web information quickly. The algorithm we used for clustering is MajorClust, a document similarity algorithm based on tokenizing the words in the document and then determining a cosine similarity measure to estimate the distance between the words. The entire clustering algorithm is implemented inside the browser, without the need for an external Web server. We have implemented and tested the algorithm on a mobile browser and obtained accurate, finer-grained clustering of Web pages when compared to Alexa's sub-categories.
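The pipeline named in the abstract (tokenize, compare by cosine similarity, cluster with MajorClust) can be illustrated with a minimal Python sketch. This is not the authors' in-browser implementation: it assumes that cosine similarity is computed between raw term-frequency vectors of whole pages, and the function names, the `threshold` parameter, the tie-breaking rule, and the sample pages are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def major_clust(docs, threshold=0.1, max_iter=100):
    """Return one cluster label per document (MajorClust-style sketch).

    Assumption: edges below `threshold` are dropped; the paper's actual
    weighting and cut-off are not given in the abstract.
    """
    vectors = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    # Build a weighted similarity graph over the documents.
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine_similarity(vectors[i], vectors[j])
            if s >= threshold:
                sim[i][j] = sim[j][i] = s
    # Every document starts in its own cluster.
    labels = list(range(n))
    for _ in range(max_iter):
        changed = False
        for i in range(n):
            # Sum the edge weights towards each neighbouring cluster.
            weight = defaultdict(float)
            for j in range(n):
                if j != i and sim[i][j] > 0:
                    weight[labels[j]] += sim[i][j]
            if not weight:
                continue  # isolated document keeps its own label
            best = max(weight.values())
            # Keep the current label if it is already (one of) the strongest.
            if weight.get(labels[i], 0.0) >= best:
                continue
            # Otherwise adopt the strongest cluster; break ties by lowest label.
            labels[i] = min(l for l, w in weight.items() if w == best)
            changed = True
        if not changed:
            break
    return labels


# Hypothetical page texts, standing in for browsed Web documents.
pages = [
    "football match score goal league",
    "football league goal team score",
    "python code programming language compiler",
    "programming language python code tutorial",
]
labels = major_clust(pages)
print(labels)  # pages with equal labels belong to the same cluster
```

Because each document adopts the label of the cluster it is most strongly attracted to, the number of clusters does not have to be fixed in advance, which suits a browser history whose topics are not known beforehand.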
Keywords :
Internet; Web sites; mobile computing; online front-ends; pattern clustering; text analysis; MajorClust; Web browsers; Web documents clustering; Web information; Web pages clustering; cosine similarity measure; document similarity algorithm; document words tokenizing; mobile browser; Browsers; Clustering algorithms; Engines; History; Mobile communication; Mobile handsets; Web pages; MajorClust; Web browser; Web history; document clustering; intelligent browsing;
Conference_Title :
Advanced Computing, Networking and Security (ADCONS), 2013 2nd International Conference on
Conference_Location :
Mangalore
DOI :
10.1109/ADCONS.2013.20