DocumentCode :
3768203
Title :
Efficient search engine approach for measuring similarity between words: Using page count and snippets
Author :
P. Murugesan;K. Malathi
Author_Institution :
PG Student, Computer Science and Engineering, Indian Institute of Information Technology, Srirangam, Tiruchirappalli
fYear :
2015
Firstpage :
1
Lastpage :
5
Abstract :
Web mining involves activities such as document clustering and community mining performed on the web. Such tasks require measuring the semantic similarity between words, which makes web mining activities easier to perform in many applications. Accurately measuring the semantic similarity between any two words is a difficult task. A new approach to measuring similarity between words is based on text snippets and page counts, both taken from the results of a search engine such as Google. Lexical patterns are extracted from the text snippets, and word co-occurrence measures are defined using the page counts. The results of the two are combined. Moreover, pattern extraction and pattern clustering algorithms are used to find the various relationships between any two given words. A support vector machine is used to optimize the result. The empirical results indicate that the techniques yield strong results that are comparable with human ratings and accurate in web mining activities. Semantic similarity refers to the concept by which a set of documents, or words within a document, is assigned a weight based on meaning. The accurate measurement of such similarity plays an important role in natural language processing.
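The abstract does not say which page-count co-occurrence measures are used, so the following is only a minimal sketch of two common choices, a Jaccard coefficient and pointwise mutual information computed from search-engine page counts. The function names, the `min_cooccurrence` noise threshold, and the `total_pages` index-size estimate are assumptions made for illustration, not details taken from the paper.

```python
import math


def web_jaccard(count_p, count_q, count_pq, min_cooccurrence=5):
    """Jaccard coefficient from page counts (illustrative, not the paper's code).

    count_p, count_q: page counts for the queries "P" and "Q".
    count_pq: page count for the conjunctive query "P AND Q".
    min_cooccurrence: hypothetical threshold; very small joint counts
    are treated as noise and scored 0.
    """
    if count_pq <= min_cooccurrence:
        return 0.0
    return count_pq / (count_p + count_q - count_pq)


def web_pmi(count_p, count_q, count_pq, total_pages, min_cooccurrence=5):
    """Pointwise mutual information from page counts (illustrative).

    total_pages: assumed estimate of the number of pages indexed
    by the search engine.
    """
    if count_pq <= min_cooccurrence:
        return 0.0
    joint = count_pq / total_pages
    independent = (count_p / total_pages) * (count_q / total_pages)
    return math.log2(joint / independent)


# Made-up page counts, for illustration only.
jaccard_score = web_jaccard(1000, 2000, 500)
pmi_score = web_pmi(1000, 2000, 500, total_pages=10**6)
```

In a pipeline like the one the abstract describes, scores of this kind would be combined with snippet-derived pattern features before being passed to the support vector machine. How the two are combined is not specified in the abstract.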
Keywords :
"Semantics","Search engines","Engines","Web search","Pattern clustering","Clustering algorithms","Mutual information"
Publisher :
IEEE
Conference_Titel :
Green Engineering and Technologies (IC-GET), 2015 Online International Conference on
Type :
conf
DOI :
10.1109/GET.2015.7453830
Filename :
7453830