Title :
A modified fuzzy relational clustering approach for sentence-level text
Author :
Sikder Tahsin Al-Amin;Mahade Hasan;M. M. A. Hashem
Author_Institution :
Department of Computer Science and Engineering, Khulna University of Engineering and Technology, 9203, Bangladesh
Abstract :
This paper proposes a fuzzy relational clustering (FRC) approach that finds similar sentences in a set of sentences and groups them into clusters. To measure sentence similarity, FRC combines word-to-word similarity and word-order similarity. Word-to-word similarity is computed with the Jiang and Conrath (JnC) measure using the WordNet database, and order similarity is calculated from the joint word set of the two sentences. Because a sentence may relate to more than one theme, FRC takes a fuzzy clustering approach and uses the FRECCA algorithm to cluster the sentences. The algorithm works within an Expectation-Maximization framework in which the importance of a sentence is expressed by its PageRank score, which is treated as a likelihood. The PageRank scores and mixing coefficients are initialized with uniform random numbers. Applied to a quotation dataset containing several classes, the method proved capable of identifying similar sentences and grouping them into clusters. FRC was also applied to a news article dataset, where it produced good results.
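The Expectation-Maximization loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the pairwise sentence similarity matrix has already been computed (the JnC and word-order measures are not reproduced here), and the function names `pagerank` and `frecca`, the damping factor of 0.85, the fixed iteration count, and the toy matrix `S` are all assumptions. The sketch draws the initial membership values and mixing coefficients from uniform random numbers; the abstract does not detail how the PageRank scores themselves are initialized, so power iteration here simply starts from a uniform vector.

```python
import numpy as np


def pagerank(weights, damping=0.85, iters=100, tol=1e-9):
    """Power-iteration PageRank on a non-negative weight matrix."""
    n = weights.shape[0]
    col_sums = weights.sum(axis=0)
    # Column-normalise; a column with no weight becomes a uniform jump.
    safe = np.where(col_sums > 0, col_sums, 1.0)
    transition = np.where(col_sums > 0, weights / safe, 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        new_rank = (1 - damping) / n + damping * (transition @ rank)
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank


def frecca(similarity, n_clusters, n_iters=50, seed=0):
    """FRECCA-style fuzzy clustering of a similarity matrix (sketch)."""
    rng = np.random.default_rng(seed)
    n = similarity.shape[0]

    # Assumed initialisation: uniform random values, then normalised.
    membership = rng.uniform(size=(n, n_clusters))
    membership /= membership.sum(axis=1, keepdims=True)
    mixing = rng.uniform(size=n_clusters)
    mixing /= mixing.sum()

    for _ in range(n_iters):
        # E-step: per cluster, weight each edge by the memberships of its
        # two endpoints and treat the PageRank scores as likelihoods.
        likelihood = np.empty((n, n_clusters))
        for m in range(n_clusters):
            pair_weight = np.outer(membership[:, m], membership[:, m])
            likelihood[:, m] = pagerank(similarity * pair_weight)
        weighted = likelihood * mixing
        membership = weighted / weighted.sum(axis=1, keepdims=True)

        # M-step: each mixing coefficient is the mean membership.
        mixing = membership.mean(axis=0)

    return membership, mixing


# Toy similarity matrix for four hypothetical sentences.
S = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
])
membership, mixing = frecca(S, n_clusters=2)
```

Each row of `membership` sums to 1, so a sentence can belong to several clusters to different degrees, which is the property the abstract relies on when it says a sentence may relate to more than one theme.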
Keywords :
"Clustering algorithms","Mathematical model","Databases","Convergence","Electronic mail","Random number generation","Data mining"
Conference_Title :
Electrical Information and Communication Technology (EICT), 2015 2nd International Conference on
Print_ISBN :
978-1-4673-9256-3
DOI :
10.1109/EICT.2015.7392016