Title :
Adopting Text Clustering in web-based application to facilitate searching of education information
Author :
Nilsson, Nicklas ; Yan Liu
Author_Institution :
Sch. of Software Eng., Tongji Univ., Shanghai, China
Abstract :
Clustering, a core task in the data mining field, has been at the center of research attention for the last decade. It is the task of finding subsets of data that share the same type of attributes. Text clustering has become one of the most important techniques in data mining for discovering knowledge from rapidly growing web data and log files. There are many challenges: algorithms need to be tailored specifically to each domain and must scale well with growing data sets. Another interesting aspect is the design of the system, in which a complex set of components needs to interact well. This article proposes an elegant way of clustering university education programs based on their text attributes. The solution is integrated directly into a Spring web application. A comprehensive architecture that provides the required frameworks is proposed. Clustering techniques such as Canopy Generation [4] and k-Means are demonstrated.
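To illustrate how the two techniques named in the abstract can fit together, the following is a minimal sketch, not the authors' implementation and not the Apache Mahout API. It assumes term-frequency vectors with cosine distance, uses canopy generation as a cheap first pass to pick initial centers, and then refines them with a k-means loop that re-normalizes centroids. The function names, the thresholds `t1` and `t2`, and the sample course descriptions are all hypothetical.

```python
import math
from collections import Counter


def normalize(weights):
    """Scale a sparse term -> weight mapping to unit length."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def vectorize(text):
    """Turn a text attribute into a unit-length term-frequency vector."""
    return normalize(Counter(text.lower().split()))


def cosine_distance(a, b):
    """1 - cosine similarity; both inputs are assumed to be unit length."""
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())


def canopy_generation(vectors, t1, t2):
    """Return (centers, canopies) using a loose threshold t1 > tight threshold t2."""
    remaining = list(range(len(vectors)))
    centers, canopies = [], []
    while remaining:
        c = remaining[0]
        centers.append(vectors[c])
        # Every point within the loose threshold joins this canopy.
        canopies.append([i for i, v in enumerate(vectors)
                         if cosine_distance(vectors[c], v) <= t1])
        # Points within the tight threshold may not start a new canopy.
        remaining = [i for i in remaining
                     if i != c and cosine_distance(vectors[c], vectors[i]) > t2]
    return centers, canopies


def k_means(vectors, initial_centers, iterations=10):
    """Assign each vector to its nearest center, then recompute the centers."""
    centers = list(initial_centers)
    assignment = []
    for _ in range(iterations):
        assignment = [min(range(len(centers)),
                          key=lambda k: cosine_distance(v, centers[k]))
                      for v in vectors]
        for k in range(len(centers)):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:
                total = Counter()
                for v in members:
                    for t, w in v.items():
                        total[t] += w
                centers[k] = normalize(total)
    return assignment


# Hypothetical course descriptions standing in for education text attributes.
courses = [
    "software engineering and programming",
    "programming and software design",
    "organic chemistry laboratory",
    "chemistry laboratory methods",
]
vectors = [vectorize(c) for c in courses]
centers, canopies = canopy_generation(vectors, t1=0.8, t2=0.5)
labels = k_means(vectors, centers)
print(labels)  # [0, 0, 1, 1]
```

Seeding k-means with the canopy centers removes the need to choose the number of clusters up front; in this sketch it is determined by the thresholds instead.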
Keywords :
Web sites; data mining; educational administrative data processing; educational institutions; pattern clustering; text analysis; Web-based application; canopy generation; k-means clustering; knowledge discovery; text clustering; university education information searching; Clustering algorithms; Educational institutions; Indexes; Preforms; Servers; Apache Hadoop; Apache Mahout; Canopy Generation; Hibernate; Lucene Index; Spring; k-Means;
Conference_Title :
Software Engineering and Service Science (ICSESS), 2014 5th IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-3278-8
DOI :
10.1109/ICSESS.2014.6933590