DocumentCode :
240706
Title :
Modelling on clustering algorithm based on iteration feature selection for micro-blog posts
Author :
Kai Gao ; Bao-quan Zhang
Author_Institution :
Sch. of Inf. Sci. & Eng., Hebei Univ. of Sci. & Technol., Shijiazhuang, China
fYear :
2014
fDate :
3-5 Dec. 2014
Firstpage :
295
Lastpage :
299
Abstract :
With the arrival of the big data era, data mining and intelligent processing are becoming increasingly important, and new models for intelligent processing are needed. Because micro-blog posts are short texts with unreliable linguistic features and incomplete lexical information, similar posts need to be analyzed and clustered together for further data mining and recommendation. This paper builds on the classical k-means clustering algorithm and presents a novel modelling approach that partitions big data into k corresponding groups. Furthermore, a text feature selection model based on two-phase iteration is proposed. Based on this model, a micro-blog post clustering algorithm is presented. The proposed algorithm uses the partition idea and avoids the influence of noisy data. Experiments show the feasibility of the proposed approach, and some remaining problems and directions for further work are presented at the end.
Keywords :
Big Data; Web sites; data mining; feature selection; recommender systems; social networking (online); intelligent processing; iteration feature selection; microblog post clustering algorithm; text feature selection model; Clustering algorithms; Data models; Feature extraction; Noise; Pragmatics; Vectors; Micro-blog; text cluster
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Modelling, Identification & Control (ICMIC), 2014 Proceedings of the 6th International Conference on
Conference_Location :
Melbourne, VIC
Type :
conf
DOI :
10.1109/ICMIC.2014.7020768
Filename :
7020768