DocumentCode
240706
Title
Modelling on clustering algorithm based on iteration feature selection for micro-blog posts
Author
Kai Gao ; Bao-quan Zhang
Author_Institution
Sch. of Inf. Sci. & Eng., Hebei Univ. of Sci. & Technol., Shijiazhuang, China
fYear
2014
fDate
3-5 Dec. 2014
Firstpage
295
Lastpage
299
Abstract
With the coming of the big data era, data mining and intelligent processing are becoming increasingly important, and modelling novel intelligent processing methods is necessary. Because micro-blog posts are short texts with unreliable linguistic features and incomplete lexical information, it is necessary to analyze these posts and cluster similar ones together for further data mining and recommendation. This paper builds on the classical k-means clustering algorithm and presents a novel modelling approach to partition big data into k corresponding groups. Furthermore, a text feature selection model based on two-phase iteration is proposed. Based on this model, a micro-blog post clustering algorithm is presented. The proposed algorithm makes use of the partition idea and avoids the influence of noisy data. Experiments show the feasibility of the proposed approach, and some remaining problems and directions for further work are presented at the end.
Keywords
Big Data; Web sites; data mining; feature selection; recommender systems; social networking (online); intelligent processing; iteration feature selection; microblog post clustering algorithm; text feature selection model; Clustering algorithms; Data models; Feature extraction; Noise; Pragmatics; Vectors; Micro-blog; text cluster
fLanguage
English
Publisher
IEEE
Conference_Titel
Modelling, Identification & Control (ICMIC), 2014 Proceedings of the 6th International Conference on
Conference_Location
Melbourne, VIC
Type
conf
DOI
10.1109/ICMIC.2014.7020768
Filename
7020768