DocumentCode :
3289874
Title :
An Improved CURD Algorithm for Source Code Mining
Author :
Liu, Yangyang ; Zhang, Yang
Author_Institution :
Coll. of Inf. Eng., Northwest A&F Univ., Yangling
Volume :
4
fYear :
2008
fDate :
18-20 Oct. 2008
Firstpage :
335
Lastpage :
339
Abstract :
A source code mining algorithm should be able to cope with large volumes of data and with nominal attributes, the two major characteristics of source code datasets. The k-means algorithm is not well suited to clustering source code because it is generally difficult for users to determine the number of clusters for a previously unknown dataset. The CURD clustering algorithm works efficiently, but it cannot process nominal attributes. In this paper, we propose the NCURD algorithm for clustering source code, which makes CURD applicable to nominal attributes and improves its working efficiency. The experimental results show that NCURD achieves excellent performance when clustering source code.
Keywords :
data mining; pattern clustering; software engineering; NCURD algorithm; improved CURD algorithm; k-means algorithm; source code clustering; source code dataset; source code mining; Clustering algorithms; Data mining; Educational institutions; Fuzzy systems; Knowledge engineering; Shape;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Jinan Shandong
Print_ISBN :
978-0-7695-3305-6
Type :
conf
DOI :
10.1109/FSKD.2008.479
Filename :
4666408