Title :
Modeling the ownership of source code topics
Author :
Corley, Christopher S. ; Kammer, Elizabeth A. ; Kraft, Nicholas A.
Author_Institution :
Dept. of Comput. Sci., Univ. of Alabama, Tuscaloosa, AL, USA
Abstract :
Exploring linguistic topics in source code is a program comprehension activity that shows promise in helping a developer become familiar with an unfamiliar software system. Examining ownership in source code can reveal complementary information, such as whom to contact with questions regarding a source code entity, but the relationship between linguistic topics and ownership is an unexplored area. In this paper, we combine software repository mining and topic modeling to measure the ownership of linguistic topics in source code. We conduct an exploratory study of the relationship between linguistic topics and ownership in source code using 10 open source Java systems. We find that classes that belong to the same linguistic topic tend to have similar ownership characteristics, which suggests that conceptually related classes often share the same owner(s). We also find that similar topics tend to share the same ownership characteristics, which suggests that the same developers own related topics.
Keywords :
Java; data mining; linguistics; public domain software; reverse engineering; source coding; linguistic topics; open source Java systems; ownership examination; ownership modelling; program comprehension activity; software repository mining; source code topics; topic modeling; Correlation; Pragmatics; Probability distribution; Resource management; Software; mining software repositories; ownership; pachinko allocation model; program comprehension
Conference_Title :
Program Comprehension (ICPC), 2012 IEEE 20th International Conference on
Conference_Location :
Passau
Print_ISBN :
978-1-4673-1213-4
Electronic_ISBN :
1092-8138
DOI :
10.1109/ICPC.2012.6240485