Title :
Capturing Java naming conventions with first-order Markov models
Author :
Linstead, Erik ; Hughes, Lindsey ; Lopes, Cristina ; Baldi, Pierre
Author_Institution :
Sch. of Inf. & Comput. Sci., Univ. of California, Irvine, CA
Abstract :
We analyze naming conventions for classes, interfaces, methods, and fields across 12,151 open-source Java projects. This vocabulary data is then used to train first-order Markov models to classify entity names, as well as to assess adherence to common naming structure. Preliminary results yield an accuracy of 78.34%. Supplementary material may be found at: http://sourcerer.ics.uci.edu/icpc2009/icpc.html.
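The classification step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the model details, so everything below is an assumption made for illustration: identifiers are split into lowercase tokens on camelCase and underscore boundaries, one first-order Markov chain over tokens is trained per entity type, transitions use add-one smoothing, and a name is assigned to the type whose chain gives it the highest log-likelihood. The helper names (`split_name`, `MarkovNameModel`, `classify`) and the tiny training set are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import defaultdict

START, END = "<s>", "</s>"  # artificial boundary states (assumption)


def split_name(name):
    """Split an identifier into lowercase tokens on camelCase, digits, and underscores."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


class MarkovNameModel:
    """First-order Markov chain over the tokens of entity names (illustrative only)."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = {END}

    def train(self, names):
        # Count token-to-token transitions, including the start and end states.
        for name in names:
            tokens = [START] + split_name(name) + [END]
            for prev, cur in zip(tokens, tokens[1:]):
                self.counts[prev][cur] += 1
                self.vocab.add(cur)

    def log_prob(self, name, vocab_size):
        # Sum of log transition probabilities with add-one smoothing.
        tokens = [START] + split_name(name) + [END]
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            row = self.counts.get(prev, {})
            total += math.log((row.get(cur, 0) + 1) / (sum(row.values()) + vocab_size))
        return total


def classify(models, name):
    """Return the entity type whose model scores the name highest (uniform prior assumed)."""
    vocab_size = len(set().union(*(m.vocab for m in models.values())))
    return max(models, key=lambda label: models[label].log_prob(name, vocab_size))


# Toy training data invented for this sketch; the paper uses names mined from 12,151 projects.
training = {
    "class": ["ArrayList", "HashMap", "FileReader", "UserManager"],
    "interface": ["Runnable", "Comparable", "EventListener", "Serializable"],
    "method": ["getName", "setName", "isEmpty", "getValue", "toString"],
    "field": ["MAX_VALUE", "userName", "itemCount", "DEFAULT_SIZE"],
}

models = {}
for label, names in training.items():
    models[label] = MarkovNameModel()
    models[label].train(names)

print(classify(models, "getUserName"))  # prints "method"
```

Under the same assumptions, the log-likelihood that a type's own model assigns to a name could serve as a rough adherence score: a name that scores poorly under the model for its declared type departs from the structure seen in training.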
Keywords :
Java; Markov processes; data mining; naming services; pattern classification; public domain software; reverse engineering; text analysis; Java naming convention; entity name classification; first-order Markov model; open-source Java software development; program comprehension; statistical text mining; vocabulary differences; Computer science; Frequency; Internet; Open source software; Probability distribution; Relational databases; Text mining; Training data; Vocabulary
Conference_Title :
IEEE 17th International Conference on Program Comprehension (ICPC '09), 2009
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4244-3998-0
ISSN :
1092-8138
DOI :
10.1109/ICPC.2009.5090074