Title :
Natural language parsing for fact extraction from source code
Author :
Nilsson, Jens ; Löwe, Welf ; Hall, Johan ; Nivre, Joakim
Author_Institution :
Sch. of Math. & Syst. Eng., Vaxjo Univ., Vaxjo
Abstract :
We present a novel approach to extracting structural information from source code using state-of-the-art parsing technology for natural languages. The parser is robust in the sense that it is guaranteed to produce some output, so even incomplete or incorrect source code receives some kind of analysis. This comes at the expense of possibly assigning a partially incorrect analysis to error-free input. However, an evaluation on source code in Java, Python, and C/C++ shows that such errors are few, i.e., accuracy is close to 100%. The error analysis indicates that the majority of the remaining errors are harmless.
Keywords :
grammars; source coding; error analysis; fact extraction; natural language parsing; source code; Computer languages; Data mining; Formal languages; Java; Mathematics; Natural languages; Performance analysis; Robustness; Systems engineering and theory; Training data;
Conference_Title :
Program Comprehension, 2009. ICPC '09. IEEE 17th International Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4244-3998-0
ISSN :
1092-8138
DOI :
10.1109/ICPC.2009.5090046