DocumentCode
1637272
Title
Automatic Categorization of Software Libraries Using Bytecode
Author
Escobar-Avila, Javier
Author_Institution
Dept. of Comput. Sci., Florida State Univ., Tallahassee, FL, USA
Volume
2
fYear
2015
Firstpage
784
Lastpage
786
Abstract
Automatic software categorization is the task of assigning categories or tags to software libraries in order to summarize their functionality. Correctly assigning these categories is essential to ensure that developers can easily retrieve relevant libraries from large repositories. Current categorization approaches rely on the semantics reflected in the source code, or use supervised machine learning techniques, which require a set of labeled software as training data. These approaches fail when the source code or labeled training data is not available. We propose a novel unsupervised approach for the automatic categorization of Java libraries, which uses the bytecode of a library to determine its category. We show that the approach is able to successfully categorize libraries from the Apache Foundation Repository.
Keywords
Java; software libraries; source code (software); unsupervised learning; Apache Foundation Repository; Java libraries; automatic software library categorization; bytecode; source code; unsupervised approach; Accuracy; Conferences; Data mining; Semantics; Software; automatic labeling; clustering; Dirichlet process; software categorization
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering (ICSE), 2015 IEEE/ACM 37th IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICSE.2015.249
Filename
7203070