DocumentCode :
3248677
Title :
Extracting problematic API features from forum discussions
Author :
Yingying Zhang ; Daqing Hou
Author_Institution :
Dept. of Comput. Sci., Clarkson Univ., Potsdam, NY, USA
fYear :
2013
fDate :
20-21 May 2013
Firstpage :
142
Lastpage :
151
Abstract :
Software engineering activities often produce large amounts of unstructured data. Useful information can be extracted from such data to facilitate software development activities, such as bug report management and documentation provision. Online forums, in particular, contain extensive valuable information that can aid in software development. However, no work has been done to extract problematic API features from online forums. In this paper, we investigate ways to extract the problematic API features that are discussed as a source of difficulty in each thread, using natural language processing and sentiment analysis techniques. Based on a preliminary manual analysis of the content of a discussion thread and a categorization of the role of each sentence therein, we focus on a negative-sentiment sentence and its close neighbors as the unit for extracting API features. We evaluate a set of candidate solutions by comparing tool-extracted problematic API features with manually produced golden test data. Our best solution yields a precision of 89%. We have also investigated three potential applications for our feature extraction solution: (i) highlighting the negative sentence and its neighbors to help illustrate the main API feature; (ii) searching for helpful online information using the extracted API feature as a query; (iii) summarizing the problematic features to reveal the “hot topics” in a forum.
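The extraction unit described in the abstract (a negative-sentiment sentence plus its close neighbors) can be illustrated with a minimal Python sketch. This is not the authors' tool: the cue-word list `NEGATIVE_CUES`, the naming heuristic in `looks_like_api_name`, and the `window` parameter are all hypothetical stand-ins for the sentiment analysis and NLP components the paper actually evaluates.

```python
import re

# Hypothetical cue words standing in for a real sentiment analyzer (assumption).
NEGATIVE_CUES = {"not", "cannot", "can't", "doesn't", "problem",
                 "wrong", "fail", "fails", "error", "unable"}

# A word, optionally followed by "()" as in a method mention.
WORD = re.compile(r"[A-Za-z][A-Za-z0-9]*(?:\(\))?")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_negative(sentence):
    """Treat a sentence as negative if it contains any cue word."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return bool(words & NEGATIVE_CUES)


def looks_like_api_name(token):
    """Assumed heuristic: method mentions or mixed-case names such as JTable."""
    name = token[:-2] if token.endswith("()") else token
    has_inner_upper = any(c.isupper() for c in name[1:])
    has_lower = any(c.islower() for c in name)
    return token.endswith("()") or (has_inner_upper and has_lower)


def extract_candidates(text, window=1):
    """Collect API-like names from each negative sentence and its neighbors."""
    sentences = split_sentences(text)
    found = []
    for i, sentence in enumerate(sentences):
        if not is_negative(sentence):
            continue
        start = max(0, i - window)
        end = min(len(sentences), i + window + 1)
        for neighbor in sentences[start:end]:
            for token in WORD.findall(neighbor):
                if looks_like_api_name(token) and token not in found:
                    found.append(token)
    return found


post = ("I am using a JTable inside a JScrollPane. "
        "The header does not show up after I call setLayout(). Any ideas?")
print(extract_candidates(post))  # ['JTable', 'JScrollPane', 'setLayout()']
```

With `window=0` only the negative sentence itself is scanned, so the same post yields just `setLayout()`. Widening the window is what lets names mentioned in the surrounding context, such as `JTable`, enter the candidate set.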
Keywords :
application program interfaces; feature extraction; natural language processing; query processing; software engineering; discussion thread; forum discussions; manually produced golden test data; natural language processing; negative sentiment sentence; online information; problematic API feature extraction; problematic feature summarization; sentiment analysis techniques; Data mining; Dictionaries; Feature extraction; Message systems; Pattern matching; Software; Tutorials; APIs; AWT/Swing; Design Feedback; Information Extraction; Online Forums;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Program Comprehension (ICPC), 2013 IEEE 21st International Conference on
Conference_Location :
San Francisco, CA
ISSN :
1063-6897
Type :
conf
DOI :
10.1109/ICPC.2013.6613842
Filename :
6613842