DocumentCode :
1699616
Title :
Performance evaluation of information retrieval models in bug localization on the method level
Author :
Alduailij, Mai ; Al-Duailej, Mona
Author_Institution :
Dept. of Comput. Sci., Western Michigan Univ., Kalamazoo, MI, USA
fYear :
2015
Firstpage :
305
Lastpage :
313
Abstract :
This study uses statistical inference to compare the performance of three text models used for bug localization in collaboration systems on the method level: Vector Space Model (VSM), Latent Semantic Indexing (LSI), and Latent Dirichlet Allocation (LDA). After comparing the three models, we confirm that VSM is the superior model. We then identify which external factors (method length, query length, method documentation comments, and product and component names mentioned in bug reports) affect VSM performance. We conclude that VSM performance is positively correlated with most of the tested factors. We believe our results can be helpful to: (i) text model developers, to understand the strengths and limitations of VSM for future development; (ii) bug localization programmers using classical VSM, to understand improved ways to prepare methods extracted from big data collaboration systems; and (iii) bug reporters, to follow the most efficient methods presented in this work when reporting bugs, to enhance the information retrieval process.
Keywords :
Big Data; inference mechanisms; program debugging; query processing; statistical analysis; text analysis; LDA; LSI; VSM; VSM performance evaluation; big data collaboration systems; bug localization programmers; bug reports; collaboration systems; documentation comments; information retrieval models; latent Dirichlet allocation; latent semantic indexing; method level bug localization; statistical inference; text model developer; vector space model; Big data; Computer bugs; Information retrieval; Large scale integration; Object oriented modeling; Sociology; Statistics; discovery, collection, and extraction of information in big data sources; machine learning methods applied to big data analytics; natural language processing methodologies; performance and benchmarking for big data processing and analytics; semantic content extraction and analytics languages and techniques; text categorization and topic recognition
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Collaboration Technologies and Systems (CTS), 2015 International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4673-7647-1
Type :
conf
DOI :
10.1109/CTS.2015.7210439
Filename :
7210439