Author_Institution :
David Cheriton Sch. of Comput. Sci., Univ. of Waterloo, Waterloo, ON, Canada
Abstract :
Stakeholders such as developers, managers, or buyers often want to find out what software development processes are being followed within a software project. Their reasons include CMM and ISO 9000 compliance, process validation, management, acquisitions, and business intelligence. Recovering the software development processes from an existing project is expensive if one must rely upon manual inspection of artifacts and interviews of developers and their managers. Researchers have suggested live observation and instrumentation of a project to allow for more measurement, but this is costly and invasive, and it requires a live running project. Instead, we propose an after-the-fact analysis: software process recovery. This approach analyzes version control systems, bug trackers, and mailing list archives using a variety of supervised and unsupervised techniques from machine learning, topic analysis, natural language processing, and statistics. We combine these methods to recover process events that we map back to software development processes like the Unified Process. We produce diagrams called Recovered Unified Process Views (RUPV) that are similar to the Unified Process diagram: a timeline of the effort spent on each parallel discipline over time. We then validate these methods using case studies of multiple open source software systems.
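The idea of mapping repository events to disciplines and plotting effort over time can be illustrated with a minimal sketch. It is not the method of the paper, which relies on supervised and unsupervised learning; here a crude keyword match stands in for the classifier. The discipline names, the keyword lists, and the helpers `classify` and `effort_timeline` are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical keyword lists (an assumption for this sketch, not from the paper).
DISCIPLINE_KEYWORDS = {
    "requirements": ("requirement", "spec", "feature request"),
    "implementation": ("add", "implement", "refactor"),
    "testing": ("test", "regression"),
    "maintenance": ("fix", "bug", "patch"),
}


def classify(message):
    """Return every discipline whose keywords occur in the message.

    Plain substring matching is a stand-in for a trained classifier and
    will over-match (for example, "add" also matches "address").
    """
    text = message.lower()
    return [
        discipline
        for discipline, keywords in DISCIPLINE_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


def effort_timeline(events):
    """Count events per discipline for each (year, month) period.

    `events` is an iterable of (date, message) pairs, such as commit
    dates and commit log messages taken from a version control system.
    """
    timeline = defaultdict(Counter)
    for day, message in events:
        period = (day.year, day.month)
        for discipline in classify(message):
            timeline[period][discipline] += 1
    return dict(timeline)


events = [
    (date(2009, 1, 5), "Fix crash in parser"),
    (date(2009, 1, 20), "Add unit test for parser"),
    (date(2009, 2, 3), "Implement new export feature"),
]
timeline = effort_timeline(events)
for period in sorted(timeline):
    print(period, dict(timeline[period]))
```

Each period's counts correspond to one column of a timeline in the spirit of an RUPV, with one row per discipline; a fuller version would also draw on bug tracker and mailing list events.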
Keywords :
checkpointing; configuration management; learning (artificial intelligence); program debugging; public domain software; software engineering; artifact inspection; bug tracker; machine learning; natural language processing; open source software system; recovered unified process view; software development process; software process recovery; version control system; Control systems; Data mining; Documentation; Maintenance engineering; Programming; Software; USA Councils; mining software repositories; process