Workshop on Mining Unstructured Data (MUD) ... Because "Mining Unstructured Data is Like Fishing in Muddy Waters"!

Author

Bacchelli, Alberto ; Bettenburg, Nicolas ; Guerrouj, Latifa

fYear

2012

fDate

15-18 Oct. 2012

Firstpage

5

Lastpage

6

Abstract

Software developers have long been supported by a variety of tools, such as version control systems (e.g., Git), issue tracking systems (e.g., Bugzilla), and mailing list services (e.g., Mailman). These tools accumulate a wide range of information in their repositories. This information comprises two significantly different types of data: structured and unstructured. Structured data (e.g., source code or execution traces) has a well-established structure and grammar, and is thus straightforward to parse and process by machine. Unstructured data (e.g., documentation, discussions, comments, or customer support requests) consists of a mixture of natural language text, snippets of structured data, and noise. Mining unstructured data is very challenging, since out-of-the-box approaches adopted from related fields, such as Natural Language Processing and Information Retrieval, cannot be directly applied in software engineering. To tackle the challenges faced when mining unstructured data, and to make the knowledge contained in unstructured data repositories accessible to both practitioners and researchers, we organize the 2nd Workshop on Mining Unstructured Data (MUD'12). The aim is to provide a unique interactive venue for discussing challenges, approaches, and applications in depth, and for sharing experiences and results on mining unstructured software data.

Keywords

Conferences; Data mining; Educational institutions; Humans; Natural language processing; Software; Software engineering; Mining; Unstructured Data; Workshop

fLanguage

English

Publisher

IEEE

Conference_Titel

Reverse Engineering (WCRE), 2012 19th Working Conference on

Conference_Location

Kingston, ON, Canada

ISSN

1095-1350

Print_ISBN

978-1-4673-4536-1

Type

conf

DOI

10.1109/WCRE.2012.67

Filename

6385096