Title : 
Distributing Computationally Expensive Matching of Requirements to Capability Models
         
        
            Author : 
Vasquez, Reymonrod ; Verma, Kunal ; Kass, Alex
         
        
            Author_Institution : 
Accenture Technol. Labs., San Jose, CA, USA
         
        
        
        
        
        
            Abstract : 
In this paper, we present a distributed way to automatically map users' requirements to reference process models. In a prior paper [9], we presented a tool called Process Model Requirements Gap Analyzer (ProcGap), which combines natural language processing, information retrieval, and semantic reasoning to automatically match and map textual requirements to domain-specific process models. Although the tool helped users reuse prior knowledge by making process models easy to use, it has one main drawback: it takes a long time to compare a very large requirements document, one with a few thousand requirements, to a process model hierarchy with a few thousand capabilities. In this paper, we present how we solved this problem using Apache Hadoop. Apache Hadoop allows ProcGap to distribute the matching task across several machines, increasing the tool's performance and usability. We compare the performance of ProcGap running on a single machine with that of our distributed version.
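The matching task described in the abstract splits naturally into independent (requirement, capability) comparisons, which is what makes it a fit for Map-Reduce. The following is only a minimal single-process sketch of that decomposition, not ProcGap's actual implementation: the function names (`tokenize`, `jaccard`, `map_phase`, `reduce_phase`, `match`) are hypothetical, and a simple token-overlap (Jaccard) score stands in for the tool's combination of natural language processing, information retrieval, and semantic reasoning.

```python
from itertools import product


def tokenize(text):
    """Lowercase a phrase and split it into a set of word tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-overlap similarity; an assumed stand-in for the real matcher."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_phase(pair):
    """Map step: score one (requirement, capability) pair on its own."""
    (req_id, req_text), (cap_id, cap_text) = pair
    score = jaccard(tokenize(req_text), tokenize(cap_text))
    return req_id, (cap_id, score)


def reduce_phase(mapped):
    """Reduce step: keep the best-scoring capability per requirement."""
    best = {}
    for req_id, (cap_id, score) in mapped:
        if req_id not in best or score > best[req_id][1]:
            best[req_id] = (cap_id, score)
    return best


def match(requirements, capabilities):
    """Run the map and reduce steps over every requirement/capability pair."""
    pairs = product(requirements.items(), capabilities.items())
    return reduce_phase(map(map_phase, pairs))
```

Because each call to `map_phase` depends only on its own pair, the pairs could in principle be partitioned across machines, with the reduce step grouping results by requirement ID. With a few thousand requirements and a few thousand capabilities, that is several million independent comparisons.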
         
        
            Keywords : 
document handling; grammars; information retrieval; natural language processing; Apache Hadoop; ProcGap; capability model; process model requirement gap analyzer; semantic reasoning; Computational modeling; Distributed databases; Manuals; Marketing and sales; Runtime; Semantics; Software; Hadoop; Map-Reduce; document and text processing
         
        
        
        
            Conference_Title : 
Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
         
        
            Conference_Location : 
Palo Alto, CA
         
        
            Print_ISBN : 
978-1-4577-1648-5
         
        
            Electronic_ISBN : 
978-0-7695-4492-2
         
        
        
            DOI : 
10.1109/ICSC.2011.54