Title : 
Comparing methods to extract technical content for technological intelligence
         
        
            Author : 
Newman, Nils C. ; Porter, Alan L. ; Newman, David ; Courseault, Cherie ; Bolan, Stephanie D.
         
        
            Author_Institution : 
IISC, Atlanta, GA, USA
         
        
        
            fDate : 
July 29 2012-Aug. 2 2012
         
        
        
        
            Abstract : 
We are developing indicators for the emergence of science and technology (S&T) topics. We are targeting various S&T information resources, including metadata (i.e., bibliographic information) and full text. We explore alternative text analysis approaches - principal components analysis (PCA) and topic modeling - to extract technical topic information. We analyze the topical content to pursue potential applications and innovation pathways. In this presentation we compare alternative ways of consolidating messy sets of key terms [e.g., using Natural Language Processing (NLP) on abstracts and titles, together with various keyword sets]. Our process includes combinations of stopword removal, fuzzy term matching, association rules, and tf-idf weighting. We compare PCA results to topic modeling results. Our key test set consists of 4104 Web of Science records on Dye-Sensitized Solar Cells (DSSCs). Results suggest good potential to enhance our technical intelligence payoffs from database searches on topics of interest.
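As a rough illustration of two of the steps named in the abstract (stopword removal and tf-idf weighting), the following minimal sketch scores terms in a few toy records. The stopword list, the sample records, and the function names `tokenize` and `tfidf` are assumptions made for this example; they are not taken from the paper, and the authors' actual pipeline (which also involves fuzzy term matching, association rules, PCA, and topic modeling) is not reproduced here.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list for illustration only; not the authors' list.
STOPWORDS = {"the", "of", "a", "and", "for", "in", "on", "with", "to", "is"}


def tokenize(text):
    """Lowercase, split on non-alphanumeric characters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(token_lists)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))

    weights = []
    for tokens in token_lists:
        tf = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


# Toy records (invented titles, loosely themed on the DSSC test set).
records = [
    "Dye-sensitized solar cells with TiO2 electrodes",
    "Thin films for dye-sensitized solar cells",
    "Topic modeling of solar cell abstracts",
]
weights = tfidf(records)
```

With this weighting, a term that occurs in every record (here "solar") gets a weight of zero, while a term confined to a single record (such as "electrodes") scores highest within that record, which is the property that makes tf-idf useful for surfacing distinctive technical terms.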
         
        
            Keywords : 
content-based retrieval; data mining; meta data; principal component analysis; scientific information systems; PCA; S&T information resources; alternative text analysis; association rules; bibliographic information; fuzzy term matching; metadata; principal components analysis; science and technology topics; stopword removal; technical content extraction; technological intelligence; tf-idf weighting; topic modeling; Abstracts; Clustering algorithms; Decision support systems; Electrodes; Films; Photovoltaic cells
         
        
        
        
            Conference_Title : 
2012 Proceedings of PICMET '12: Technology Management for Emerging Technologies (PICMET)
         
        
            Conference_Location : 
Vancouver, BC
         
        
            Print_ISBN : 
978-1-4673-2853-1