Title : 
sCooL: A system for academic institution name normalization
         
        
            Author : 
Jacob, Florian ; Javed, Fahad ; Meng Zhao ; McNair, Matt
         
        
            Author_Institution : 
Classification R & D, CareerBuilder, Atlanta, GA, USA
         
            Abstract : 
Named entity normalization maps each recognized entity to a concrete, unambiguous real-world entity. Within the online job posting domain, academic institution name normalization is a valuable opportunity for CareerBuilder (CB). Accurate and detailed normalization of academic institutions is important for performing sophisticated labor market dynamics analysis. In this paper we present and discuss the design and implementation of sCooL, an academic institution name normalization system designed to supplant the existing manually maintained mapping system at CB. We also discuss the specific challenges that led to the design of sCooL. sCooL leverages Wikipedia to create academic institution name mappings from a school database that is built from job applicant resumes posted on our website. The resulting mappings are used to build a database, which is then used for normalization. sCooL provides the flexibility to integrate mappings collected from different curated and non-curated sources. The system can identify malformed data and distinguish K-12 schools from universities and colleges. We conduct an extensive comparative evaluation of the semi-automated sCooL system against the existing manual mapping implementation and show that sCooL provides better coverage with improved accuracy.
         
        
            Keywords : 
Web sites; educational institutions; human resource management; information retrieval; labour resources; CB; CareerBuilder; Wikipedia; academic institution name normalization; labor market dynamics analysis; named entity normalization; online job posting domain; sCooL system; school database; Cities and towns; Databases; Electronic publishing; Encyclopedias; Internet; Lucene; Named Entity Recognition; School Normalization
         
        
        
        
            Conference_Title : 
Collaboration Technologies and Systems (CTS), 2014 International Conference on
         
        
            Conference_Location : 
Minneapolis, MN
         
        
            Print_ISBN : 
978-1-4799-5157-4
         
        
        
            DOI : 
10.1109/CTS.2014.6867547