Title : 
Ranking Text Documents Based on Conceptual Difficulty Using Term Embedding and Sequential Discourse Cohesion
         
        
            Author : 
Jameel, Sakar ; Wai Lam ; Xiaojun Qian
         
        
            Author_Institution : 
Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Hong Kong, China
         
        
            Abstract : 
We propose a novel framework for determining the conceptual difficulty of a domain-specific text document without using any external lexicon. Conceptual difficulty relates to the reading difficulty of domain-specific documents. Previous approaches to the domain-specific readability problem have relied heavily on an external lexicon, which limits their scalability to other domains. Our model can be readily applied in domain-specific vertical search engines to re-rank documents according to their conceptual difficulty. We develop an unsupervised and principled approach for computing a term's conceptual difficulty in the latent space. Our approach also considers transitions between the segments generated in sequence. It performs better than the current state-of-the-art comparative methods.
         
        
            Keywords : 
text analysis; domain specific readability problem; domain specific text document; domain specific vertical search engines; external lexicon; sequential discourse cohesion; term conceptual difficulty; term embedding; text document ranking; Conceptual Difficulty; K-means; LSI; Term Embedding;
         
        
            Conference_Title : 
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on
         
        
            Conference_Location : 
Macau
         
        
            Print_ISBN : 
978-1-4673-6057-9
         
        
        
            DOI : 
10.1109/WI-IAT.2012.235