Title : 
An improved topic relevance algorithm for vertical search engines
         
        
            Author : 
Lv, Lin-tao ; Chen, Li-ping ; Zhou, Hong-fang
         
        
            Author_Institution : 
Inst. of Comput. Sci. & Eng., Xi'an Univ. of Technol., Xi'an
         
        
        
        
        
        
        
            Abstract : 
The HITS algorithm is a well-known topic distillation algorithm, but it suffers from topic drift. To address this problem, an improved HITS algorithm is proposed that assigns weights to links according to their link value and topic similarity. Based on an analysis of Web link structure, link value is calculated from the authority degree of Web pages; topic similarity between Web pages is calculated by combining analysis of page content with HTML structure characteristics. By combining link value with topic similarity, the improved algorithm distinguishes between links and assigns them different weights. Experimental results indicate that the proposed algorithm improves the relevance ratio by 13%-42%. It also controls topic drift effectively and improves the accuracy of information collection. The proposed algorithm can be applied in vertical search engines and provides a theoretical foundation for them.
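The abstract does not give the weighting formula, so the following is only a minimal sketch of a link-weighted HITS iteration, not the authors' method. It assumes that a link's weight is the product of its link value and its topic similarity (both in [0, 1]); the function names `link_weight`, `normalize`, and `weighted_hits`, and the example page names and scores, are hypothetical.

```python
from math import sqrt


def link_weight(link_value, topic_similarity):
    # Assumption: combine the two factors by multiplication.
    # The abstract only says both factors determine the weight.
    return link_value * topic_similarity


def normalize(scores):
    # Scale scores to unit Euclidean length; leave all-zero scores unchanged.
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def weighted_hits(edges, iterations=50):
    # edges maps (source_page, target_page) -> non-negative link weight.
    pages = sorted({page for edge in edges for page in edge})
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores over incoming links.
        new_auth = {page: 0.0 for page in pages}
        for (src, dst), w in edges.items():
            new_auth[dst] += w * hub[src]
        auth = normalize(new_auth)
        # Hub update: weighted sum of authority scores over outgoing links.
        new_hub = {page: 0.0 for page in pages}
        for (src, dst), w in edges.items():
            new_hub[src] += w * auth[dst]
        hub = normalize(new_hub)
    return hub, auth


# Hypothetical example: two hub pages each link to an on-topic page
# (topic similarity 0.8) and an off-topic page (topic similarity 0.1).
edges = {
    ("hub1", "on_topic"): link_weight(0.9, 0.8),
    ("hub1", "off_topic"): link_weight(0.9, 0.1),
    ("hub2", "on_topic"): link_weight(0.5, 0.8),
    ("hub2", "off_topic"): link_weight(0.5, 0.1),
}
hub, auth = weighted_hits(edges)
```

In this sketch the off-topic page receives the same number of in-links as the on-topic page, but its lower link weights give it a lower authority score. This is the kind of effect that link weighting is intended to have on topic drift.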
         
        
            Keywords : 
Web sites; hypermedia markup languages; relevance feedback; search engines; HITS algorithm; HTML structure characteristics; Web link structure; Web pages; hyperlink induced topic search; improved topic relevance algorithm; vertical search engines; Algorithm design and analysis; Computer science; Electronic mail; HTML; Information resources; Pattern analysis; Pattern recognition; Search engines; Wavelet analysis; Web pages; HITS; Hyperlink; Link Value; Topic Drift; Topic Similarity;
         
        
        
        
            Conference_Title : 
Wavelet Analysis and Pattern Recognition, 2008. ICWAPR '08. International Conference on
         
        
            Conference_Location : 
Hong Kong
         
        
            Print_ISBN : 
978-1-4244-2238-8
         
        
            Electronic_ISBN : 
978-1-4244-2239-5
         
        
        
            DOI : 
10.1109/ICWAPR.2008.4635878