Title : 
Finding Thai Web Pages in Foreign Web Spaces
         
        
            Author : 
Somboonviwat, Kulwadee ; Tamura, Takayuki ; Kitsuregawa, Masaru
         
        
            Author_Institution : 
The University of Tokyo, Japan
         
            Abstract : 
This paper proposes language-specific web crawling (LSWC) as a method of creating large-scale, language-specific Web archives for countries with linguistic identities, such as Thailand. The LSWC strategy, which selectively gathers Thai web pages from virtually anywhere on the Web, is derived from static analyses of the Thai Web graph. We evaluated the performance of the LSWC strategy using a web crawling simulator.
         
        
            Keywords : 
Buildings; Crawlers; Information technology; Large-scale systems; Libraries; Research and development; Space technology; Uniform resource locators; Web pages; Web server
         
            Conference_Title : 
Data Engineering Workshops, 2006. Proceedings. 22nd International Conference on
         
        
            Conference_Location : 
Atlanta, GA, USA
         
        
            Print_ISBN : 
0-7695-2571-7
         
            DOI : 
10.1109/ICDEW.2006.60