Title :
A Method for Collecting Uighur Websites
Author_Institution :
Nat. Language Resource Monitoring & Res. Center Beijing, Minzu Univ. of China, Beijing, China
Abstract :
The URLs of Uighur Web sites are complex, which makes Uighur Web sites difficult to collect. First, the features of Uighur Web sites are analyzed. Then a method for collecting Uighur Web sites is introduced in three steps: first, collect Web pages that may be in Uighur; second, judge whether each Web page is in Uighur; finally, find the URL of the Uighur Web site from the URL of the Uighur Web page. Using this method, about 1,000 Uighur Web sites were collected.
Keywords :
Web sites; search engines; URL; Uighur Web page collection; Uighur Web site collection; Uighur Web site feature analysis; Educational institutions; HTML; Internet; Standards; Web pages; Uighur websites; web page collecting; web page language
Conference_Title :
Intelligent Networks and Intelligent Systems (ICINIS), 2013 6th International Conference on
Conference_Location :
Shenyang
Print_ISBN :
978-1-4799-2808-8
DOI :
10.1109/ICINIS.2013.73
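
The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a page is judged to be Uighur or how the site URL is derived, so the marker-letter heuristic, the thresholds, and the names `looks_uighur`, `site_url`, and `collect_sites` are all assumptions made for illustration. Step 1 (gathering candidate pages) is assumed to have already produced a mapping from page URL to page text. Some marker letters are shared with other languages written in Arabic script (for example Kazakh), so the heuristic would need refinement in practice.

```python
from urllib.parse import urlsplit

# Hypothetical marker set: letters common in Uighur Arabic-script text and
# uncommon in standard Arabic text. Hand-picked for illustration only.
UIGHUR_MARKERS = set("\u06D5\u06D0\u06C6\u06C8\u06C7\u06AD\u06CB\u06BE")


def is_arabic_script(ch: str) -> bool:
    """True if the character lies in the basic Unicode Arabic block."""
    return "\u0600" <= ch <= "\u06FF"


def looks_uighur(text: str, min_arabic: float = 0.5,
                 min_marker: float = 0.05) -> bool:
    """Step 2 (assumed heuristic): judge whether page text is in Uighur."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    arabic = [ch for ch in letters if is_arabic_script(ch)]
    # Reject pages that are mostly written in another script.
    if len(arabic) / len(letters) < min_arabic:
        return False
    # Require a minimum share of the assumed Uighur marker letters.
    markers = sum(1 for ch in arabic if ch in UIGHUR_MARKERS)
    return markers / len(arabic) >= min_marker


def site_url(page_url: str) -> str:
    """Step 3 (assumed rule): reduce a page URL to scheme and host."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/"


def collect_sites(pages: dict) -> set:
    """Return the site URLs of all candidate pages judged to be Uighur."""
    return {site_url(url) for url, text in pages.items() if looks_uighur(text)}
```

Reducing a page URL to its scheme and host is the simplest possible rule. Since the abstract notes that Uighur site URLs are complex (a site may, for instance, live under a subdirectory of a larger host), a real collector would likely need a more careful mapping from page to site.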