Regional Information Center for Science and Technology - Character Code Conversion and Misspelled Word Processing in Uyghur, Kazak, Kyrgyz Multilingual Information Retrieval System

DocumentCode :

2352801

Title :

Character Code Conversion and Misspelled Word Processing in Uyghur, Kazak, Kyrgyz Multilingual Information Retrieval System

Author :

Tohti, Turdi ; Musajan, Winira ; Hamdulla, Askar

Author_Institution :

Sch. of Inf. Sci. & Eng., Xinjiang Univ., Urumqi

fYear :

2008

fDate :

23-25 July 2008

Firstpage :

139

Lastpage :

144

Abstract :

Spelling errors often occur in web pages and in user query phrases, and the non-Unicode character coding schemes used by some Uyghur, Kazak, and Kyrgyz language websites seriously affect the recall and accuracy of the Uyghur, Kazak, and Kyrgyz information retrieval system (UKKIRS). This paper studies these practical problems and proposes solutions. To handle the variety of character codings, it proposes a method for converting non-Unicode character codes to Unicode. To handle spelling errors, it proposes a reconstruction method and a root-expansion method based on user query phrases. The experimental results indicate that the proposed algorithms solve these problems well and are well suited to UKKIRS.
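The abstract does not describe the algorithms in detail, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. The table `LEGACY_TO_UNICODE` and the functions `to_unicode` and `suggest` are hypothetical names introduced here; the legacy code points are invented for illustration and do not describe any real Uyghur, Kazak, or Kyrgyz font encoding. Candidate suggestion is approximated with Python's standard `difflib` similarity matching, which stands in for, and is not, the paper's reconstruction or root-expansion method.

```python
import difflib

# Hypothetical legacy-to-Unicode table (assumption): the keys are invented
# legacy code points; the values are real Unicode Arabic-script letters.
LEGACY_TO_UNICODE = {
    0x00C0: "\u0626",  # ARABIC LETTER YEH WITH HAMZA ABOVE
    0x00C1: "\u06D5",  # ARABIC LETTER AE
    0x00C2: "\u06C7",  # ARABIC LETTER U
    0x00C3: "\u064A",  # ARABIC LETTER YEH
    0x00C4: "\u063A",  # ARABIC LETTER GHAIN
    0x00C5: "\u0631",  # ARABIC LETTER REH
}


def to_unicode(text: str) -> str:
    """Map each legacy code point to Unicode; leave other characters unchanged."""
    return text.translate(LEGACY_TO_UNICODE)


def suggest(query_word: str, lexicon: list[str], n: int = 3) -> list[str]:
    """Return up to n lexicon entries that are similar to a possibly misspelled word.

    Uses difflib's sequence similarity as a stand-in (assumption), not the
    paper's method.
    """
    return difflib.get_close_matches(query_word, lexicon, n=n, cutoff=0.6)


if __name__ == "__main__":
    legacy_text = "\u00C0\u00C2\u00C3\u00C4\u00C2\u00C5"
    print(to_unicode(legacy_text))
    print(suggest("retreival", ["retrieval", "recall", "root"]))
```

In a retrieval pipeline, a conversion step of this kind would plausibly run on both crawled pages and incoming queries before indexing or matching, so that all text is compared in a single encoding.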

Keywords :

Web sites; information retrieval systems; natural language processing; word processing; UKKIRS; Uyghur, Kazak, and Kyrgyz information retrieval system; Uyghur, Kazak, and Kyrgyz language based websites; character code conversion; misspelled word processing; multilingual information retrieval system; non-Unicode character coding scheme; user query phrases; web pages; Code standards; Information analysis; Information retrieval; Information science; Information technology; Natural languages; Query processing; Text processing; Web pages; Writing; Candidate Suggestion; Character coding; Code conversion; Root expansion

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Advanced Language Processing and Web Information Technology, 2008. ALPIT '08. International Conference on

Conference_Location :

Dalian, Liaoning

Print_ISBN :

978-0-7695-3273-8

Type :

conf

DOI :

10.1109/ALPIT.2008.95

Filename :

4584356

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2352801