DocumentCode :
3205060
Title :
Effects of term segmentation on Chinese/English cross-language information retrieval
Author :
Oard, Douglas W. ; Wang, Jianqiang
Author_Institution :
Coll. of Libr. & Inf. Services, Maryland Univ., College Park, MD, USA
fYear :
1999
fDate :
1999
Firstpage :
149
Lastpage :
157
Abstract :
The majority of recent Cross-Language Information Retrieval (CLIR) research has focused on European languages. CLIR problems that involve East Asian languages such as Chinese introduce additional challenges, because written Chinese texts lack boundaries between terms. The paper examines three Chinese segmentation techniques in combination with two variants of dictionary-based Chinese-to-English query translation. The results indicate that failure to segment terms correctly, particularly technical terms and names, can have a cascading effect that reduces retrieval effectiveness. Task-tuned segmentation algorithms and alternative term weighting strategies are suggested as productive directions for future work.
Keywords :
information retrieval; language translation; linguistics; natural languages; text analysis; CLIR problems; Chinese segmentation techniques; Chinese/English cross-language information retrieval; Cross-Language Information Retrieval research; East Asian languages; English query translation; European languages; alternative term weighting strategies; cascading effect; dictionary-based Chinese; future work; productive directions; retrieval effectiveness; task-tuned segmentation algorithms; technical terms; term segmentation; written Chinese texts; Chromium; Data mining; Dictionaries; Information retrieval; Natural language processing; Natural languages; Reactive power;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
String Processing and Information Retrieval Symposium, 1999 and International Workshop on Groupware
Conference_Location :
Cancun
Print_ISBN :
0-7695-0268-7
Type :
conf
DOI :
10.1109/SPIRE.1999.796590
Filename :
796590