Regional Information Center for Science and Technology - Evaluating a Cross-Language Semantically Enriched Search Engine

DocumentCode :

2891559

Title :

Evaluating a Cross-Language Semantically Enriched Search Engine

Author :

Zhuhadar, Leyla ; Nasraoui, Olfa

Author_Institution :

Dept. of Comput. Eng. & Comput. Sci., Univ. of Louisville, Louisville, KY, USA

fYear :

2010

fDate :

12-14 April 2010

Firstpage :

1074

Lastpage :

1079

Abstract :

This paper tackles the problem of a user who can read or use documents written in a given language but is not fluent enough in that language to choose the right query terms to find them. The design of Cross-Language Information Retrieval systems dates back to 1969, when Gerard Salton enhanced his SMART system to retrieve documents in multiple languages, English and Spanish; however, the translation process is still considered a challenging problem. This paper is devoted to the evaluation of a Cross-Language search engine that uses Natural Language Processing techniques to improve the search for documents provided in two languages, English and Spanish. The research is implemented and evaluated on a real platform, HyperManyMedia, at Western Kentucky University. The implementation of the Cross-Language search engine combines two approaches synergistically: (1) a Thesaurus-based Approach and (2) a Corpus-based Approach. In the Thesaurus-based Approach, we use a simple bilingual listing of terms, phrases, concepts, and subconcepts, where the hierarchical structure of the ontology defines the relationships between concepts and subconcepts. We also use a specific terminology that captures the domain of E-learning; these terms are associated with college names, course names, and lecture names, which are presented in both languages. In the Corpus-based Approach, we use the Term Vector Translation approach; the goal is to find statistical information about term usage between the two languages using techniques that map sets of term weights from English to Spanish and vice versa.
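
The idea of mapping sets of term weights from one language to the other can be illustrated with a minimal Python sketch. It is not the authors' implementation: the table `TRANSLATION_WEIGHTS_EN_ES`, the function `translate_term_vector`, and every weight shown are hypothetical, and in a corpus-based setting such weights would presumably be estimated from bilingual text rather than written by hand.

```python
from collections import defaultdict

# Hypothetical translation table: English term -> [(Spanish term, weight)].
# The entries and weights are invented for illustration only.
TRANSLATION_WEIGHTS_EN_ES = {
    "history": [("historia", 1.0)],
    "lecture": [("conferencia", 0.6), ("clase", 0.4)],
}


def translate_term_vector(vector, table):
    """Map a weighted source-language term vector into the target language."""
    translated = defaultdict(float)
    for term, weight in vector.items():
        # Assumption: terms without an entry (e.g. proper nouns) pass through unchanged.
        for target, share in table.get(term, [(term, 1.0)]):
            translated[target] += weight * share
    return dict(translated)


# An English query expressed as term weights, mapped to Spanish terms.
query = {"history": 2.0, "lecture": 1.0, "Kentucky": 1.0}
spanish_query = translate_term_vector(query, TRANSLATION_WEIGHTS_EN_ES)
```

Each source weight is distributed over the candidate translations in proportion to their shares, so an ambiguous term such as "lecture" contributes to several Spanish terms instead of forcing a single choice.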

Keywords :

computer aided instruction; language translation; natural language processing; ontologies (artificial intelligence); query processing; search engines; thesauri; E-learning; HyperManyMedia; SMART system; Western Kentucky University; corpus-based approach; cross-language information retrieval systems; cross-language semantically enriched search engine; hierarchical ontology structure; natural language processing techniques; query terms; specific language; term vector translation; thesaurus-based approach; translation process; Information retrieval; Information technology; Knowledge engineering; Knowledge representation; Natural language processing; Natural languages; Ontologies; Search engines; Speech processing; Web mining; cross-language; evaluation; information retrieval; ontology; search engine; semantic web;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Information Technology: New Generations (ITNG), 2010 Seventh International Conference on

Conference_Location :

Las Vegas, NV

Print_ISBN :

978-1-4244-6270-4

Type :

conf

DOI :

10.1109/ITNG.2010.237

Filename :

5501488

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2891559