Title :
Method of enriching queries by contextual information to approve of information retrieval system in Arabic
Author :
Souheyl Mallat;Houssem Abdellaoui;Mohsen Maraoui;Mounir Zrigui
Author_Institution :
LATICE Laboratory Research Department of Computer Science, University of Monastir, Tunisia
Abstract :
In this paper, we propose a method to improve the performance of information retrieval systems (IRS) by increasing the selectivity of relevant documents on the web. Indeed, a significant number of relevant documents on the web are not returned by an IRS (specifically a search engine) because of the richness of the Arabic language. As a result, the search engine does not reach high performance and does not meet users' needs. To remedy this problem, we propose a query enrichment method that proceeds in several steps. First, the significant terms (simple and compound) present in the query are identified. Then, a descriptive list is generated and assigned to each term identified as significant in the query. A descriptive list is a set of linguistic knowledge of different types (morphological, syntactic and semantic). In this paper we focus on the statistical treatment, which is based on a similarity method. This method applies Salton's TF-IDF and TF-IEF weighting functions to the list generated in the previous step. The TF-IDF function identifies relevant documents, while the role of TF-IEF is to identify the relevant sentence. The high-weight terms (terms that may be correlated with the context of the response) are incorporated into the original query. The method is applied to a corpus of documents belonging to a closed domain.
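The statistical step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes the usual Salton-style weight tf × log(N / n), and assumes that TF-IEF is the same weight computed with sentences instead of documents as the counting unit. It also works on raw whitespace tokens rather than on the descriptive lists, and the names `weight`, `best_unit`, `enrich` and the parameter `k` are hypothetical.

```python
import math


def weight(term, unit, units):
    """Salton-style weight: term frequency in `unit` times log(N / n).

    `unit` is a list of tokens and `units` is the list of all units.
    With documents as units this is TF-IDF; with sentences as units it
    is the assumed form of TF-IEF.
    """
    if not unit or term not in unit:
        return 0.0
    tf = unit.count(term) / len(unit)
    n = sum(1 for u in units if term in u)  # units containing the term
    return tf * math.log(len(units) / n)


def best_unit(query_terms, units):
    """Index of the unit with the highest summed weight over the query terms."""
    scores = [sum(weight(t, u, units) for t in query_terms) for u in units]
    return max(range(len(units)), key=scores.__getitem__)


def enrich(query, documents, k=2):
    """Append up to `k` high-weight terms from the best sentence to the query."""
    q = query.lower().split()
    # Each document becomes a list of sentences; each sentence a list of tokens.
    docs = [[s.split() for s in d.lower().split(".") if s.strip()]
            for d in documents]
    flat = [[tok for s in d for tok in s] for d in docs]

    d = best_unit(q, flat)        # TF-IDF step: pick the most relevant document
    sentences = docs[d]
    s = best_unit(q, sentences)   # TF-IEF step: pick the most relevant sentence

    # Candidate expansion terms: tokens of that sentence not already in the query,
    # ordered by their sentence-level weight (ties keep their original order).
    candidates = [t for t in dict.fromkeys(sentences[s]) if t not in q]
    candidates.sort(key=lambda t: weight(t, sentences[s], sentences),
                    reverse=True)
    return q + candidates[:k]
```

On a toy two-document English corpus such as `["ozone harms lungs. factories emit smoke", "rain fills rivers. rivers reach the sea"]`, the query `"factories smoke"` selects the first document, then its second sentence, and is enriched with `emit`. A real Arabic pipeline would need proper tokenization, morphological analysis and sentence segmentation in place of the whitespace and full-stop splitting used here.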
Keywords :
"Semantics","Pragmatics","Search engines","Labeling","Statistical analysis","Air pollution"
Conference_Title :
Information & Communication Technology and Accessibility (ICTA), 2015 5th International Conference on
DOI :
10.1109/ICTA.2015.7426926