Regional Information Center for Science and Technology (RICeST) - How to read the web in Portuguese using the never-ending language learner's principles

DocumentCode :

3581198

Title :

How to read the web in Portuguese using the never-ending language learner's principles

Author :

Duarte, Maisa C. ; Hruschka, Estevam R.

Author_Institution :

Dept. of Comput. Sci., Fed. Univ. of Sao Carlos, Sao Carlos, Brazil

fYear :

2014

Firstpage :

162

Lastpage :

167

Abstract :

An alternative to the traditional single function approximation method is the never-ending learning (NEL) approach, i.e., a learning paradigm in which the learner autonomously manages to evolve constantly, incrementally, and continuously over time. More important than merely continuing to evolve, in this paradigm acquired knowledge can be used dynamically to expand the scope and improve the performance of the learning task as a whole. The first never-ending learning system reported in the literature, NELL (Never-Ending Language Learner), is applied to the task of autonomously building a knowledge base by reading the web. Results reported so far show that NELL performs very well when reading the web in English. However, when the same machine reading task (the task of reading the web) was applied to web pages written in Portuguese, previously reported approaches could not match the performance achieved in English. In this paper we describe an approach different from those previously proposed in the literature, and we present empirical results that corroborate the hypothesis that work on preprocessing a sufficiently large corpus can be key to using the very same architecture proposed in NELL for reading the web in Portuguese (that is, reading and extracting knowledge from web pages written in Portuguese).

Keywords :

Internet; Web sites; learning (artificial intelligence); natural language processing; English; NELL; Portuguese; Web pages; World Wide Web; function approximation method; machine reading task; never-ending language learner principles; never-ending learning system; Blogs; ISO; Irrigation; Pipelines; Machine Learning; Never-Ending Learning; Read The Web;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligent Systems Design and Applications (ISDA), 2014 14th International Conference on

Print_ISBN :

978-1-4799-7937-0

Type :

conf

DOI :

10.1109/ISDA.2014.7066260

Filename :

7066260

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3581198