Author_Institution :
Dept. of Comput. Sci., Fed. Univ. of Sao Carlos, Sao Carlos, Brazil
Abstract :
An alternative to the traditional single function approximation method is the never-ending learning (NEL) approach, i.e., a learning paradigm in which the learner autonomously, incrementally and continuously evolves over time. More important than merely continuing to evolve, in this paradigm acquired knowledge can be used dynamically to expand the scope and improve the performance of the learning task as a whole. The first never-ending learning system reported in the literature, NELL (Never-Ending Language Learner), is applied to the task of autonomously building a knowledge base by reading the web. Results reported so far show that NELL performs very well when reading the web in English. However, when the same Machine Reading task (the task of reading the web) was applied to web pages written in Portuguese, previously reported approaches could not match the performance achieved in English. In this paper we describe an approach that differs from those previously proposed in the literature, and we present empirical results that corroborate the hypothesis that preprocessing a sufficiently large corpus can be key to using the very same architecture proposed in NELL to read the web in Portuguese (that is, to read and extract knowledge from web pages written in Portuguese).
Keywords :
Internet; Web sites; learning (artificial intelligence); natural language processing; English; NELL; Portuguese; Web pages; World Wide Web; function approximation method; machine reading task; never-ending language learner principles; never-ending learning system; Blogs; ISO; Irrigation; Pipelines; Machine Learning; Never-Ending Learning; Read The Web;