Title :
Comparative study between Part-of-Speech and statistical methods of text extraction in the tourism domain
Author :
Guson P. Kuntarto;Fahmi L. Moechtar;Berkah I. Santoso;Irwan P. Gunawan
Author_Institution :
Information Systems Department, Universitas Bakrie, Jakarta, Indonesia 12920
Abstract :
This paper compares two text extraction methods: a linguistic method, Part-of-Speech (POS), and a statistical method, Term Frequency-Inverse Document Frequency (TF-IDF). Text extraction was performed as part of ontology population in the Indonesian tourism domain. The paper also contributes a multimedia corpus built from three different web resources in the Balinese tourism domain. The performance of each method is evaluated using several relevance measures. The statistical method was found to give higher relevance than the linguistic method. Our analysis attributes this result to the limited set of reference terms in the initial ontology from our previous research.
Keywords :
"Ontologies","Pragmatics","Sociology","Statistical analysis","Data mining","Engines"
Conference_Title :
Information Technology Systems and Innovation (ICITSI), 2015 International Conference on
Print_ISBN :
978-1-4673-6663-2
DOI :
10.1109/ICITSI.2015.7437675