DocumentCode :
3762542
Title :
Comparative study between Part-of-Speech and statistical methods of text extraction in the tourism domain
Author :
Guson P. Kuntarto;Fahmi L. Moechtar;Berkah I. Santoso;Irwan P. Gunawan
Author_Institution :
Information Systems Department, Universitas Bakrie, Jakarta, Indonesia 12920
fYear :
2015
Firstpage :
1
Lastpage :
6
Abstract :
This paper compares two text extraction methods: a linguistic method based on Part-of-Speech (POS) and a statistical method based on Term Frequency-Inverse Document Frequency (TF-IDF). Text extraction was performed as part of ontology population in the Indonesian tourism domain. The paper also contributes a multimedia corpus built from three different web resources in the Balinese tourism domain. The performance of each method is evaluated using several relevance measures. The statistical method was found to give higher relevance than the linguistic method. Our analysis attributes this to the limited set of reference terms in the initial ontology from our previous research.
Keywords :
"Ontologies","Pragmatics","Sociology","Statistical analysis","Data mining","Engines"
Publisher :
IEEE
Conference_Title :
Information Technology Systems and Innovation (ICITSI), 2015 International Conference on
Print_ISBN :
978-1-4673-6663-2
Type :
conf
DOI :
10.1109/ICITSI.2015.7437675
Filename :
7437675