DocumentCode :
3696649
Title :
Event information extraction from Indonesian tweets using conditional random field
Author :
Fawwaz Muhammad; Masayu Leylia Khodra
Author_Institution :
School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
fYear :
2015
Firstpage :
1
Lastpage :
6
Abstract :
Information extraction is the process of deriving structured information from unstructured or semi-structured text. This research aims to build an information extraction system specialized for events in Indonesian tweets. The system consists of two main parts. The first part filters relevant tweets from irrelevant ones; it uses only a rule-based approach with an additional bag-of-words feature and achieves a best accuracy of 86%. The second part performs the extraction. In our experiments, the best configuration for the extractor module combines the multi-token tokenization method, the full feature set, and a first-order conditional random field. This combination yields an average accuracy of 74% per token.
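The two-part design described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the cue words in `EVENT_CUES`, the helper names, and the specific token features are hypothetical and are not taken from the paper, whose actual rules and feature set are not given in this record. The sketch shows a keyword-style relevance filter, a per-token feature map of the kind typically fed to a conditional random field, and the per-token accuracy metric; it does not train a CRF.

```python
import re

# Hypothetical cue words (assumption); the paper's rules and vocabulary are not listed here.
EVENT_CUES = {"konser", "seminar", "festival", "pameran", "workshop"}


def is_event_tweet(text, cues=EVENT_CUES):
    """Part 1 (assumed form): keep a tweet if any bag-of-words cue appears in it."""
    words = set(re.findall(r"\w+", text.lower()))
    return bool(words & cues)


def token_features(tokens, i):
    """Part 2 input (assumed features): a feature map for token i,
    including its neighbours, as a sequence labeler such as a CRF might consume."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def token_accuracy(gold, pred):
    """Per-token accuracy: fraction of tokens whose predicted label equals the gold label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)
```

In a full system, the feature maps for each token of a relevant tweet would be passed to a trained sequence labeler, and `token_accuracy` would compare its output labels against annotated ones.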
Keywords :
"Twitter","Feature extraction","Data mining","Tagging","Information filters"
Publisher :
IEEE
Conference_Title :
Advanced Informatics: Concepts, Theory and Applications (ICAICTA), 2015 2nd International Conference on
Print_ISBN :
978-1-4673-8142-0
Type :
conf
DOI :
10.1109/ICAICTA.2015.7335383
Filename :
7335383