DocumentCode :
3687651
Title :
Enhanced lexicon based model for web forum answer detection
Author :
Adekunle Isiaka Obasa;Naomie Salim;Atif Khan
Author_Institution :
Faculty of Computing, Universiti Teknologi Malaysia, Johor, Malaysia
fYear :
2015
Firstpage :
237
Lastpage :
243
Abstract :
A Web forum is an online community that brings together people with common interests. Within the forum, members interact to share knowledge, expertise and resources. A major issue in detecting web forum answers is establishing a good relationship between the question and the candidate answer. This relationship is often established using lexical features. Web forum text, unlike news articles, is noisy, and this hinders the performance of lexical features. In this paper, we investigate the effect of noise on most of the common lexical features used in mining web forum answers, with a view to normalizing the text to enhance the performance of the features. We propose 13 lexical features for exploration. These features belong to four different quality dimensions that can guarantee good answers. We empirically address the following questions in the paper: What category of noise is most prevalent in web forums? Which lexical mining features are most susceptible to noise? Will normalization of a forum corpus enhance the performance of lexical features in detecting web forum answers? We used three publicly available datasets of varying technical degrees for the experiments. The experimental results revealed that proper normalization of web forum corpora can yield up to a 9% increase in the performance of the lexical features.
Keywords :
"Feature extraction","Message systems","Support vector machines","Computer aided manufacturing","Terminology","Image color analysis","Color"
Publisher :
IEEE
Conference_Title :
Digital Information Processing and Communications (ICDIPC), 2015 Fifth International Conference on
Type :
conf
DOI :
10.1109/ICDIPC.2015.7323035
Filename :
7323035