DocumentCode :
3706674
Title :
Similarity in Patient Support Forums Using TF-IDF and Cosine Similarity Metrics
Author :
Mohammad Alodadi;Vandana P. Janeja
Author_Institution :
Dept. of Inf. Syst., Univ. of Maryland Baltimore County, Baltimore, MD, USA
fYear :
2015
Firstpage :
521
Lastpage :
522
Abstract :
The IEEE International Conference on Healthcare Informatics 2015 (ICHI 2015) announced a challenge in the healthcare domain concerning the quality of health inquiries on social media. The challenge problem is to reduce the repetition of posts in patient support forums. This problem becomes progressively harder to control as the number of forum users grows and the forum's older posts go unsearched. To address it, we used a model that measures the similarity of forum posts using the cosine similarity metric over term frequency-inverse document frequency (TF-IDF) weights. We applied our model to data provided by the challenge committee. Three graduate students annotated the data, and we took their agreement vote on similarity. Our model using cosine similarity and TF-IDF improved on existing models that primarily use topic modeling approaches such as Latent Dirichlet Allocation (LDA) and Latent Semantic Indexing (LSI).
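The technique named in the abstract, cosine similarity over TF-IDF weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the function names, and the three sample posts are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's
    # preprocessing is not described in the abstract.
    return text.lower().split()


def tfidf_vectors(posts):
    """Return one sparse TF-IDF vector (dict term -> weight) per post."""
    docs = [Counter(tokenize(p)) for p in posts]
    n = len(docs)
    # Document frequency: number of posts containing each term.
    df = Counter()
    for d in docs:
        df.update(d.keys())
    # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in d.items()} for d in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical forum posts, invented for this example.
posts = [
    "how to manage migraine headaches at night",
    "tips to manage migraine headaches",
    "best diet for type 2 diabetes",
]
vecs = tfidf_vectors(posts)
print(cosine(vecs[0], vecs[1]))  # posts sharing several terms
print(cosine(vecs[0], vecs[2]))  # posts sharing no terms
```

In a duplicate-detection setting, a new post could be compared against existing posts this way and flagged when its highest score passes a threshold; the abstract does not say how, or whether, the authors chose such a threshold.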
Keywords :
"Measurement","Medical services","Data models","Informatics","Conferences","Media","Indexes"
Publisher :
IEEE
Conference_Title :
Healthcare Informatics (ICHI), 2015 International Conference on
Type :
conf
DOI :
10.1109/ICHI.2015.99
Filename :
7349760