Content-based assessment of the credibility of online healthcare information

Author

Park, Mirang ; Sampathkumar, Hariprasad ; Luo, Bo ; Chen, Xue-wen

Author_Institution

Electr. Eng. & Comput. Sci., Univ. of Kansas, Lawrence, KS, USA

fYear

2013

fDate

6-9 Oct. 2013

Firstpage

51

Lastpage

58

Abstract

The growth of web technologies such as social networks, wikis, blogs and RSS feeds has produced a large amount of data in healthcare informatics. However, not all health information provided online is trustworthy. Even though many experts are involved in publishing trusted information, it is difficult for the general population to determine the credibility of that information. A reliable mechanism to automatically determine the trustworthiness of online healthcare information is therefore highly desirable. In this paper, we propose two novel approaches, based on Topic Modeling and Hidden Markov Models (HMMs), that can be applied over a large volume of online healthcare data to assess its trustworthiness. Traditional Topic Modeling is based solely on the “bag-of-words” model; we additionally consider the semantics of the content to identify the underlying topics in a sentence. For the HMM approach, we built our trustworthy and suspicious models after analyzing the characteristics of sentences from trustworthy and suspicious websites. Both methods perform well at assessing trustworthiness; however, the HMM approach is less sophisticated in capturing the semantics of sentences. We evaluated our method on a randomly chosen real dataset and achieved about 90% accuracy in identifying the trustworthiness of the content.

Keywords

Internet; content management; health care; hidden Markov models; trusted computing; HMM; Web technologies; bag-of-words model; content semantics; content trustworthiness identification; content-based assessment; healthcare informatics; hidden Markov model; online healthcare information credibility; online healthcare information trustworthiness; suspicious model; topic modeling; trustworthy model; Electronic publishing; Encyclopedias; Medical services; Search engines; Big Data; Topic Discovery

fLanguage

English

Publisher

IEEE

Conference_Titel

Big Data, 2013 IEEE International Conference on

Conference_Location

Silicon Valley, CA

Type

conf

DOI

10.1109/BigData.2013.6691758

Filename

6691758