Title :
Content-based assessment of the credibility of online healthcare information
Author :
Park, Mirang ; Sampathkumar, Hariprasad ; Luo, Bo ; Chen, Xue-wen
Author_Institution :
Electr. Eng. & Comput. Sci., Univ. of Kansas, Lawrence, KS, USA
Abstract :
A large amount of healthcare information is now produced online, driven by the growth of web technologies such as social networks, wikis, blogs, and RSS feeds. However, not all health information provided online is trustworthy. Even though many experts are involved in publishing trusted information, it is difficult for the general population to determine the credibility of that information. A reliable mechanism to automatically determine the trustworthiness of online healthcare information is therefore highly desirable. In this paper, we propose two novel approaches, one based on Topic Modeling and one based on Hidden Markov Models (HMMs), that can be applied over a large volume of online healthcare data to assess its trustworthiness. Traditional Topic Modeling is based solely on the “bag-of-words” model; we additionally consider the semantics of the content to identify the underlying topics in a sentence. For the HMM approach, we built our trustworthy and suspicious models after analyzing the characteristics of sentences from healthcare websites. Both methods perform well in assessing trustworthiness, although the HMM approach is less able to capture the semantics of sentences. We evaluated our approaches on a randomly chosen real dataset and achieved about 90% accuracy in identifying the trustworthiness of the content.
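The two-model HMM idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the hidden states, the observation alphabet, or any parameters, so everything below is assumed for illustration. Each sentence sequence is taken to be already encoded as integer symbols (hypothetically 0 = cited or hedged wording, 1 = neutral, 2 = promotional), and the two parameter sets `TRUSTWORTHY` and `SUSPICIOUS` are invented numbers. The sketch scores a sequence under each model with the scaled forward algorithm and picks the model with the higher log-likelihood.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model)."""
    n_states = len(start)
    # Initialisation: alpha_0(i) = start[i] * emit[i][obs[0]]
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: propagate through transitions, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # The log of the sequence probability is the sum of the log scale factors.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


# Hypothetical parameters (assumptions, not taken from the paper).
TRUSTWORTHY = dict(
    start=[0.7, 0.3],
    trans=[[0.8, 0.2], [0.4, 0.6]],
    emit=[[0.6, 0.3, 0.1], [0.3, 0.5, 0.2]],
)
SUSPICIOUS = dict(
    start=[0.4, 0.6],
    trans=[[0.5, 0.5], [0.2, 0.8]],
    emit=[[0.2, 0.4, 0.4], [0.1, 0.2, 0.7]],
)


def classify(obs):
    """Label a symbol sequence by whichever model explains it better."""
    lt = log_likelihood(obs, **TRUSTWORTHY)
    ls = log_likelihood(obs, **SUSPICIOUS)
    return "trustworthy" if lt >= ls else "suspicious"


if __name__ == "__main__":
    print(classify([0, 0, 1, 0]))
    print(classify([2, 2, 1, 2]))
```

In practice the parameters would be estimated from labelled sentences rather than set by hand, and the mapping from sentences to observation symbols is the part that carries most of the modelling effort.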
Keywords :
Internet; content management; health care; hidden Markov models; trusted computing; HMM; Web technologies; bag-of-words model; content semantics; content trustworthiness identification; content-based assessment; healthcare informatics; hidden Markov model; online healthcare information credibility; online healthcare information trustworthiness; suspicious model; topic modeling; trustworthy model; Electronic publishing; Encyclopedias; Hidden Markov models; Internet; Medical services; Search engines; Big Data; Healthcare Informatics; Hidden Markov Model; Topic Discovery;
Conference_Title :
Big Data, 2013 IEEE International Conference on
Conference_Location :
Silicon Valley, CA
DOI :
10.1109/BigData.2013.6691758