• DocumentCode
    659609
  • Title
    Content-based assessment of the credibility of online healthcare information
  • Author
    Park, Mirang ; Sampathkumar, Hariprasad ; Luo, Bo ; Chen, Xue-wen
  • Author_Institution
    Electr. Eng. & Comput. Sci., Univ. of Kansas, Lawrence, KS, USA
  • fYear
    2013
  • fDate
    6-9 Oct. 2013
  • Firstpage
    51
  • Lastpage
    58
  • Abstract
    Currently, a large amount of data is produced in healthcare informatics due to the growth of web technologies such as social networks, wikis, blogs and RSS feeds. However, not all health information provided online is trustworthy. Even though many experts are involved in publishing trusted information, it is difficult for the general population to determine the credibility of that information. A reliable mechanism to automatically determine the trustworthiness of online healthcare information is therefore highly desirable. In this paper, we propose two novel approaches, based on Topic Modeling and Hidden Markov Models (HMMs), that can be applied to a large volume of online healthcare data to assess its trustworthiness. Traditional Topic Modeling is based solely on the “bag-of-words” model; we additionally consider the semantics of the content to identify the underlying topics in a sentence. For the HMM approach, we built our trustworthy and suspicious models after analyzing the characteristics of sentences from websites of each kind. Both methods assess trustworthiness well; however, the HMM approach is less capable of capturing the semantics of sentences. We evaluated our methods on a randomly chosen real-world dataset and achieved about 90% accuracy in identifying the trustworthiness of the content.
  • Keywords
    Internet; content management; health care; hidden Markov models; trusted computing; HMM; Web technologies; bag-of-words model; content semantics; content trustworthiness identification; content-based assessment; healthcare informatics; hidden Markov model; online healthcare information credibility; online healthcare information trustworthiness; suspicious model; topic modeling; trustworthy model; Electronic publishing; Encyclopedias; Hidden Markov models; Internet; Medical services; Search engines; Big Data; Healthcare Informatics; Hidden Markov Model; Topic Discovery;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Big Data, 2013 IEEE International Conference on
  • Conference_Location
    Silicon Valley, CA
  • Type
    conf
  • DOI
    10.1109/BigData.2013.6691758
  • Filename
    6691758