• DocumentCode
    3758934
  • Title
    Lexical Characteristics Analysis of Chinese Clinical Documents
  • Author
    Meizhi Ju; Huilong Duan; Haomin Li

  • Author_Institution
    Coll. of Biomed. Eng. &
  • fYear
    2015
  • Firstpage
    121
  • Lastpage
    125
  • Abstract
    Understanding the lexical characteristics of clinical documents is the foundation of the sublanguage-based Medical Language Processing (MLP) approach. However, few studies have focused on the lexical characteristics of Chinese clinical documents. In this study, a lexical characteristics analysis at both the syntactic and semantic levels was conducted on a clinical corpus of 3,500 clinical documents generated during daily practice. The analysis was based on the automatic tagging results of a lexicon-based part-of-speech (POS) and semantic tagging method. The medical lexicon contains 237,291 entries annotated with both semantic and syntactic classes. The normalized frequencies of different terms and of syntactic and semantic classes were calculated and visualized. The major contributions of this paper are a wide-coverage Chinese medical semantic lexicon and a presentation of the lexical characteristics of Chinese clinical documents. Both will lay a good foundation for sublanguage-based MLP studies in China.
  • Keywords
    "Semantics","Syntactics","Medical diagnostic imaging","Tagging","Noise measurement","Grammar","Natural language processing"
  • Publisher
    ieee
  • Conference_Titel
    Information Technology in Medicine and Education (ITME), 2015 7th International Conference on
  • Type
    conf
  • DOI
    10.1109/ITME.2015.51
  • Filename
    7429111