• DocumentCode
    2767916
  • Title

    Mining the Relationship between Gene and Disease from Literature

  • Author

    Xu, Yan ; Chang, Zhiqiang ; Hu, Wen ; Yu, Lili ; DuanMu, Huizi ; Li, Xia

  • Author_Institution
    Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
  • Volume
    7
  • fYear
    2009
  • fDate
    14-16 Aug. 2009
  • Firstpage
    482
  • Lastpage
    486
  • Abstract
    Text mining refers to extracting high-quality information, including entities and the relationships between them, from text. Although several methods have been applied to extract protein interaction relationships and other information, few studies have focused on processing sentences to extract precise relationships. This paper provides several strategies for filtering out sentences that contain non-positive relationships, and then uses patterns of entities and relationship phrases to extract the relationships between genes and diseases. We selected abstracts associated with "receptor" and used 1000 sentences containing the entity names and relationship phrases as the test set. The results show that the method achieved a precision of 84.6%, a recall of 77.5%, and an F-score of 80.9%. Moreover, we analyzed the common problems that frequently arise in the process of extracting the relationships.
  • Keywords
    data mining; diseases; genetics; text analysis; word processing; gene-disease relationship; high-quality information extraction; protein interaction relationships; text mining; Abstracts; Computer science; Data mining; Databases; Diseases; Drugs; Educational institutions; Fuzzy systems; Proteins; Text mining; Relationship Extracting; Sentence Splitting; Sentence filtering; Text Mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09. Sixth International Conference on
  • Conference_Location
    Tianjin
  • Print_ISBN
    978-0-7695-3735-1
  • Type

    conf

  • DOI
    10.1109/FSKD.2009.42
  • Filename
    5360057