• DocumentCode
    419335
  • Title
    FigSearch: using maximum entropy classifier to categorize biological figures
  • Author
    Liu, Fang ; Jenssen, Tor-Kristian ; Nygaard, Vegard ; Sack, John ; Hovig, Eivind
  • Author_Institution
    Norwegian Radium Hosp., Oslo, Norway
  • fYear
    2004
  • fDate
    16-19 Aug. 2004
  • Firstpage
    476
  • Lastpage
    477
  • Abstract
    Figures in scientific papers are an intuitive and concise way of presenting knowledge. With more attention being paid to full-text mining in bioinformatics, we initiated an effort to study figures in full-text articles. FigSearch is a prototype figure legend indexing and classification system that uses both text mining and supervised machine learning. We defined schematic representations of protein interactions and signaling events as the figure type of interest. A maximum entropy classifier categorizes each figure as relevant or non-relevant under this definition by assigning it an estimated likelihood. One advantage of the maximum entropy principle is that it provides a probability for each decision instead of a binary assignment. In our pilot study, FigSearch showed satisfactory performance in a preliminary validation by domain experts. Such a system can be useful in applications such as a publisher's website, in the construction of bio-picture galleries, or as an aid to other complex text-mining projects.
  • Keywords
    biology computing; classification; data mining; entropy; indexing; learning (artificial intelligence); molecular biophysics; proteins; FigSearch; bioinformatics; biological figures; biopicture gallery constructions; classification system; figure legend indexing system; full-text mining; knowledge presentation; likelihood estimation; maximum entropy classifier; protein interactions; protein signaling events; publisher website; scientific papers; supervised machine learning; Bioinformatics; Cancer; Entropy; Hospitals; Indexing; Machine learning; Milling machines; Neoplasms; Proteins; Prototypes;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
  • Print_ISBN
    0-7695-2194-0
  • Type
    conf
  • DOI
    10.1109/CSB.2004.1332465
  • Filename
    1332465
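
The abstract's point that a maximum entropy classifier returns a probability instead of a binary label can be illustrated with a minimal sketch. For two classes, a maximum entropy model is equivalent to logistic regression, so the code below fits one over bag-of-words features from figure legends. This is not the FigSearch implementation: the function names, the toy legends, the labels, and the hyperparameters are all assumptions made for illustration, and the record does not describe the features or training procedure the authors used.

```python
import math
from collections import defaultdict

# Hypothetical toy legends (not from the paper): 1 = schematic of protein
# interactions / signaling events, 0 = any other figure type.
TOY_LEGENDS = [
    ("Schematic model of the signaling pathway showing protein interactions", 1),
    ("Diagram of kinase signaling cascade and receptor interactions", 1),
    ("Proposed model of protein complex interactions in the pathway", 1),
    ("Western blot analysis of protein expression levels", 0),
    ("Bar graph of cell counts after treatment", 0),
    ("Microscopy image of stained tissue sections", 0),
]


def tokenize(text):
    """Lowercase the legend and split it into punctuation-stripped words."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w]


def sigmoid(score):
    """Map a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))


def train_maxent(examples, epochs=200, lr=0.5, l2=0.01):
    """Fit binary maxent (logistic regression) weights by stochastic
    gradient ascent on the L2-regularized log-likelihood.
    All hyperparameter values here are illustrative assumptions."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:
            feats = set(tokenize(text))  # binary word-presence features
            p = sigmoid(bias + sum(weights[f] for f in feats))
            err = label - p  # gradient of the log-likelihood w.r.t. the score
            bias += lr * err
            for f in feats:
                weights[f] += lr * (err - l2 * weights[f])
    return weights, bias


def relevance_probability(model, legend):
    """Return the estimated probability that a legend is 'relevant'."""
    weights, bias = model
    feats = set(tokenize(legend))
    return sigmoid(bias + sum(weights.get(f, 0.0) for f in feats))


model = train_maxent(TOY_LEGENDS)
for legend in (
    "Schematic diagram of signaling pathway interactions",
    "Western blot of expression levels after treatment",
):
    print(f"{relevance_probability(model, legend):.2f}  {legend}")
```

Because the output is a probability, a downstream application could rank figures by score or choose its own threshold, which a hard relevant/non-relevant label would not allow.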