DocumentCode :
652082
Title :
Automatic Information Extraction in the Medical Domain by Cross-Lingual Projection
Author :
Ben Abacha, Asma ; Zweigenbaum, Pierre ; Max, Aurelien
Author_Institution :
Resource Centre for Health Care Technol., Centre de Rech. Public Henri Tudor, Luxembourg, Luxembourg
fYear :
2013
fDate :
9-11 Sept. 2013
Firstpage :
82
Lastpage :
88
Abstract :
This research tackles the automatic annotation of texts written in a language L1 by exploiting resources and tools available for another language L2. Our approach relies on a parallel corpus (L1-L2) aligned at the sentence and word levels. To address the lack of annotated French corpora in the medical field, we focus on the French-English language pair and annotate French medical texts automatically. This article concentrates on Medical Entity Recognition (MER). We evaluate our MER method on the English corpus and the projection of its annotations onto the French corpus. Because the parallel corpus is extracted from the Web, we also discuss the problem of scalability and propose a statistical method to handle heterogeneous corpora.
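The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `project_annotations`, the `PROBLEM` label, the example sentences, and the hand-written alignment links are all assumptions made for illustration. The sketch assumes that entity labels are available for each English token and that a word aligner has already produced (English index, French index) links.

```python
from typing import List, Set, Tuple


def project_annotations(
    src_labels: List[str],
    alignment: Set[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Copy token-level entity labels from source to target tokens.

    `alignment` holds (source_index, target_index) word-alignment links.
    Target tokens without a link to a labeled source token stay "O".
    """
    tgt_labels = ["O"] * tgt_len
    for src_idx, tgt_idx in sorted(alignment):
        label = src_labels[src_idx]
        # Keep the first projected label if several links reach one token.
        if label != "O" and tgt_labels[tgt_idx] == "O":
            tgt_labels[tgt_idx] = label
    return tgt_labels


# Hypothetical sentence pair (not taken from the paper's corpus).
english = ["The", "patient", "has", "chronic", "asthma", "."]
french = ["Le", "patient", "souffre", "d'", "asthme", "chronique", "."]

# Hypothetical entity labels on the English side.
english_labels = ["O", "O", "O", "PROBLEM", "PROBLEM", "O"]

# Hypothetical word alignment; note the adjective/noun reordering.
links = {(0, 0), (1, 1), (2, 2), (3, 5), (4, 4), (5, 6)}

projected = project_annotations(english_labels, links, len(french))
for token, label in zip(french, projected):
    print(f"{token}\t{label}")
```

In this sketch the French tokens "asthme" and "chronique" inherit the label of their aligned English counterparts, while the unaligned "d'" stays unlabeled. A real pipeline would also need to handle noisy or missing alignment links.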
Keywords :
medical computing; natural language processing; statistical analysis; text analysis; English corpus; French medical texts; French-English language; MER method; annotated French corpus; automatic annotation; automatic information extraction; cross lingual projection; medical domain; medical entity recognition; medical field; parallel corpus; statistical method; Diseases; Feature extraction; Information retrieval; Manuals; Medical diagnostic imaging; Semantics; cross-lingual projection; medical entity recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Healthcare Informatics (ICHI), 2013 IEEE International Conference on
Conference_Location :
Philadelphia, PA
Type :
conf
DOI :
10.1109/ICHI.2013.25
Filename :
6680464