Regional Information Center for Science and Technology - De-identification in natural language processing

DocumentCode :

270736

Title :

De-identification in natural language processing

Author :

Vincze, Veronika ; Farkas, Richárd

Author_Institution :

MTA-SZTE Res. Group on Artificial Intell., Univ. of Szeged, Szeged, Hungary

fYear :

2014

fDate :

26-30 May 2014

Firstpage :

1300

Lastpage :

1303

Abstract :

Natural language processing (NLP) systems usually require a huge amount of textual data, but the publication of such datasets is often hindered by privacy and data protection issues. Here, we discuss the questions of de-identification related to three NLP areas, namely clinical NLP, NLP for social media, and information extraction from resumes. We also illustrate how de-identification is related to named entity recognition, and we argue that de-identification tools can be successfully built on named entity recognizers.

Keywords :

data privacy; natural language processing; NLP areas; NLP systems; data protection; information extraction; privacy protection; social media; textual data; Databases; Educational institutions; Electronic mail; Informatics; Information retrieval; Media

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2014 37th International Convention on

Conference_Location :

Opatija

Print_ISBN :

978-953-233-081-6

Type :

conf

DOI :

10.1109/MIPRO.2014.6859768

Filename :

6859768

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=270736