DocumentCode :
270736
Title :
De-identification in natural language processing
Author :
Vincze, Veronika ; Farkas, Richárd
Author_Institution :
MTA-SZTE Res. Group on Artificial Intell., Univ. of Szeged, Szeged, Hungary
fYear :
2014
fDate :
26-30 May 2014
Firstpage :
1300
Lastpage :
1303
Abstract :
Natural language processing (NLP) systems usually require a huge amount of textual data but the publication of such datasets is often hindered by privacy and data protection issues. Here, we discuss the questions of de-identification related to three NLP areas, namely, clinical NLP, NLP for social media and information extraction from resumes. We also illustrate how de-identification is related to named entity recognition and we argue that de-identification tools can be successfully built on named entity recognizers.
Keywords :
data privacy; natural language processing; NLP areas; NLP systems; data protection; information extraction; privacy protection; social media; textual data; Databases; Educational institutions; Electronic mail; Informatics; Information retrieval; Media
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2014 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)
Conference_Location :
Opatija
Print_ISBN :
978-953-233-081-6
Type :
conf
DOI :
10.1109/MIPRO.2014.6859768
Filename :
6859768