• DocumentCode
    1726589
  • Title

    Automatic Anonymization of Natural Languages Texts Posted on Social Networking Services and Automatic Detection of Disclosure

  • Author

    Nguyen-Son, Hoang-Quoc ; Nguyen, Quoc-Binh ; Tran, Minh-Triet ; Nguyen, Dinh-Thuc ; Yoshiura, Hiroshi ; Echizen, Isao

  • Author_Institution
    Univ. of Sci., Ho Chi Minh City, Vietnam
  • fYear
    2012
  • Firstpage
    358
  • Lastpage
    364
  • Abstract
    One approach to overcoming the problem of too much information about a user being disclosed on social networking services (by the user or by the user's friends) through natural language texts (blogs, comments, status updates, etc.) is to anonymize the texts. However, determining which information is sensitive and should thus be anonymized is a challenging problem. Sensitive information is any information about a user that could be used to identify the user. We have developed an algorithm that uses generalization to anonymize sensitive information in text before it is posted. Synonyms for the anonymized information are used as fingerprints for detecting a discloser of the information. The fingerprints are quantified using the modified discernability metric to enable an appropriate level of anonymity to be used for each group of the user's friends. A fingerprint cannot be converted into another one, so a person cannot be incorrectly identified as having revealed sensitive information. Use of the algorithm to control the disclosure of information on Facebook demonstrated that it works well not only in social networking but also in other areas (health, religion, politics, military, etc.) that store sensitive information.
  • Keywords
    natural language processing; security of data; social networking (online); text analysis; Facebook; automatic anonymization; automatic detection; information disclosure; fingerprints; natural languages texts; sensitive information; social networking services; synonyms; Blogs; Educational institutions; Facebook; Measurement; Privacy; USA Councils; Generalization; Social networking service; Synonym; Text anonymous fingerprinting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Availability, Reliability and Security (ARES), 2012 Seventh International Conference on
  • Conference_Location
    Prague
  • Print_ISBN
    978-1-4673-2244-7
  • Type

    conf

  • DOI
    10.1109/ARES.2012.18
  • Filename
    6329205