• DocumentCode
    182062
  • Title

    A k-anonymity method based on search engine query statistics for disaster impact statements

  • Author

    Oguri, Hiroki ; Sonehara, Noboru

  • Author_Institution
    Multidisciplinary Sch., Inf. Dept., NIFTY Corp., Grad. University for Adv. Studies, Tokyo, Japan
  • fYear
    2014
  • fDate
    8-12 Sept. 2014
  • Firstpage
    447
  • Lastpage
    454
  • Abstract
    Privacy is a major concern in the management of big data, especially for datasets that contain sensitive personal information. Personal information is frequently used in marketing analyses, and it can also be used to assess the damage at the time of a disaster. One model that is widely used to protect privacy is k-anonymity, which can be generally defined as a clustering method in which any record in a dataset is indistinguishable from at least (k-1) other records in the same dataset. Most approaches to k-anonymity suffer from huge information loss due to the abstraction of continuous numerical and categorical attributes that have a hierarchical structure. It is difficult to use conventional k-anonymity with actual Internet services because of the computational complexity and the loss of value stemming from the loss of information. In this paper, we propose an anonymization algorithm that supports both marketing analysis and disaster analysis. In ordinary times, we can analyze personal data with this algorithm using SEM price, and in times of disaster, we ensure information anonymity according to the number of times a searched word appears and distribute only the necessary information. This approach makes it possible to calculate only the necessary data and to maintain a sufficient k-anonymized level. Application of this method to actual data showed that using an index number of the occurrences of the search term makes it possible to anonymize the information while preferentially partitioning disaster locations.
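    The k-anonymity definition in the abstract can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper; the function name `is_k_anonymous`, the field names (`age_band`, `area`, `query`), and the sample records are hypothetical and chosen only for illustration. The check simply groups records by their quasi-identifier values and verifies that every group contains at least k records.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records (hypothetical helper, not from the paper)."""
    # Count how many records share each quasi-identifier combination.
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    # Each record must be indistinguishable from at least (k - 1) others.
    return all(count >= k for count in groups.values())


# Hypothetical sample data: two records share (age_band, area), one does not.
records = [
    {"age_band": "30-39", "area": "Tokyo", "query": "shelter"},
    {"age_band": "30-39", "area": "Tokyo", "query": "water"},
    {"age_band": "40-49", "area": "Osaka", "query": "train"},
]

quasi_identifiers = ["age_band", "area"]
two_anonymous_all = is_k_anonymous(records, quasi_identifiers, 2)
two_anonymous_tokyo = is_k_anonymous(records[:2], quasi_identifiers, 2)
```

    In this sample, the single Osaka record forms a group of size one, so the full dataset is not 2-anonymous, while the two Tokyo records on their own are.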
  • Keywords
    Big Data; Internet; query processing; search engines; statistics; Internet services; SEM price; anonymous algorithm; big data; computational complexity; disaster impact statements; index number; information anonymity; k-anonymized level; k-anonymity method; partitioning disaster locations; personal data; privacy protection; search engine query statistics; search term; sensitive personal information; Availability; Security; Algorithm; Big Data mining; Big Data security; Privacy preserving; k-anonymity;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Availability, Reliability and Security (ARES), 2014 Ninth International Conference on
  • Conference_Location
    Fribourg
  • Type

    conf

  • DOI
    10.1109/ARES.2014.68
  • Filename
    6980317