Title :
A k-anonymity method based on search engine query statistics for disaster impact statements
Author :
Oguri, Hiroki ; Sonehara, Noboru
Author_Institution :
Multidisciplinary Sch., Inf. Dept., NIFTY Corp., Grad. University for Adv. Studies, Tokyo, Japan
Abstract :
Privacy is a major concern in the management of big data, especially for datasets that contain sensitive personal information. Personal information is frequently used in marketing analyses, and it can also be used to assess damage at the time of a disaster. One widely used privacy model is k-anonymity, which can be generally defined as a clustering method in which any record in a dataset is indistinguishable from at least (k-1) other records in the same dataset. Most approaches to k-anonymity suffer from large information loss due to the abstraction of continuous numerical attributes and of categorical attributes that have a hierarchical structure. Conventional k-anonymity is difficult to use with actual Internet services because of its computational complexity and the loss of value that stems from this loss of information. In this paper, we propose an anonymization algorithm that supports both marketing analysis and disaster analysis. In ordinary times, personal data can be analyzed with this algorithm using SEM price; in times of disaster, information anonymity is ensured according to the number of times a search term appears, and only the necessary information is distributed. This approach makes it possible to process only the necessary data while maintaining a sufficient k-anonymized level. Application of this method to actual data showed that using an index of the number of occurrences of the search term makes it possible to anonymize the information while preferentially partitioning disaster locations.
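
The k-anonymity definition in the abstract can be illustrated with a minimal sketch. This is only a check of the general property (each record shares its quasi-identifier values with at least k-1 others), not the algorithm proposed in the paper. The function name `is_k_anonymous`, the field names `region` and `age_band`, and the sample records are all hypothetical and chosen for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records (the general k-anonymity property)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Count how many records share each quasi-identifier combination.
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical records; the values are illustrative, not data from the paper.
records = [
    {"region": "Tohoku", "age_band": "30-39"},
    {"region": "Tohoku", "age_band": "30-39"},
    {"region": "Kanto", "age_band": "20-29"},
    {"region": "Kanto", "age_band": "20-29"},
    {"region": "Kanto", "age_band": "40-49"},
]

# One record is alone in its (region, age_band) group, so k=2 fails.
print(is_k_anonymous(records, ["region", "age_band"], 2))
# Generalizing away age_band leaves groups of size 2 and 3, so k=2 holds.
print(is_k_anonymous(records, ["region"], 2))
```

Dropping `age_band` in the second call mirrors the abstraction step that the abstract describes as a source of information loss: the dataset becomes k-anonymous only after an attribute is coarsened or removed.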
Keywords :
Big Data; Internet; query processing; search engines; statistics; Internet services; SEM price; anonymous algorithm; big data; computational complexity; disaster impact statements; index number; information anonymity; k-anonymized level; k-anonymity method; partitioning disaster locations; personal data; privacy protection; search engine query statistics; search term; sensitive personal information; Availability; Security; Algorithm; Big Data mining; Big Data security; Privacy preserving; k-anonymity
Conference_Title :
Availability, Reliability and Security (ARES), 2014 Ninth International Conference on
Conference_Location :
Fribourg
DOI :
10.1109/ARES.2014.68