• DocumentCode
    172478
  • Title
    A hierarchical clustering method for big data oriented ciphertext search
  • Author
    Chi Chen ; Xiaojie Zhu ; Peisong Shen ; Jiankun Hu
  • Author_Institution
    State Key Lab. of Inf. Security, Inst. of Inf. Eng., Beijing, China
  • fYear
    2014
  • fDate
    April 27 2014-May 2 2014
  • Firstpage
    559
  • Lastpage
    564
  • Abstract
    With the wide use of cloud services, the volume of data stored in data centers has grown dramatically, which makes real-time information retrieval much more difficult than before. Furthermore, text information is usually encrypted before being outsourced to data centers in order to protect users' data privacy. Current techniques for searching encrypted data do not perform well in such a massive data environment. In this paper, a hierarchical clustering method for ciphertext search in a big data environment is proposed. The proposed approach clusters the documents based on a minimum similarity threshold, and then partitions the resulting clusters into sub-clusters until the constraint on the maximum cluster size is satisfied. In the search phase, the computational complexity of this approach grows linearly while the size of the document collection grows exponentially. In addition, the retrieved documents are more closely related to one another than those returned by traditional methods. An experiment was conducted on a collection built from the last ten years of IEEE INFOCOM publications, comprising about 3000 documents and nearly 5300 keywords. The results validate the proposed approach.
  • Keywords
    Big Data; cloud computing; computational complexity; cryptography; data privacy; information retrieval; pattern clustering; text analysis; Big Data oriented ciphertext search; IEEE INFOCOM publications; cloud services; data center; document clustering; document collection; document retrieval; encrypted data; hierarchical clustering method; linear computational complexity; minimum similarity threshold; real-time information retrieval; subclusters; text information encryption; user data privacy protection; Conferences; Equations; Indexes; Servers; Vectors; ciphertext retrieval; hierarchical clustering; multi-keyword ranked search
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Communications Workshops (INFOCOM WKSHPS), 2014 IEEE Conference on
  • Conference_Location
    Toronto, ON
  • Type
    conf
  • DOI
    10.1109/INFCOMW.2014.6849292
  • Filename
    6849292
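
The two-stage clustering and the top-down search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it works on plaintext term vectors and leaves out encryption entirely, and the greedy centroid assignment, the seed-based bisection, and every name in it (`threshold_clusters`, `bisect`, `build_index`, `search`, `min_sim`, `max_size`) are assumptions chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def threshold_clusters(docs, ids, min_sim):
    """Stage 1 (assumed greedy variant): join the most similar cluster whose
    centroid reaches min_sim, otherwise open a new cluster."""
    clusters = []
    for i in ids:
        best, best_sim = None, min_sim
        for c in clusters:
            s = cosine(docs[i], centroid([docs[j] for j in c]))
            if s >= best_sim:
                best, best_sim = c, s
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters

def bisect(docs, ids):
    """Split ids (length >= 2) into two non-empty groups around two dissimilar seeds."""
    a = ids[0]
    b = min(ids[1:], key=lambda j: cosine(docs[a], docs[j]))
    left = [j for j in ids if cosine(docs[j], docs[a]) >= cosine(docs[j], docs[b])]
    right = [j for j in ids if j not in left]
    if not right:  # e.g. all documents identical: fall back to an even split
        half = len(ids) // 2
        left, right = ids[:half], ids[half:]
    return left, right

def build_tree(docs, ids, max_size):
    """Stage 2: bisect recursively until every leaf holds at most max_size documents."""
    node = {"centroid": centroid([docs[i] for i in ids]), "ids": ids, "children": []}
    if len(ids) > max_size:
        node["children"] = [build_tree(docs, part, max_size) for part in bisect(docs, ids)]
    return node

def build_index(docs, min_sim, max_size):
    """Build one tree per stage-1 cluster; max_size must be at least 1."""
    top = threshold_clusters(docs, list(range(len(docs))), min_sim)
    return [build_tree(docs, c, max_size) for c in top]

def search(index, docs, query, k):
    """Follow the most similar centroid down to a leaf, then rank only that leaf."""
    node = max(index, key=lambda n: cosine(query, n["centroid"]))
    while node["children"]:
        node = max(node["children"], key=lambda n: cosine(query, n["centroid"]))
    ranked = sorted(node["ids"], key=lambda i: cosine(query, docs[i]), reverse=True)
    return ranked[:k]
```

Because a query is compared only against centroids along one path and against the documents of a single leaf, the number of comparisons depends on the depth of the hierarchy and the leaf size rather than on the whole collection. In this sketch, that is also why the returned documents come from the same cluster and tend to be related to one another.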