Title :
Parallel phrase matching for cloud based security services
Author :
Kolenchery, Jose
Author_Institution :
Digital Surgicals Inc., Austin, TX, USA
Abstract :
Phrase matching is a technique used to detect virus patterns, data leakage, SPAM, and embedded URLs in messages. The complexity of detection increases with the length of the phrases. This paper presents an algorithm that can be implemented efficiently on a GPU by reducing copy costs. The algorithm creates k-grams using collision-resistant hashing techniques and matches them against the k-grams of previously known patterns. Comparison is made efficient by comparing fixed-size hashes of words instead of the words themselves. The algorithm scales to cloud-based security services, where millions of phrases must be examined per second. The host processor performs sequential operations such as tokenization and CRC computation, which convert words into fixed-size phrase terms. Phrases comprising one or more phrase terms are then generated and tested in a massively parallel CUDA environment.
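The pipeline described in the abstract can be illustrated with a minimal CPU-only sketch. It is not the paper's implementation: the function names (`phrase_terms`, `kgrams`, `build_index`, `find_matches`), the use of CRC-32 via `zlib.crc32`, the word-level tokenizer, and the representation of a k-gram as a tuple of terms are all assumptions made for illustration. Each k-gram lookup is independent of the others, which is the kind of work the abstract assigns to the parallel CUDA stage.

```python
import re
import zlib


def phrase_terms(text):
    """Tokenize text and map each word to a fixed-size 32-bit CRC term.

    Assumption: lowercase word tokens and CRC-32; the abstract does not
    specify the tokenizer or the CRC width.
    """
    words = re.findall(r"\w+", text.lower())
    return [zlib.crc32(w.encode("utf-8")) for w in words]


def kgrams(terms, k):
    """Return every run of k consecutive terms as a tuple."""
    return [tuple(terms[i:i + k]) for i in range(len(terms) - k + 1)]


def build_index(patterns):
    """Index known patterns by their term tuples, grouped by length k."""
    index = {}
    for pattern in patterns:
        terms = tuple(phrase_terms(pattern))
        if terms:
            index.setdefault(len(terms), set()).add(terms)
    return index


def find_matches(message, index):
    """Return sorted (word_position, k) pairs for known phrases in message.

    Only fixed-size integers are compared, never the words themselves.
    Each k-gram test is independent, so on a GPU it could be assigned to
    its own thread (hypothetical mapping, not taken from the paper).
    """
    terms = phrase_terms(message)
    hits = []
    for k, known in index.items():
        for pos, gram in enumerate(kgrams(terms, k)):
            if gram in known:
                hits.append((pos, k))
    return sorted(hits)
```

For example, after `index = build_index(["free money", "click here now"])`, calling `find_matches("Get FREE money today, click here now!", index)` reports a 2-term phrase starting at word 1 and a 3-term phrase starting at word 4. Because distinct words can share a CRC, a production system would need to confirm hits or use a stronger hash.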
Keywords :
cloud computing; cryptography; graphics processing units; parallel processing; security of data; string matching; CRC computation; GPU; SPAM; cloud based security service; collision resistant hashing technique; data leakage; embedded URL; host processor; k-grams algorithm; parallel CUDA environment; parallel phrase matching; phrase term; sequential operation; virus pattern; Graphics processing unit; Indexes; Instruction sets; Kernel; Pattern matching; Security; CUDA; DLP; SPAM detection; cloud based security; data leakage detection; software as a service; virus pattern detection;
Conference_Title :
Soft Computing and Pattern Recognition (SoCPaR), 2011 International Conference of
Conference_Location :
Dalian
Print_ISBN :
978-1-4577-1195-4
DOI :
10.1109/SoCPaR.2011.6089291