Title :
Adaptive hashing based multiple variable length pattern search algorithm for large data sets
Author :
Kanuga, Punit ; Chauhan, Anamika
Author_Institution :
Dept. of Inf. Technol., Delhi Technol. Univ. (formerly DCE), New Delhi, India
Abstract :
Searching for patterns in large data sets is essential for extracting knowledge from data warehouses. This paper presents a new hashing-based algorithm for fast search of multiple variable-length patterns in large data sets. It dispenses with the traditional generation of a shift table for each character present in the pattern. It can also accommodate patterns that arise during the search, so it works for both predetermined and dynamic pattern sets. Furthermore, its speed improves as the minimum pattern length P increases: for a data set of length n, the search takes O(n/P) time. Experimental results are reported for the runtime behavior of the algorithm under varying parameters, such as the number of patterns to be searched and the length of the data set, with data sets of up to (but not limited to) 200,000 characters.
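
The abstract does not describe the algorithm's internals (the Base String, ConPair, Master Record and Match Table structures are only named in the keywords), so the following is not the authors' method. It is a minimal sketch of a generic sampled-block hashing search, included only to illustrate how a hash lookup can replace a per-character shift table and how the number of probes into a text of length n can shrink roughly as n/P when the minimum pattern length P grows. The function names, the `block_len` parameter and the sampling scheme are all assumptions made for this illustration.

```python
from collections import defaultdict


def build_block_index(patterns, block_len):
    """Hash every block_len-character block of every pattern.

    Illustrative helper (not from the paper): maps a block to the list of
    (pattern, offset) pairs where that block occurs inside a pattern.
    """
    index = defaultdict(list)
    for pat in patterns:
        for off in range(len(pat) - block_len + 1):
            index[pat[off:off + block_len]].append((pat, off))
    return index


def search_patterns(text, patterns, block_len=2):
    """Return sorted (start, pattern) pairs for all occurrences in text.

    Assumed scheme: probe the text only every `stride` characters, where
    stride = P - block_len + 1 and P is the minimum pattern length. Any
    occurrence of a pattern of length >= P then fully contains at least one
    probed block, so no match is skipped.
    """
    if not patterns:
        return []
    min_len = min(len(p) for p in patterns)
    if not 1 <= block_len <= min_len:
        raise ValueError("block_len must be between 1 and the shortest pattern length")
    stride = min_len - block_len + 1
    index = build_block_index(patterns, block_len)
    matches = set()
    # About n / stride probes; each probe is a single hash lookup.
    for i in range(0, len(text) - block_len + 1, stride):
        for pat, off in index.get(text[i:i + block_len], ()):
            start = i - off
            # Verify the candidate alignment character by character.
            if start >= 0 and text.startswith(pat, start):
                matches.add((start, pat))
    return sorted(matches)


if __name__ == "__main__":
    sample = "the cat sat on the mat with a hat"
    for start, pat in search_patterns(sample, ["cat", "the", "hat", "sat on"]):
        print(start, pat)
```

In this sketch, a pattern that arrives during the search could be supported by adding its blocks to the index, with the stride recomputed if the new pattern is shorter than the current minimum. The verification step means the O(n/P) figure applies to the number of probes for a fixed `block_len`, not to a worst case with many hash hits.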
Keywords :
computational complexity; data handling; file organisation; pattern matching; adaptive hashing; hashing based algorithm; large data sets; multiple variable length pattern search algorithm; runtime behavior; Algorithm design and analysis; Data mining; Ear; Extremities; Heuristic algorithms; Pattern matching; Redundancy; Base String; ConPair; Hashing; Master Record; Match Table; Multiple Pattern Matching; Redundancy Check;
Conference_Title :
Data Science & Engineering (ICDSE), 2014 International Conference on
Conference_Location :
Kochi
Print_ISBN :
978-1-4799-6870-1
DOI :
10.1109/ICDSE.2014.6974624