DocumentCode
174546
Title
Adaptive hashing based multiple variable length pattern search algorithm for large data sets
Author
Kanuga, Punit ; Chauhan, Anamika
Author_Institution
Dept. of Inf. Technol., Delhi Technol. Univ. (formerly DCE), New Delhi, India
fYear
2014
fDate
26-28 Aug. 2014
Firstpage
130
Lastpage
135
Abstract
Searching for patterns in large data sets is essential for extracting knowledge from data warehouses. This paper presents a new hashing-based algorithm for fast search of multiple variable-length patterns in large data sets. It dispenses with the traditional generation of a shift table for each character present in the pattern. It can also accommodate patterns that arrive during the search, so it works well for both predetermined and dynamic pattern sets. Furthermore, its speed improves as the minimum pattern length P increases: for a data set of length n, the search takes O(n/P) time. Experimental results are reported for the runtime behavior of the presented algorithm under varying parameters, such as the number of patterns to be searched and the length of the data set, which was extended up to (but not limited to) 200,000 characters.
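The abstract does not describe the algorithm's internals, so the following is only a minimal sketch of one generic way a hash-based multi-pattern search can probe the text roughly n/P times. It is not the authors' method. It assumes a sampled-block scheme: every pattern of length at least P contributes its first P - B + 1 blocks of length B to a hash table, and the text is probed only at every (P - B + 1)-th position, so each occurrence is caught by exactly one probe and then verified. The class name `SampledGramIndex`, its parameters `min_len` (P) and `gram` (B), and both method names are hypothetical.

```python
from collections import defaultdict


class SampledGramIndex:
    """Illustrative multi-pattern matcher; NOT the paper's algorithm."""

    def __init__(self, min_len, gram=2):
        if not 1 <= gram <= min_len:
            raise ValueError("need 1 <= gram <= min_len")
        self.min_len = min_len              # P: shortest pattern allowed
        self.gram = gram                    # B: length of each hashed block
        self.stride = min_len - gram + 1    # distance between text probes
        self.table = defaultdict(list)      # block -> [(pattern, offset)]
        self.patterns = set()

    def add_pattern(self, pattern):
        """Register a pattern; may be called between searches."""
        if len(pattern) < self.min_len:
            raise ValueError("pattern shorter than min_len")
        if pattern in self.patterns:
            return
        self.patterns.add(pattern)
        # Index every block that starts within the first `stride` offsets.
        for offset in range(self.stride):
            block = pattern[offset:offset + self.gram]
            self.table[block].append((pattern, offset))

    def search(self, text):
        """Return sorted (start, pattern) pairs for every occurrence."""
        hits = []
        # Probe only every `stride`-th position of the text.
        for i in range(0, len(text) - self.gram + 1, self.stride):
            for pattern, offset in self.table.get(text[i:i + self.gram], ()):
                start = i - offset
                # Verify the candidate alignment character by character.
                if start >= 0 and text.startswith(pattern, start):
                    hits.append((start, pattern))
        return sorted(hits)
```

With B fixed, the number of probes is about n/(P - B + 1), which shrinks as P grows; verification cost depends on how often blocks collide. Because `add_pattern` only appends to the hash table, patterns can be added after earlier searches without rebuilding anything, which mirrors the dynamic pattern set mentioned in the abstract.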
Keywords
computational complexity; data handling; file organisation; pattern matching; adaptive hashing; hashing based algorithm; large data sets; multiple variable length pattern search algorithm; runtime behavior; Algorithm design and analysis; Data mining; Ear; Extremities; Heuristic algorithms; Pattern matching; Redundancy; Base String; ConPair; Hashing; Master Record; Match Table; Multiple Pattern Matching; Redundancy Check;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Science & Engineering (ICDSE), 2014 International Conference on
Conference_Location
Kochi
Print_ISBN
978-1-4799-6870-1
Type
conf
DOI
10.1109/ICDSE.2014.6974624
Filename
6974624