  • DocumentCode
    174546
  • Title
    Adaptive hashing based multiple variable length pattern search algorithm for large data sets
  • Author
    Kanuga, Punit ; Chauhan, Anamika
  • Author_Institution
    Dept. of Inf. Technol., Delhi Technol. Univ. (formerly DCE), New Delhi, India
  • fYear
    2014
  • fDate
    26-28 Aug. 2014
  • Firstpage
    130
  • Lastpage
    135
  • Abstract
    Searching for patterns in large data sets is essential for extracting knowledge from data warehouses. This paper presents a new hashing-based algorithm for fast search of multiple variable-length patterns in large data sets. It dispenses with the traditional generation of a shift table for each character present in the pattern. It can also accommodate patterns that arise during search time, and thus works well for both predetermined and dynamic pattern sets. Furthermore, its speed improves as the minimum pattern length P increases: for a data set of length n, the search takes O(n/P) time. Experimental results are reported for the runtime behavior of the presented algorithm under varying parameters, such as the number of patterns to be searched and the length of the data set, extended up to (but not limited to) 200,000 characters.
  • Keywords
    computational complexity; data handling; file organisation; pattern matching; adaptive hashing; hashing based algorithm; large data sets; multiple variable length pattern search algorithm; runtime behavior; Algorithm design and analysis; Data mining; Ear; Extremities; Heuristic algorithms; Pattern matching; Redundancy; Base String; ConPair; Hashing; Master Record; Match Table; Multiple Pattern Matching; Redundancy Check;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Science & Engineering (ICDSE), 2014 International Conference on
  • Conference_Location
    Kochi
  • Print_ISBN
    978-1-4799-6870-1
  • Type
    conf
  • DOI
    10.1109/ICDSE.2014.6974624
  • Filename
    6974624