• DocumentCode
    1880997
  • Title

    Speeding up pattern matching by optimal partial string extraction

  • Author

    Tan Jianlong ; Liu Xia ; Liu Yanbing ; Liu Ping

  • Author_Institution
    Inst. of Comput. Technol., Chinese Acad. of Sci., Beijing, China
  • fYear
    2011
  • fDate
    10-15 April 2011
  • Firstpage
    1030
  • Lastpage
    1035
  • Abstract
    String matching plays a key role in web content monitoring systems. Suffix matching algorithms have good time efficiency, and thus are widely used. These algorithms require that all patterns in a set have the same length. When the patterns cannot satisfy this requirement, the leftmost characters, m being the length of the shortest pattern, are extracted to construct the data structure. We call such m-character strings partial strings. However, a simple extraction from the left does not address the impact of partial string locations on search speed. We propose a novel method to extract the partial strings from each pattern which maximizes search speed. More specifically, with this method we can compute all the corresponding searching time cost by theoretical derivation, and choose the location which yields an approximately minimal search time. We evaluate our method on two rule sets: Snort and ClamAV. Experiments show that in most cases, our method achieves the fastest searching speed in all possible locations of partial string extraction, and is about 5%-20% faster than the alternative methods.
  • Keywords
    Internet; feature extraction; string matching; ClamAV; Internet; Snort; Web content monitoring system; optimal partial string extraction; pattern matching; searching time cost; string matching; suffix matching algorithm; Algorithm design and analysis; Automata; Bills of materials; Data mining; Data structures; Pattern matching; Training; SBOM algorithm; Wu Manber algorithm; multiple string matching; pattern matching; string matching;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4577-0249-5
  • Electronic_ISBN
    978-1-4577-0248-8
  • Type

    conf

  • DOI
    10.1109/INFCOMW.2011.5928778
  • Filename
    5928778