• DocumentCode
    606014
  • Title

    Quasi-linear-time substring searching by q-gram distance

  • Author

    Hanada, Hirotoshi ; Kudo, Motoi ; Nakamura, A.

  • Author_Institution
    Grad. Sch. of Inf. Sci. & Technol., Hokkaido Univ., Sapporo, Japan
  • fYear
    2012
  • fDate
    23-25 Oct. 2012
  • Firstpage
    544
  • Lastpage
    549
  • Abstract
    The q-gram distance dq(x, y) between two strings x and y is a string similarity measure correlated with a famous string distance: the edit distance. In addition, it can be computed much faster, in linear (O(|x|+|y|)) time, than the edit distance in quadratic (O(|x||y|)) time, where | · | denotes the string length. However, it does not mean that we can find all substrings of a text t similar to a pattern p in linear time. In this paper we will propose a searching algorithm achieving quasi-linear (O(|t| log |p| + |p|)) time by the q-gram distance.
  • Keywords
    computational complexity; string matching; text analysis; edit distance; linear (O(|x|+|y|)) time; q-gram distance; quadratic (O(|x||y|)) time; quasilinear (O(|t| log |p| + |p|)) time; quasilinear-time substring searching; searching algorithm; string length; string matching; string similarity measure; text substrings; approximate string matching; q-gram distance; string searching;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Information Science and Service Science and Data Mining (ISSDM), 2012 6th International Conference on New Trends in
  • Conference_Location
    Taipei
  • Print_ISBN
    978-1-4673-0876-2
  • Type

    conf

  • Filename
    6528694