Regional Information Center for Science and Technology - The closest substring problem with small distances

DocumentCode :

2385863

Title :

The closest substring problem with small distances

Author :

Marx, Dániel

Author_Institution :

Dept. of Comput. Sci. & Inf. Theor., Budapest Univ. of Technol. & Econ., Hungary

fYear :

2005

fDate :

23-25 Oct. 2005

Firstpage :

Lastpage :

Abstract :

In the closest substring problem, k strings s₁, ..., s_k are given, and the task is to find a string s of length L such that each string s_i has a consecutive substring of length L whose distance from s is at most d. The problem is motivated by applications in computational biology. We present two algorithms that can be efficient for small fixed values of d and k: for some functions f and g, the algorithms have running time f(d) · n^O(log d) and g(d,k) · n^O(log log k), respectively. The second algorithm is based on connections with the extremal combinatorics of hypergraphs. The closest substring problem is also investigated from the parameterized complexity point of view. Answering an open question from (Evans et al., 2003; Fellows et al.; Gramm et al., 2003), we show that the problem is W[1]-hard even if both d and k are parameters. It follows as a consequence of this hardness result that our algorithms are optimal in the sense that the exponent of n in the running time cannot be improved to o(log d) or to o(log log k) (modulo some complexity-theoretic assumptions). Another consequence is that the running time n^O(1/ε^4) of the approximation scheme for closest substring presented in (Li et al., 2002) cannot be improved to f(ε) · n^c, i.e. the ε has to appear in the exponent of n.
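To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not either of the algorithms described in the abstract: it simply enumerates every candidate string of length L over a given alphabet, so its running time is exponential in L. The abstract only says "distance"; Hamming distance is assumed here. The function names and the `alphabet` parameter are hypothetical and chosen for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ (assumed distance)."""
    return sum(x != y for x, y in zip(a, b))


def is_center(s, strings, d):
    """Check whether every input string has a length-len(s) substring within distance d of s."""
    L = len(s)
    return all(
        any(hamming(s, t[i:i + L]) <= d for i in range(len(t) - L + 1))
        for t in strings
    )


def closest_substring_bruteforce(strings, L, d, alphabet):
    """Return some string of length L that satisfies is_center, or None if none exists.

    Illustrative exhaustive search only: tries all len(alphabet)**L candidates.
    """
    for cand in product(alphabet, repeat=L):
        s = "".join(cand)
        if is_center(s, strings, d):
            return s
    return None
```

For example, `is_center("CGT", ["ACGTA", "TCGAA", "GGCGT"], 1)` holds because the three strings contain the substrings CGT, CGA, and CGT, each within distance 1 of CGT. The exhaustive search is useful only as a correctness reference for tiny inputs.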

Keywords :

computational complexity; graph theory; approximation scheme; closest substring problem; complexity theory; computational biology; hypergraphs extremal combinatorics; parameterized complexity point; string searching; Approximation algorithms; Combinatorial mathematics; Computational biology; DNA; NP-hard problem; Pattern matching; Polynomials; Proteins; RNA; Sequences

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Foundations of Computer Science, 2005. FOCS 2005. 46th Annual IEEE Symposium on

Print_ISBN :

0-7695-2468-0

Type :

conf

DOI :

10.1109/SFCS.2005.70

Filename :

1530702

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2385863