Title of article :
Algorithmic complexity of protein identification: combinatorics of weighted strings
Author/Authors :
Mark Cieliebak, Thomas Erlebach, Zsuzsanna Lipták, Jens Stoye, Emo Welzl
Issue Information :
Journal with serial number, year 2004
Pages :
20
From page :
27
To page :
46
Abstract :
We investigate a problem which arises in computational biology: Given a constant-size alphabet A with a weight function μ : A→N, find an efficient data structure and query algorithm solving the following problem: For a string σ over A and a weight M∈N, decide whether σ contains a substring with weight M, where the weight of a string is the sum of the weights of its letters (ONE-STRING MASS FINDING PROBLEM). If the answer is yes, then we may in addition require a witness, i.e., indices i⩽j such that the substring beginning at position i and ending at position j has weight M. We allow preprocessing of the string and measure efficiency in two parameters: storage space required for the preprocessed data and running time of the query algorithm for given M. We are interested in data structures and algorithms requiring subquadratic storage space and sublinear query time, where we measure the input size as the length n of the input string σ. Among others, we present two non-trivial efficient algorithms: LOOKUP solves the problem with O(n) storage space and O(n/log n) time; INTERVAL solves the problem for binary alphabets with O(n) storage space in O(log n) query time. We introduce other variants of the problem and sketch how our algorithms may be extended for these variants. Finally, we discuss combinatorial properties of weighted strings.
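To make the problem statement concrete, here is a minimal Python sketch of a simple linear-time baseline for the ONE-STRING MASS FINDING PROBLEM. It is not the LOOKUP or INTERVAL algorithm from the article, which the abstract does not spell out; it is only a sliding-window scan that answers one query in O(n) time with no preprocessing. The function name, the dictionary representation of μ, and the assumption that every letter weight is strictly positive are illustrative choices, not taken from the record.

```python
from typing import Dict, Optional, Tuple


def find_substring_with_weight(
    sigma: str, mu: Dict[str, int], M: int
) -> Optional[Tuple[int, int]]:
    """Return 1-based indices (i, j), i <= j, of a substring of sigma
    whose weight is exactly M, or None if no such substring exists.

    Assumption (not from the source): every weight in mu is strictly
    positive, so the window weight grows as the window is extended and
    shrinks as it is trimmed.
    """
    if M <= 0:
        return None

    left = 0      # 0-based index of the first letter in the window
    window = 0    # weight of sigma[left..right]
    for right, letter in enumerate(sigma):
        window += mu[letter]
        # Trim from the left while the window is too heavy.
        while window > M and left <= right:
            window -= mu[sigma[left]]
            left += 1
        if window == M and left <= right:
            return (left + 1, right + 1)  # witness, 1-based and inclusive
    return None


# Hypothetical example: binary alphabet {a, b} with mu(a) = 1, mu(b) = 3.
weights = {"a": 1, "b": 3}
print(find_substring_with_weight("abbab", weights, 6))
print(find_substring_with_weight("abbab", weights, 2))
```

A scan like this needs Θ(n) time per query in the worst case, which is the baseline that the sublinear query times mentioned in the abstract improve on at the cost of preprocessing and storage.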
Keywords :
Computational biology , Weighted Strings , Protein Identification
Journal title :
Discrete Applied Mathematics
Serial Year :
2004
Record number :
885809