Title :
Designing Pooling Systems for Noisy High-Throughput Protein-Protein Interaction Experiments Using Boolean Compressed Sensing
Author :
Mourad, Ramy ; Dawy, Zaher ; Morcos, Faruck
Author_Institution :
Dept. of Electr. & Comput. Eng., American Univ. of Beirut, Beirut, Lebanon
Abstract :
Group testing, also known as pooling, is a common technique used in high-throughput experiments in molecular biology to significantly reduce the number of tests required to identify rare biological interactions while correcting for experimental noise. Central to the group testing problem are 1) a pooling design that lays out how items are grouped together into pools for testing and 2) a decoder that interprets the results of the tested pools, identifying the active compounds. In this work, we take advantage of decoder guarantees from the field of compressed sensing (CS) to address the problem of efficient and reliable detection of biological interaction in noisy high-throughput experiments. We also use efficient combinatorial algorithms from group testing as well as established measurement matrices from CS to create pooling designs. First, we formulate the group testing problem in terms of a Boolean CS framework. We then propose a low-complexity l1-norm decoder to interpret pooling test results and identify active compounds. We demonstrate the robustness of the proposed l1-norm decoder in simulated experiments with false-positive and false-negative error rates typical of high-throughput experiments. When benchmarked against the current state-of-the-art methods, the proposed l1-norm decoder provides superior error correction for the majority of the cases considered while being notably faster computationally. Additionally, we test the performance of the l1-norm decoder against a real experimental data set, where 12,675 prey proteins were screened against 12 bait proteins. Lastly, we study the impact of different sparse pooling design matrices on decoder performance and show that the shifted transversal design (STD) is the most suitable among the pooling designs surveyed for biological applications of CS.
Keywords :
Boolean algebra; bioinformatics; combinatorial mathematics; compressed sensing; matrix algebra; proteins; proteomics; Boolean compressed sensing framework; bait proteins; biological interaction detection; combinatorial algorithms; false-negative error rates; false-positive error rates; group testing problem; high-throughput experiments; low-complexity l1-norm decoder; molecular biology; noisy high-throughput protein-protein interaction experiments; pooling system design; prey proteins; shifted transversal design; sparse pooling design matrices; Algorithm design and analysis; Boolean functions; Compressed sensing; Decoding; Noise measurement; Proteins; Boolean compressed sensing; Protein-protein interactions; group testing; shifted transversal pooling design (STD); yeast two-hybrid (Y2H) experiments;
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2013.129