Abstract
Molecules are the basic and most important
players of life. Molecules are composed of
atoms, and each molecule has two main
specifications or inherited attributes:
structure and, in biomedical applications, biological
function. This dichotomy can be compared to living
organisms, including humans, which possess a genotype and
a phenotype. Each of the two main inherited attributes
contains a variety of feature classes and sub-categories.
Because structure is easier to work with, the scientific
community focuses on elaborating structure rather than
function. This has also prompted much debate over whether
structure takes priority over function or vice versa.
To better understand the features of the structure that
could affect function and hence translate them into
therapeutic or diagnostic applications, we may look at
molecules through their properties, which form an
intermediary step in the process and mostly comprise
physicochemical aspects.
Atomic sense of numbers
All molecules are made from atoms. In a simple
molecule with two atoms, atoms A and B can bind to
form a molecule in two ways, AB and BA. Linearly
speaking, both molecules are the same, but if A and B are
instead two fragments with asymmetry or different binding
points, the two molecules would be different.
Now, if we have three atoms A, B, and C, there are six
options: ABC, ACB, BAC, BCA, CAB, and CBA. The
number of possibilities is given by the factorial of the
number of members, which here is 3! = 3×2×1 = 6. In
molecules that have symmetric fragments (groups of
atoms), there are identical cases among the possibilities;
however, for bigger molecular fragments without
symmetry (such as amino acids), the factorial rule applies.
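The factorial rule can be checked with a short enumeration. The
following is a minimal Python sketch, not part of the original work;
the function name linear_arrangements is hypothetical, and it assumes
each unit is a directional fragment, so that a sequence and its
reverse count as different molecules. Repeated units stand in for the
"identical cases" mentioned above.

```python
from itertools import permutations
from math import factorial


def linear_arrangements(units):
    """Return the distinct linear orderings of the given units.

    Assumption: each unit is directional (like an amino acid), so
    "ABC" and "CBA" are counted as different arrangements.
    """
    return sorted(set(permutations(units)))


# Three different units: the count matches 3! = 6.
three = linear_arrangements(["A", "B", "C"])
print(["".join(p) for p in three])
print(len(three) == factorial(3))

# Two of the three units are identical: duplicates collapse,
# so fewer than 3! distinct arrangements remain.
repeated = linear_arrangements(["A", "A", "B"])
print(["".join(p) for p in repeated])
```

With three different units the enumeration lists the six orderings
given in the text; with a repeated unit only three distinct orderings
remain.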
Non-linear case scenarios
In organic molecules, depending on the carbon orbitals
involved, the bond angles and hence the shape of the
molecule will deviate from the simple linear example. In such
cases, the number of possibilities is also affected by
geometrical (cis and trans) and spatial (stereochemical, D
and L) arrangements, and the number of possibilities
increases dramatically.
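To give a feel for how quickly spatial arrangements multiply, the
sketch below enumerates D/L labelings for a molecule with a given
number of stereocenters. It is an illustration only: the function name
stereo_labelings is hypothetical, and the count of 2^n is the standard
upper bound for n independent stereocenters, which the text above does
not state explicitly. Real molecules with internal symmetry (meso
forms) have fewer distinct stereoisomers.

```python
from itertools import product


def stereo_labelings(n_centers, labels=("D", "L")):
    """List every assignment of D or L to n_centers stereocenters.

    Assumption: the centers are independent, so this is an upper
    bound (2**n_centers) on the number of stereoisomers.
    """
    return ["".join(combo) for combo in product(labels, repeat=n_centers)]


for n in (1, 2, 3, 10):
    count = len(stereo_labelings(n))
    print(f"{n} stereocenters -> at most {count} stereoisomers")
```

Each additional stereocenter doubles the upper bound, so even a modest
number of centers adds a large factor on top of the connectivity
count.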
How many molecules are there?
To obtain a better sense of the numbers involved, an
example is presented here. A group of scientists from the
University of Bern has worked on the total number of
possible organic molecules (Ruddigkeit et al., 2012). The
chemical space of molecules of up to 17 atoms
of C, N, O, S, and halogens was calculated to contain 166.4
billion entries. This forms the chemical universe database
GDB-17, which contains many drugs and lead compounds,
and millions of isomers of known drugs. Compared with
known molecules in PubChem, GDB-17
contains more nonaromatic heterocycles, stereoisomers,
and scaffold types.
Among the possibilities in the chemical space,
bioactive ligands can be searched for by enumeration and
subsequent virtual screening. It has been shown that
almost all small molecules (>99.9%) have never been
synthesized (Reymond and Awale, 2012), and more work
has to be done on their preparation and laboratory testing.
GDBs have been generated for prospective drug
discovery purposes and are accessible.
Higher up
Organic molecules are the centerpiece of
enumeration studies. They can be considered in groups as
intact molecules, or they can be split into smaller segments
or fragments comprising common moieties; one or
more of their properties or identification tags can then be
converted into numbers to correlate with function. In our recent
study (Sardari et al., 2016), a combination of
methods, such as fragment-based de novo design, scoring,
similarity-based compound searching, and structure-based
docking, led to the introduction of seven in silico designed
compounds with antimycobacterial properties. Findings
derived from antimycobacterial tests and MTT assay