Title of article :
A classification updating procedure motivated by high-content screening data
Author/Authors :
R. M. Jacques, N. R. J. Fieller and E. K. Ainscow
Issue Information :
Journal issue with serial number, year 2012
Abstract :
The current paradigm for the identification of candidate drugs within the pharmaceutical industry typically
involves the use of high-throughput screens. High-content screening (HCS) is the term given to the process
of using an imaging platform to screen large numbers of compounds for some desirable biological activity.
Classification methods have important applications in HCS experiments, where they are used to predict
which compounds have the potential to be developed into new drugs. In this paper, a new classification
method is proposed for batches of compounds where the rule is updated sequentially using information from
the classification of previous batches. This methodology accounts for the possibility that the training data
are not a representative sample of the test data and that the underlying group distributions may change as
new compounds are analysed. This technique is illustrated on an example data set using linear discriminant
analysis, k-nearest neighbour and random forest classifiers. Random forests are shown to be superior to
the other classifiers and are further improved by the updating algorithm, which increases the number of
true positives and decreases the number of false positives.
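
The idea of updating a classification rule batch by batch can be illustrated with a minimal sketch. The abstract does not specify the update rule, so the one used here is an assumption, not the authors' procedure: after each batch is classified, compounds whose k-nearest-neighbour vote is unanimous are added to the training set with their predicted label, so the rule can follow a shift in the group distributions. The function names, the `min_share` threshold and the toy two-feature data are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, point, k=3):
    """Return (label, vote_share) for `point` from its k nearest training neighbours."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / len(nearest)


def classify_batches(train, batches, k=3, min_share=1.0):
    """Classify batches in order, updating the training set after each batch.

    Assumed update rule (not taken from the paper): a prediction is treated as
    confident when its vote share is at least `min_share`, and confident
    predictions are appended to the training data before the next batch.
    """
    train = list(train)  # copy so the caller's data is not modified
    all_predictions = []
    for batch in batches:
        # Classify the whole batch with the rule learned so far.
        results = [knn_predict(train, x, k) for x in batch]
        all_predictions.append([label for label, _ in results])
        # Update step: add confidently classified compounds to the training set.
        train.extend(
            (x, label) for x, (label, share) in zip(batch, results) if share >= min_share
        )
    return all_predictions, train


# Toy data: two features per compound; the "active" group drifts between batches.
initial_train = [
    ((0.0, 0.0), "inactive"), ((0.5, 0.0), "inactive"), ((0.0, 0.5), "inactive"),
    ((3.0, 3.0), "active"), ((3.5, 3.0), "active"), ((3.0, 3.5), "active"),
]
batches = [
    [(4.0, 4.0), (0.2, 0.2)],
    [(5.0, 5.0), (-0.5, -0.5)],
]
predictions, updated_train = classify_batches(initial_train, batches)
print(predictions)
```

The same loop would work with any classifier that reports a confidence for each prediction, such as the class vote proportions of a random forest. Only the `knn_predict` call would need to change.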
Keywords :
Classification, batch learning, high-content screening experiments, random forests, updating algorithm, k-nearest neighbour, linear discriminant analysis
Journal title :
JOURNAL OF APPLIED STATISTICS