Author/Authors :
Ribeiro, Rita and Torgo, Luís
Abstract :
Algae blooms are ecological events characterized by extremely high abundance values of certain algae. These rare events have a strong impact on the river ecosystem, so predicting them is of special importance. This paper addresses the problems that arise when standard evaluation statistics are used to evaluate and compare models for the prediction of rare extreme values. We describe a new evaluation statistic, proposed in Torgo and Ribeiro [Torgo, L., Ribeiro, R., 2006. Predicting rare extreme values. In: Ng, W., Kitsuregawa, M., Li, J., Chang, K. (Eds.), Proceedings of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’2006). Springer, pp. 816–820 (number 3918 in LNAI)], which can be used to identify the best models for predicting algae blooms. We apply this statistic in a comparative study involving several models for predicting the abundance of different groups of phytoplankton in water samples collected from the Douro River, Porto, Portugal. According to the proposed statistic, a variant of a Support Vector Machine outperforms the other models tested in the prediction of algae blooms.
Keywords :
Model evaluation, Algae blooms, Multiple regression, Extreme values prediction