Title :
Robust broad-scale benthic habitat mapping when training data is scarce
Author :
Ahsan, Nasir ; Williams, Stefan B. ; Pizarro, Oscar
Author_Institution :
Australian Centre for Field Robotics, University of Sydney, Sydney, NSW, Australia
Abstract :
Understanding the distribution of habitat classes at broad scales is of interest in marine park conservation and planning. Sites of interest typically extend over many hundreds of square kilometers. However, collecting ground truth data (optical imagery, towed video, grab samples, etc.) over such broad scales is impractical, and only a small fraction of a site can be sampled, depending on budget constraints. Benthic habitat mapping involves learning the correlations between habitat classes, derived from limited ground truth sampling of the seabed, and the corresponding seabed morphology, and then extrapolating these correlations to the entire site. One important issue with such approaches is that the correlations are learned from limited data, which motivates the need to investigate robust techniques for learning and extrapolating them. In this paper we motivate the use of Gaussian Mixture Models (GMMs), a generative classifier, for the task of benthic habitat mapping instead of discriminative models such as Classification Trees (CTs, popular in the benthic habitat mapping literature) and Support Vector Machines (SVMs, generally popular in a variety of fields). The motivation is that generative classifiers take into account more information about the underlying data distribution than discriminative classifiers, yielding more robust extrapolations. Using holdout validation, we show that GMMs consistently match or outperform the best of the other classifiers across all training set sizes (small and large), whereas CTs and SVMs do not. We also show that GMMs are more certain about their predictions over the broad scale than the other classifiers.
Keywords :
environmental factors; environmental science computing; geophysics computing; learning (artificial intelligence); oceanography; pattern classification; GMM classifier; Gaussian mixture models; correlation extrapolation; correlation learning; generative classifiers; ground truth data; habitat class correlations; habitat class distribution; marine park conservation; marine park planning; robust broad scale benthic habitat mapping; seabed ground truth sampling; seabed morphology; training data; Biological system modeling; Correlation; Data models; Entropy; Support vector machines; Training; Training data;
Conference_Titel :
OCEANS, 2012 - Yeosu
Conference_Location :
Yeosu
Print_ISBN :
978-1-4577-2089-5
DOI :
10.1109/OCEANS-Yeosu.2012.6263540