• DocumentCode
    249607
  • Title
    Using ensemble margin to explore issues of training data imbalance and mislabeling on large area land cover classification
  • Author
    Mellor, Andrew ; Boukir, Samia ; Haywood, Andrew ; Jones, Simon
  • Author_Institution
    Sch. of Math. & Geospatial Sci., RMIT Univ., Melbourne, VIC, Australia
  • fYear
    2014
  • fDate
    27-30 Oct. 2014
  • Firstpage
    5067
  • Lastpage
    5071
  • Abstract
    This work introduces new ensemble margin criteria to evaluate the performance of Random Forests (RF) in large area land cover classification using imbalanced and noisy training data. Experiments on binary and multiclass classification problems reveal insights into the behaviour of RF over big data in which the training data contain noise and may not be evenly distributed among classes. The margin-based RF performance evaluation is conducted using remote sensing and ancillary spatial data across a 7.2 million hectare study area.
  • Keywords
    geographic information systems; pattern classification; remote sensing; RF; ancillary spatial data; binary classification; ensemble margin; imbalanced training data; large area land cover classification; mislabeling; multiclass classification; noisy training data; random forests; Accuracy; Entropy; Noise; Radio frequency; Training; Training data; classification; imbalance
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Image Processing (ICIP), 2014 IEEE International Conference on
  • Conference_Location
    Paris
  • Type
    conf
  • DOI
    10.1109/ICIP.2014.7026026
  • Filename
    7026026