Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/137273
Type: Journal article
Title: Spatial Random Forest (S-RF): A random forest approach for spatially interpolating missing land-cover data with multiple classes
Author: Holloway-Brown, J.; Helmstedt, K.J.; Mengersen, K.L.
Citation: International Journal of Remote Sensing, 2021; 42(10):3756-3776
Publisher: Informa UK Limited
Issue Date: 2021
ISSN: 0143-1161; 1366-5901
Statement of Responsibility: Jacinta Holloway-Brown, Kate J. Helmstedt, and Kerrie L. Mengersen
Abstract: Land-cover maps are important tools for monitoring large-scale environmental change and can be regularly updated using free satellite imagery data. A key challenge with constructing these maps is missing data in the satellite images on which they are based. To address this challenge, we created a Spatial Random Forest (S-RF) model that can accurately interpolate missing data in satellite images based on a modest training set of observed data in the image of interest. We demonstrate that this approach can be effective with only a minimal number of spatial covariates, namely latitude and longitude. The motivation for only using latitude and longitude in our model is that these covariates are available for all images whether the data are observed or missing due to cloud cover. The S-RF model can flexibly partition these covariates to provide accurate estimates, with easy incorporation of additional covariates to improve estimation if available. The effectiveness of our approach has been previously demonstrated for prediction of two land-cover classes in an Australian case study. In this paper, we extend the method to more than two classes. We demonstrate the performance of the S-RF method at interpolating multiple land-cover classes, using a case study drawn from South America. The results show that the method is best at predicting three land-cover classes, compared with five or ten classes, and that other information is needed to improve performance as the number of classes grows, particularly if the classes are unbalanced. We explore two issues through a sensitivity analysis: the influence of the amount of missing data in the image and the influence of the amount of training data for model development and performance. The results show that the amount of missing data due to cloud cover is influential on model performance for multiple classes. We also found that increasing the amount of training data beyond 100,000 observations had minimal impact on model accuracy. Hence, a relatively small amount of observed data is required for training the model, which is beneficial for computation time.
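Illustrative sketch (not part of the published record): the idea summarised in the abstract, training a tree ensemble on the observed pixels with latitude and longitude as the only covariates and then predicting the class of cloud-masked pixels, can be shown on a toy grid. Everything below is an assumption made for illustration: the 30 x 30 grid, the three banded classes, the rectangular cloud mask, the hyperparameters, and the helper names (`gini`, `build_tree`, `fit_forest`, `predict_forest`). It is a simplified bagged-tree ensemble in plain Python that considers both covariates at every split and subsamples candidate thresholds, not the authors' S-RF implementation.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, rng, depth=0, max_depth=8, min_size=5):
    """Grow one tree on rows of (lat, lon, label); leaves are plain labels."""
    labels = [r[2] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(rows) < min_size or len(set(labels)) == 1:
        return majority
    best = None
    for feature in (0, 1):  # 0 = latitude, 1 = longitude
        # Candidate thresholds are a small random subsample of observed values.
        for row in rng.sample(rows, min(10, len(rows))):
            t = row[feature]
            left = [r for r in rows if r[feature] < t]
            right = [r for r in rows if r[feature] >= t]
            if not left or not right:
                continue
            score = (len(left) * gini([r[2] for r in left])
                     + len(right) * gini([r[2] for r in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, feature, t, left, right)
    if best is None:
        return majority
    _, feature, t, left, right = best
    return {"feature": feature, "threshold": t,
            "left": build_tree(left, rng, depth + 1, max_depth, min_size),
            "right": build_tree(right, rng, depth + 1, max_depth, min_size)}

def predict_tree(node, lat, lon):
    """Follow the splits until a leaf label is reached."""
    while isinstance(node, dict):
        value = (lat, lon)[node["feature"]]
        node = node["left"] if value < node["threshold"] else node["right"]
    return node

def fit_forest(rows, n_trees=15, seed=0):
    """Fit each tree on a bootstrap resample of the observed pixels."""
    rng = random.Random(seed)
    return [build_tree([rng.choice(rows) for _ in rows], rng)
            for _ in range(n_trees)]

def predict_forest(forest, lat, lon):
    """Majority vote across trees."""
    votes = Counter(predict_tree(tree, lat, lon) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy image: three land-cover classes laid out as vertical bands (hypothetical data).
size = 30
truth = {(lat, lon): lon // 10 for lat in range(size) for lon in range(size)}
# Hypothetical rectangular "cloud" hiding part of the image.
cloud = {(lat, lon) for lat in range(8, 21) for lon in range(4, 27)}
train = [(lat, lon, c) for (lat, lon), c in truth.items() if (lat, lon) not in cloud]

forest = fit_forest(train)
hits = sum(predict_forest(forest, lat, lon) == truth[(lat, lon)] for lat, lon in cloud)
accuracy = hits / len(cloud)
print(f"accuracy on cloud-masked pixels: {accuracy:.2f}")
```

Because the toy classes are separated by straight lines in longitude, axis-aligned splits recover them easily; real land-cover images are far less regular, which is consistent with the abstract's note that additional covariates help as the number of classes grows.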
Rights: © 2021 Informa UK Limited, trading as Taylor & Francis Group
DOI: 10.1080/01431161.2021.1881183
Grant ID: http://purl.org/au-research/grants/arc/CE140100049; http://purl.org/au-research/grants/arc/DE200101791
Published version: http://dx.doi.org/10.1080/01431161.2021.1881183
Appears in Collections: Computer Science publications