wpRFPMS: Getting started

Suggested citation


Bondarenko M., Nieves J. J., Stevens F. R., Gaughan A. E., Tatem A. and Sorichetta A. 2020. wpgpRFPMS: Random Forests population modelling R scripts, version 0.1.0. University of Southampton: Southampton, UK. doi:10.5258/SOTON/WP00665

The wpRFPMS script calls multiple R scripts as "modules", and all of the functions are stored in separate R files. To start using the wpRFPMS script, download all files from GitHub to a local directory. The script expects the following file structure:

Screen capture of wpRFPMS file structure

input.R is the input file for the main program. It contains the following user-defined input parameters:

Name Description
rfg.input.countries Declare the 3-letter ISO code(s) of the country or countries you want to model. NOTE: You must declare the ISO codes of the countries you are modelling even if you plan to model only portions of them, i.e. by declaring specific admin IDs below or by using a shapefile to subset them.
rfg.input.poptables If you are using specific (non-standard) population tables stored locally, declare their paths here. Otherwise, the script will source the tables from the WorldPop FTP.
rfg.input.adminids Declare specific admin IDs by which to subset the countries declared above. WARNING: You can NOT use this option in conjunction with the shapefile subsetting option. At least one of the two subsetting options MUST be set to NULL.
rfg.input.shp Declare the paths to the shapefiles that subset the countries declared above. WARNING: You can NOT use this option in conjunction with the admin ID subsetting option. At least one of the two subsetting options MUST be set to NULL.
rfg.input.year Declare the year to model. This must be declared as a numeric character string, e.g. "2000".
rfg.input.cvr Declare a list of the names (as character strings) of the covariates to use in modelling. NOTE: You can use the function wpgpListCountryCovariates() from the wpgpCovariates library to see which covariates are available. Most covariate names stay the same between runs, except for the year-specific part of the name. EXAMPLE: wpgpListCountryCovariates(ISO3="NPL", username = "", password = "")
rfg.fixed.set If TRUE, a fixed set is used in this modelling run, i.e. the RF model is parameterized, in part or in full, on the RF model object(s) of one or more other countries. If rfg.fixed.set.incl.input.countries=TRUE, the countries from rfg.input.countries are combined with the RF objects from files available in the following directory: /data/old_popfits/popfits_final and popfits_quant
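
The constraints in the table above can be checked before a run. The following is a minimal sketch, not part of wpRFPMS: the helper name check_rfg_inputs() is hypothetical, and it assumes that ISO codes are written in upper case and that the year string contains digits only.

```r
# Hypothetical helper (not part of wpRFPMS): sanity-check values from input.R.
# Returns a character vector of problems; an empty vector means no issue found.
check_rfg_inputs <- function(countries, adminids = NULL, shp = NULL, year) {
  problems <- character(0)

  # Assumption: ISO codes are upper-case three-letter strings such as "BTN"
  if (length(countries) == 0 || !all(grepl("^[A-Z]{3}$", countries))) {
    problems <- c(problems, "rfg.input.countries must hold 3-letter ISO codes")
  }

  # The two subsetting options are mutually exclusive
  if (!is.null(adminids) && !is.null(shp)) {
    problems <- c(problems, "set rfg.input.adminids or rfg.input.shp to NULL")
  }

  # The year must be a numeric character string, e.g. "2000"
  if (!is.character(year) || length(year) != 1 || !grepl("^[0-9]+$", year)) {
    problems <- c(problems, "rfg.input.year must be a string such as \"2000\"")
  }

  problems
}
```

Calling check_rfg_inputs(rfg.input.countries, rfg.input.adminids, rfg.input.shp, rfg.input.year) after sourcing input.R would list any violated constraint.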

An example input.R file is shown below:

rfg.input.countries <- c("BTN")
rfg.input.poptables <- NULL
rfg.input.adminids <- NULL
rfg.input.shp <- NULL
rfg.input.year <- "2000"
rfg.input.cvr <- list("ccilc_dst011_2000",
                        "ccilc_dst040_2000",
                        "ccilc_dst130_2000",
                        "ccilc_dst140_2000",
                        "ccilc_dst150_2000",
                        "ccilc_dst160_2000",
                        "ccilc_dst190_2000",
                        "ccilc_dst200_2000",
                        "cciwat_dst",
                        "ghsl_esa_dst_2000",
                        "osmint_dst",
                        "osmriv_dst",
                        "osmroa_dst",
                        "wclim_prec",
                        "wclim_temp",
                        "slope",
                        "topo",
                        "gpw4coast_dst"
                       ) 
rfg.fixed.set <- FALSE
rfg.fixed.set.incl.input.countries <- FALSE
rfg.fixed.set.description <- ""                     
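
Because most covariate names differ between runs only in their year-specific part, a list for another year can be derived from an existing one. The sketch below is illustrative only: swap_covariate_year() is a hypothetical helper, and it assumes the year appears as a trailing "_YYYY" suffix, as in the example above. Whether a covariate actually exists for the new year should still be confirmed with wpgpListCountryCovariates().

```r
# Hypothetical helper (not part of wpRFPMS): replace a trailing "_<old_year>"
# suffix with "_<new_year>"; names without that suffix are left unchanged.
swap_covariate_year <- function(covariates, old_year, new_year) {
  pattern <- paste0("_", old_year, "$")
  lapply(covariates, function(x) sub(pattern, paste0("_", new_year), x))
}

cvr_2000 <- list("ccilc_dst011_2000", "cciwat_dst", "ghsl_esa_dst_2000")
cvr_2010 <- swap_covariate_year(cvr_2000, "2000", "2010")
```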
                

config.R is the configuration file for the program. It contains the WorldPop FTP credentials and the paths to Python and GDAL:

Example for UNIX:

rfg.gdal_path <- paste0("/usr/bin/")
rfg.gdal_gdalwarp_path <- paste0(rfg.gdal_path,"gdalwarp")
rfg.gdal_merge_path <- paste0(rfg.gdal_path,"gdal_merge.py")
rfg.gdal_calc_path <- paste0(rfg.gdal_path,"gdal_calc.py")
rfg.gdal_polygonize_path <- paste0(rfg.gdal_path,"gdal_polygonize.py")

Example for Windows:

rfg.gdal_path <- paste0("\"C:\\Program Files (x86)\\GDAL\\")
rfg.gdal_gdalwarp_path <- paste0(rfg.gdal_path,"gdalwarp.exe\"")
rfg.gdal_merge_path <- paste0(rfg.gdal_path,"gdal_merge.py\"")
rfg.gdal_calc_path <- paste0(rfg.gdal_path,"gdal_calc.py\"")
rfg.gdal_polygonize_path <- paste0(rfg.gdal_path,"gdal_polygonize.py\"")
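
Before running the script, it can help to confirm that the configured GDAL tools are where config.R says they are. The following is a minimal sketch under stated assumptions: tool_exists() is a hypothetical helper, not part of wpRFPMS, and it strips the literal double quotes that the Windows example embeds in each path.

```r
# Hypothetical helper (not part of wpRFPMS): report which tool paths exist.
tool_exists <- function(paths) {
  # Remove the literal double quotes embedded in the Windows-style paths
  clean <- gsub("\"", "", paths, fixed = TRUE)
  stats::setNames(file.exists(clean), names(paths))
}

# Example call, assuming config.R has already been sourced:
# tool_exists(c(gdalwarp  = rfg.gdal_gdalwarp_path,
#               gdal_calc = rfg.gdal_calc_path))
```

Any entry reported as FALSE points to a path in config.R that needs correcting.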

The config.R file has other important parameters that need to be set before using the script:

Name Description
rfg.cluster_workers How many cores to use during calculation. You can limit the number of cores used by the program by setting, for example, rfg.cluster_workers <- 5. Otherwise, the maximum number of cores on the PC minus one will be used.
rfg.minblocks How many blocks to use during prediction and other functions. If NULL, the script will find the optimal number of blocks based on the memory and cores available on your PC.
rfg.ftp.username Username for WP FTP.
rfg.ftp.password Password for WP FTP.
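
The default described for rfg.cluster_workers (all cores minus one) can be sketched as follows. This is an illustration of the rule, not the script's own code: resolve_workers() is a hypothetical helper, and capping a requested value at the number of detected cores is an added assumption.

```r
library(parallel)

# Hypothetical helper (not part of wpRFPMS): use the requested number of
# workers, or all detected cores minus one when no value is given.
resolve_workers <- function(requested = NULL,
                            available = parallel::detectCores()) {
  if (is.na(available)) available <- 1L  # detectCores() can return NA
  if (is.null(requested)) {
    return(max(1L, available - 1L))
  }
  # Assumption: never ask for more workers than there are cores
  max(1L, min(as.integer(requested), available))
}
```

For example, resolve_workers(5, available = 8) yields 5, while resolve_workers(NULL, available = 8) yields 7.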

To run the wpRFPMS script, change root_path to the directory where the script was copied.
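
As a rough sketch of that last step, the snippet below sets root_path and checks that the two files described on this page can be found under it. The path shown is a placeholder, missing_files() is a hypothetical helper, and the assumption that input.R and config.R sit directly in root_path should be checked against the file structure of your copy.

```r
# Placeholder: replace with the directory holding your copy of the scripts.
root_path <- "/path/to/wpRFPMS"

# Hypothetical check (not part of wpRFPMS): return the expected files that
# are missing under root; assumes they sit directly in that directory.
missing_files <- function(root, expected = c("input.R", "config.R")) {
  expected[!file.exists(file.path(root, expected))]
}

# missing_files(root_path) returns character(0) when both files are found.
```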