Pseudo tiling of rasters

I am currently working on a project that involves machine learning to predict of ecological attributes across landscapes at fine spatial resolution. Our current list is consists of 80 covariates stacked together. To make predictions across the landscape significant tiling of the raster data is needed. However, the tiling process itself takes significant time and creates a tiled dataset in the thousands of tiles… There is another way!

Pseudo-tiles effectively run tiling opperations without having to generate numerous tile raster files1

Below I use two packages to create pseudo-tiles of the data.

  • GSIF::getSpatialTiles is used to generate a polygon layer of tiles covering the area of interest.
  • STARS::read_stars can be used to load only a specific portion of each raster in the dataset.
library(tidyverse)
library(sf)
library(rgdal)
library(stars)
library(GSIF)
library(tmap)

Step 1: Define the tiles

Locate the data

I have previously confirmed that all of these layers stack (i.e. raster::stack(cv_list))

## list of rasters 
  cv_list <- list.files("e:/tmpGIS/PEM_cvs/", pattern = "*.tif",full.names = TRUE)
  # cv <- list.files("c:/tmp", pattern = "*.tif", full.names = TRUE)
  cv_list <- cv_list[-(grep(cv_list, pattern = "xml"))] ## drop any associated xml files 
  print(paste("there are", length(cv_list), "covariates in the list"))
## [1] "there are 80 covariates in the list"

This next section is an adaptation of code provided by Tom Hengl.

## make tiles 
  # GSIF requires a GDAL object
  cv1 <- GDALinfo(cv_list[1]) ## just loads the header info 
  p_tiles <- GSIF::getSpatialTiles(cv1, block.x = 1000, return.SpatialPolygons = TRUE)
  p_tiles <- st_as_sf(p_tiles) %>% mutate(id = row_number())
  # Create tabular description of tiles -- generates a table of the extents of each tile 
  t_tiles  <- GSIF::getSpatialTiles(cv1, block.x = 1000, return.SpatialPolygons = FALSE) %>%
    mutate(id = row_number())
  
  p_tiles <- left_join(p_tiles, t_tiles)

The attribute data provides all the information needed to generate a pseudo-tile.

head(as.data.frame(p_tiles) %>% dplyr::select(-geometry))
##   id     xl      yl     xu      yu offset.y offset.x region.dim.y region.dim.x
## 1  1 552600 5983800 553600 5984800     4820        0          400          400
## 2  2 553600 5983800 554600 5984800     4820      400          400          400
## 3  3 554600 5983800 555600 5984800     4820      800          400          400
## 4  4 555600 5983800 556600 5984800     4820     1200          400          400
## 5  5 556600 5983800 557600 5984800     4820     1600          400          400
## 6  6 557600 5983800 558600 5984800     4820     2000          400          400

Step 2: Load Pseudo-tile

So other then defining what the tiles should be no tiles have actually been created. Now with the use of the stars package a tile can be loaded by specifying the pixel locations within the raster. Below tile number 167 is loaded

r <- read_stars(cv_list, 
                RasterIO = list(nXOff  = t_tiles[167, "offset.x"],
                                nYOff  = t_tiles[167, "offset.y"],
                                nXSize = t_tiles[167, "region.dim.x"],
                                nYSize = t_tiles[167, "region.dim.y"]))
plot(r[1])
A tile extracted from a list of 80 covariate rasters

Figure 1: A tile extracted from a list of 80 covariate rasters

Next steps

Modeling

For modeling a stars object can be simply converted to a data.frame, be passed to a model, and then returned to the stars object. Hmmmn … not sure these steps are even needed … might be able to pass the object directly to a model. If not then the below will work.

df.r <- as.data.frame  ## convert to data.frame
df.r$uniform <- runif(nrow(df.r))  ## create a new attribute 
r$uniform <- df.r$uniform  ## assign the attribute back to the stars object. 

Build function

done!

I added tile_index() to the PEMgeneratr package. It takes a raster, and produces tiles of a specified pixel count (x by y). Here is an example:

t <- PEMgeneratr::tile_index(cv_list[1], 1000)

map <- tm_shape(t) + tm_polygons(alpha = 0.25)
tmap_leaflet(map)

  1. my definition.

Colin Chisholm RPF
Colin Chisholm RPF
Forest Manager

Interested in forests and ecology