Loading Multi-band Satellite Images in TensorFlow Data Pipelines

Viraj Kadam
Sep 1, 2022


Using custom functions to preprocess input data in tf.data pipelines can be daunting if you have not done it before. This article walks through one way to do it.

A significant advantage of using custom functions in tf.data pipelines is that you are not limited to the TensorFlow ecosystem: you can call functions from any other Python module inside the pipeline.

The Data Pipeline

Our task is to build a pipeline that takes a directory path as input, where each directory contains 12 rasters, and outputs a single 12-band image.

We will design a function that takes the Sentinel-2 bands and builds a 12-band image from them, after preprocessing each band.

We will use rasterio to load each raster band, rescale its values using a scaling value of 10000, and resize it so that all bands share a common height and width.

After doing this for all 12 bands, we stack the bands together and return the multi-band raster.

Let's take a look at how to build this pipeline.

First, a function reads in a single raster band and returns it as a NumPy array.
A second function takes all 12 bands in the directory, preprocesses each one, and stacks them, returning a multi-band raster.
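
The stacking step could look roughly like the sketch below. The signature is an assumption: the band reader is passed in as the hypothetical `read_band` argument so that the snippet stands on its own, and the `*.tif` pattern and alphabetical band order are illustrative choices rather than details from the original notebook.

```python
from pathlib import Path

import numpy as np


def load_s2_tiffs(folder, read_band, pattern="*.tif"):
    """Read every band raster in `folder` and stack them channels-last.

    `read_band` is any callable that maps a file path to a 2-D array;
    all arrays it returns must share the same height and width.
    """
    # Sort the paths so the band order is identical for every sample.
    band_paths = sorted(Path(folder).glob(pattern))
    if not band_paths:
        raise FileNotFoundError(f"no rasters matching {pattern!r} in {folder}")
    bands = [read_band(str(p)) for p in band_paths]
    # Each band is (height, width); stacking on the last axis gives
    # (height, width, channels), the layout Keras layers expect by default.
    return np.stack(bands, axis=-1).astype(np.float32)
```

Whatever order is chosen, it has to stay the same between training and inference, since the model only sees channel positions and not band names.
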
A wrapper function around load_s2_tiffs decodes the folder paths. This is important because, inside a TensorFlow data pipeline, the paths arrive as string tensors rather than Python strings.
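
As a rough illustration of the decoding step, the sketch below turns the string tensor back into a plain Python `str` before calling the loader. The name `load_from_tensor` and the `loader` parameter are hypothetical; the sketch assumes it is called eagerly, which is the case inside `tf.py_function`.

```python
import numpy as np
import tensorflow as tf


def load_from_tensor(folder, loader):
    """Decode a tf.string path tensor and hand a plain str to `loader`."""
    # Inside tf.py_function the argument is an eager tensor. Calling
    # .numpy() on a string tensor yields Python bytes, which have to be
    # decoded before libraries such as rasterio or pathlib can use them.
    folder_str = folder.numpy().decode("utf-8")
    return np.asarray(loader(folder_str), dtype=np.float32)
```
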
Finally, a function built on tf.py_function does the loading and preprocessing for us. The arguments of tf.py_function are 1) func, the Python function to call, 2) inp, the list of input tensors passed to it, and 3) Tout, the data type(s) of the output.
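
The following sketch shows how the three arguments fit together. The helper `_load` is a stand-in that returns zeros instead of reading rasters, and `IMG_SHAPE` and `tf_load_image` are assumed names and values used only for illustration.

```python
import numpy as np
import tensorflow as tf

IMG_SHAPE = (256, 256, 12)  # assumed (height, width, bands)


def _load(folder):
    # Stand-in for the real loader: decode the path, then return an array.
    # A real implementation would read and stack the rasters in this folder.
    _ = folder.numpy().decode("utf-8")
    return np.zeros(IMG_SHAPE, dtype=np.float32)


def tf_load_image(folder):
    """Wrap the Python loader so it can be used inside Dataset.map."""
    # func: the Python callable, inp: the tensors passed to it,
    # Tout: the dtype of the value it returns.
    image = tf.py_function(func=_load, inp=[folder], Tout=tf.float32)
    # py_function does not carry static shape information in graph mode,
    # so set it explicitly for batching and for downstream layers.
    image.set_shape(IMG_SHAPE)
    return image
```

One trade-off to keep in mind is that code inside `tf.py_function` runs as ordinary Python, so it is not compiled into the graph and is subject to the Python global interpreter lock.
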

Putting it all together

The data pipeline function returns a dataset of preprocessed image and label pairs. The preprocessing function and the augmentation function can be passed to it as arguments.
The key arguments to this function are the list of directory paths, the list of labels, and the preprocessing function used for loading and preprocessing the rasters.
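
A pipeline function along these lines might be sketched as follows. The name `build_dataset`, its parameter names, and the default batch size are assumptions; `load_fn` is expected to map a path tensor to an image tensor, and `augment_fn`, if given, to map an (image, label) pair to a new pair.

```python
import tensorflow as tf


def build_dataset(folders, labels, load_fn, batch_size=8, augment_fn=None):
    """Return a batched tf.data.Dataset of (image, label) pairs."""
    # Pair each directory path with its label.
    ds = tf.data.Dataset.from_tensor_slices((folders, labels))
    # Load and preprocess the image; the label passes through unchanged.
    ds = ds.map(
        lambda folder, label: (load_fn(folder), label),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    if augment_fn is not None:
        ds = ds.map(augment_fn, num_parallel_calls=tf.data.AUTOTUNE)
    # Prefetch so the next batch is prepared while the current one is used.
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

Batching only works if every image has the same static shape, which is why the resizing step in the band reader matters.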

Finally, let's check that the function works as expected by inspecting the batch dimensions.

The shape is interpreted as (batch_size, height, width, channels).
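
A check of this kind can be sketched by pulling a single batch from the dataset. The dataset below is a synthetic stand-in made of zeros, with assumed sizes, so the snippet runs without any rasters on disk.

```python
import tensorflow as tf

# Synthetic stand-in for the real dataset: 10 images of 64 x 64 pixels
# with 12 bands each, plus one integer label per image.
images = tf.zeros((10, 64, 64, 12), tf.float32)
labels = tf.zeros((10,), tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

# take(1) yields only the first batch, which is enough to inspect shapes.
for image_batch, label_batch in dataset.take(1):
    print(image_batch.shape)  # (4, 64, 64, 12)
    print(label_batch.shape)  # (4,)
```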

The complete notebook is available here: Flood Detection using S1 and S2 images
