IPython/Jupyter Extensions

The pyrasterframes.rf_ipython module injects a number of visualization extensions into the IPython environment, enhancing visualization of DataFrames and Tiles.

By default, the result of the last expression in an IPython cell is passed to the IPython.display.display function. This function looks up a DisplayFormatter associated with the type, which converts the instance to a display-appropriate representation based on MIME type. For example, a DisplayFormatter may provide a text/plain version for the IPython shell and a text/html version for a Jupyter Notebook.
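As a rough illustration of this mechanism, the sketch below shows the two usual ways a type can gain a rich representation in IPython: by defining a `_repr_html_` method on a class you own, or by registering a function with the text/html formatter for a type you do not own. `TinyGrid` and `register_html_renderer` are hypothetical names used only for this example; this is not a description of how rf_ipython is implemented.

```python
import html


class TinyGrid:
    """Hypothetical stand-in for a tile-like object (not a RasterFrames class)."""

    def __init__(self, cells):
        self.cells = cells  # list of rows, each a list of numbers

    def __repr__(self):
        # Serves as the text/plain representation, e.g. in the IPython shell.
        return "TinyGrid({}x{})".format(len(self.cells), len(self.cells[0]))

    def _repr_html_(self):
        # Picked up by IPython's text/html formatter, e.g. in a Jupyter Notebook.
        body = "".join(
            "<tr>" + "".join("<td>{}</td>".format(html.escape(str(c))) for c in row) + "</tr>"
            for row in self.cells
        )
        return "<table>" + body + "</table>"


def register_html_renderer(cls, func):
    """Attach an HTML renderer to a type we do not own (hypothetical helper).

    Returns False when IPython is not installed or no shell is active.
    """
    try:
        from IPython import get_ipython
    except ImportError:
        return False
    shell = get_ipython()
    if shell is None:  # plain Python interpreter, nothing to register with
        return False
    shell.display_formatter.formatters["text/html"].for_type(cls, func)
    return True


grid = TinyGrid([[1, 2], [3, 4]])
print(repr(grid))            # plain-text form
print(grid._repr_html_())    # HTML form a notebook would render
```

In a notebook, evaluating `grid` as the last expression of a cell would show the HTML table, while the plain IPython shell would fall back to the `__repr__` text.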

This will be our setup for the following examples:

from pyrasterframes import *
from pyrasterframes.rasterfunctions import *
from pyrasterframes.utils import create_rf_spark_session
import pyrasterframes.rf_ipython
from IPython.display import display
import os.path
spark = create_rf_spark_session()
def scene(band):
    b = str(band).zfill(2) # converts int 2 to '02'
    return 'https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/' \
             'MCD43A4.A2019059.h11v08.006.2019072203257_B{}.TIF'.format(b)
rf = spark.read.raster(scene(2), tile_dimensions=(256, 256))

Tile Samples

We have some convenience methods to quickly visualize tiles (see the discussion of the RasterFrame schema for an introduction to the concept) when inspecting a subset of the data in a Notebook.

In an IPython or Jupyter interpreter, a Tile object will be displayed as an image with limited metadata.

sample_tile = rf.select(rf_tile('proj_raster').alias('tile')).first()['tile']
sample_tile # or `display(sample_tile)`

DataFrame Samples

Within an IPython or Jupyter interpreter, Spark and Pandas DataFrames containing a column of tiles are rendered using the tile samples discussed above. Simply import the rf_ipython submodule to enable enhanced HTML rendering of these DataFrame types.

rf # or `display(rf)`, or `rf.display()`

Showing only top 5 rows.

proj_raster_path proj_raster
https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF
https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF
https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF
https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF
https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF

Changing Number of Rows

By default, the RasterFrame sample display renders 5 rows. Because the IPython.display.display function doesn’t pass parameters to the underlying rendering functions, we have to provide a different means of passing parameters to the rendering code. The Pandas approach to this is to use global settings via set_option/get_option. We take a more functional approach and have the user invoke an explicit display method:

rf.display(num_rows=1, truncate=True)

Showing only top 1 rows.

proj_raster_path proj_raster
https://modis-pds.s3.amazonaws.com/MCD43...
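The difference between the two approaches described above can be sketched in plain Python. Everything here is hypothetical and for illustration only: `_OPTIONS`, `set_option`, `render_global`, `render`, and the `width` cut-off are assumptions of this example, not part of RasterFrames or Pandas.

```python
# Hypothetical module-level options, in the style of a global settings registry.
_OPTIONS = {"num_rows": 5, "truncate": False}


def set_option(name, value):
    """Change a global option; it affects every later call to render_global."""
    if name not in _OPTIONS:
        raise KeyError(name)
    _OPTIONS[name] = value


def render(rows, num_rows=5, truncate=False, width=40):
    """Render with explicit arguments; `width` is an illustrative cut-off."""
    shown = rows[:num_rows]
    if truncate:
        shown = [r if len(r) <= width else r[:width] + "..." for r in shown]
    return "\n".join(shown)


def render_global(rows):
    """Render using whatever the global options currently say."""
    return render(rows, _OPTIONS["num_rows"], _OPTIONS["truncate"])


rows = ["a" * 50, "b" * 10, "c" * 10]

# Explicit style: the parameters travel with the call and nothing else changes.
print(render(rows, num_rows=1, truncate=True))

# Global style: the output depends on state set somewhere else in the session.
print(render_global(rows))
```

With the explicit style, two calls in the same notebook can use different settings without interfering with each other, which is the motivation for an explicit display method.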

Pandas

The rf_ipython module injects similar rendering support into Pandas, for Pandas DataFrames that contain Tiles:

# Limit copy of data from Spark to a few tiles.
pandas_df = rf.select(rf_tile('proj_raster'), rf_extent('proj_raster')).limit(4).toPandas()
pandas_df # or `display(pandas_df)`

rf_tile(proj_raster) rf_extent(proj_raster)
0 (-7072005.3050801195, 993342.4642358534, -6953397.249648972, 1111950.519667)
1 (-7546437.526804707, 163086.07621782666, -7427829.47137356, 281694.1316489733)
2 (-6834789.194217826, 281694.1316489733, -6716181.138786679, 400302.18708011997)
3 (-7427829.47137356, 163086.07621782666, -7309221.415942413, 281694.1316489733)

Sample Colorization

RasterFrames uses the “Viridis” color ramp as the default color profile for tile columns. Other options are available for controlling how color is applied in the results.
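For readers unfamiliar with the term, a color ramp maps each cell value to a color by its position between the minimum and maximum value. The sketch below is a simplified, piecewise-linear illustration using three approximate Viridis anchor colors; `VIRIDIS_STOPS` and `ramp_color` are hypothetical names, and this is not how RasterFrames renders tiles internally.

```python
# Three approximate anchor colors of the Viridis ramp (dark purple, teal, yellow).
VIRIDIS_STOPS = [(68, 1, 84), (33, 145, 140), (253, 231, 37)]


def ramp_color(value, lo, hi, stops=VIRIDIS_STOPS):
    """Map `value` in [lo, hi] to an (r, g, b) tuple by linear interpolation."""
    if hi == lo:
        return stops[0]  # degenerate range: every cell gets the first color
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)  # clamp to [0, 1]
    pos = t * (len(stops) - 1)          # position along the list of stops
    i = min(int(pos), len(stops) - 2)   # index of the lower neighboring stop
    frac = pos - i                      # distance from the lower stop
    a, b = stops[i], stops[i + 1]
    return tuple(round(x + (y - x) * frac) for x, y in zip(a, b))


# Colorize a tiny 2x2 grid of cell values spanning 0 to 100.
cells = [[0, 50], [100, 120]]
colored = [[ramp_color(v, 0, 100) for v in row] for row in cells]
print(colored)
```

Values outside the range are clamped to the end colors, so the last cell (120) receives the same yellow as 100.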

Color Composite

As shown in the Writing Raster Data section, composites can be constructed for visualization:

from IPython.display import Image # For telling IPython how to interpret the PNG byte array
# Select red, green, and blue, respectively
three_band_rf = spark.read.raster(source=[[scene(1), scene(4), scene(3)]])
composite_rf = three_band_rf.withColumn('png',
                    rf_render_png('proj_raster_0', 'proj_raster_1', 'proj_raster_2'))
png_bytes = composite_rf.select('png').first()['png'] 
Image(png_bytes)

Custom Color Ramp

You can also apply a different color ramp to a single-channel Tile using the rf_render_color_ramp_png function. See the function documentation for information about the available color maps.

rf.select(rf_render_color_ramp_png('proj_raster', 'Magma'))

Showing only top 5 rows.

rf_render_png(proj_raster)