IPython/Jupyter Extensions
The pyrasterframes.rf_ipython
module injects a number of visualization extensions into the IPython environment, enhancing visualization of DataFrame
s and Tile
s.
By default, the last expression’s result in a IPython cell is passed to the IPython.display.display
function. This function in turn looks for a DisplayFormatter
associated with the type, which in turn converts the instance to a display-appropriate representation, based on MIME type. For example, each DisplayFormatter
may plain/text
version for the IPython shell, and a text/html
version for a Jupyter Notebook.
This will be our setup for the following examples:
from pyrasterframes import *
from pyrasterframes.rasterfunctions import *
from pyrasterframes.utils import create_rf_spark_session
import pyrasterframes.rf_ipython
from IPython.display import display
import os.path
spark = create_rf_spark_session()
def scene(band):
b = str(band).zfill(2) # converts int 2 to '02'
return 'https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/' \
'MCD43A4.A2019059.h11v08.006.2019072203257_B{}.TIF'.format(b)
rf = spark.read.raster(scene(2), tile_dimensions=(256, 256))
Tile Samples
We have some convenience methods to quickly visualize tiles (see discussion of the RasterFrame schema for orientation to the concept) when inspecting a subset of the data in a Notebook.
In an IPython or Jupyter interpreter, a Tile
object will be displayed as an image with limited metadata.
sample_tile = rf.select(rf_tile('proj_raster').alias('tile')).first()['tile']
sample_tile # or `display(sample_tile)`
DataFrame Samples
Within an IPython or Jupyter interpreter, a Spark and Pandas DataFrames containing a column of tiles will be rendered as the samples discussed above. Simply import the rf_ipython
submodule to enable enhanced HTML rendering of these DataFrame types.
rf # or `display(rf)`, or `rf.display()`
Showing only top 5 rows.
Changing Number of Rows
By default the RasterFrame sample display renders 5 rows. Because the IPython.display.display
function doesn’t pass parameters to the underlying rendering functions, we have to provide a different means of passing parameters to the rendering code. Pandas approach to this is to use global settings via set_option
/get_option
. We take a more functional approach and have the user invoke an explicit display
method:
rf.display(num_rows=1, truncate=True)
Showing only top 1 rows.
proj_raster_path | proj_raster |
---|---|
https://modis-pds.s3.amazonaws.com/MCD43... |
Pandas
There is similar rendering support injected into the Pandas by the rf_ipython
module, for Pandas Dataframes having Tiles in them:
# Limit copy of data from Spark to a few tiles.
pandas_df = rf.select(rf_tile('proj_raster'), rf_extent('proj_raster')).limit(4).toPandas()
pandas_df # or `display(pandas_df)`
rf_tile(proj_raster) | rf_extent(proj_raster) | |
---|---|---|
0 | (-7072005.3050801195, 993342.4642358534, -6953397.249648972, 1111950.519667) | |
1 | (-7546437.526804707, 163086.07621782666, -7427829.47137356, 281694.1316489733) | |
2 | (-6834789.194217826, 281694.1316489733, -6716181.138786679, 400302.18708011997) | |
3 | (-7427829.47137356, 163086.07621782666, -7309221.415942413, 281694.1316489733) |
Sample Colorization
RasterFrames uses the “Viridis” color ramp as the default color profile for tile column. There are other options for reasoning about how color should be applied in the results.
Color Composite
As shown in Writing Raster Data section section, composites can be constructed for visualization:
from IPython.display import Image # For telling IPython how to interpret the PNG byte array
# Select red, green, and blue, respectively
three_band_rf = spark.read.raster(source=[[scene(1), scene(4), scene(3)]])
composite_rf = three_band_rf.withColumn('png',
rf_render_png('proj_raster_0', 'proj_raster_1', 'proj_raster_2'))
png_bytes = composite_rf.select('png').first()['png']
Image(png_bytes)
Custom Color Ramp
You can also apply a different color ramp to a single-channel Tile using the rf_render_color_ramp_png
function. See the function documentation for information about the available color maps.
rf.select(rf_render_color_ramp_png('proj_raster', 'Magma'))
Showing only top 5 rows.
rf_render_png(proj_raster) |
---|