Release Notes


  • Upgraded to Spark 2.2.0 (now required)
  • Introduced separate rasterframes-datasource library for hosting sources from which to read RasterFrames.
  • Implemented basic temporal and spatial filter predicate push-down feature for the GeoTrellis layer datasource.
  • Added a GeoTrellis “Catalog” DataSource for inspecting available layers and associated metadata at a URI.
  • Added a GeoTiff DataSource for directly loading a (preferably Cloud Optimized) GeoTiff as a RasterFrame, each row containing tiles as they are internally organized.
  • Fleshed out support for MultibandTile and TileFeature in datasource.
  • Added withTemporalComponent convenience method for appending a temporal key column with a constant value.
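
The predicate push-down item above can be pictured with a small, library-independent sketch: the filter is applied to tile keys at the source, so tiles outside the requested spatial and temporal bounds are never loaded. The names TileKey and prune_keys are hypothetical and are not part of the RasterFrames or GeoTrellis APIs.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TileKey:
    """Hypothetical space-time key: grid column, grid row, and instant."""
    col: int
    row: int
    instant: datetime


def prune_keys(keys, col_range, row_range, time_range):
    """Keep only the keys inside the inclusive spatial and temporal bounds.

    A reader would call this before fetching tile data, so non-matching
    tiles are skipped at the source instead of being filtered afterwards.
    """
    (c0, c1), (r0, r1), (t0, t1) = col_range, row_range, time_range
    return [
        k for k in keys
        if c0 <= k.col <= c1 and r0 <= k.row <= r1 and t0 <= k.instant <= t1
    ]
```

Only the keys returned by prune_keys would then be passed on to whatever routine actually reads tile data.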



  • Added withSpatialIndex to introduce a column assigning a z-curve index value based on the tile’s centroid in EPSG:4326.
  • Added column-appending convenience methods: withExtent, withCenter, withCenterLatLng
  • Documented example of creating a GeoTrellis layer from a RasterFrame.
  • Added Spark 2.2.0 forward-compatibility
  • Upgraded to GeoTrellis 1.2.0-RC2
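
As a rough illustration of what a z-curve index over a tile centroid means, the sketch below interleaves the bits of a longitude/latitude pair in EPSG:4326. It is a generic Morton encoding, not the implementation behind withSpatialIndex; the function names and the 16-bit resolution are assumptions made for the example.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x (even positions) and y (odd positions)."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def centroid(xmin: float, ymin: float, xmax: float, ymax: float):
    """Center point of an extent given as (xmin, ymin, xmax, ymax)."""
    return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)


def z_index(lon: float, lat: float, bits: int = 16) -> int:
    """Map a lon/lat point in degrees to a Z-curve (Morton) index."""
    cells = (1 << bits) - 1
    # Normalize each coordinate to [0, 1], then scale onto the integer grid.
    col = int((lon + 180.0) / 360.0 * cells)
    row = int((lat + 90.0) / 180.0 * cells)
    return interleave_bits(col, row, bits)
```

Points that are close together on the ground tend to receive nearby index values, which is what makes such a column useful for sorting or partitioning.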


  • Significant performance improvement in explodeTiles (1-2 orders of magnitude). See #38
  • Fixed bugs in NoData handling when converting to Double tiles.
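
The kind of conversion involved in the NoData fix can be sketched as follows, assuming the GeoTrellis convention that integer tiles mark NoData with the minimum 32-bit value while Double tiles use NaN. The helper name is hypothetical and the code is independent of the library.

```python
import math

# Assumption: mirrors the GeoTrellis convention of Int.MinValue as integer NoData.
INT_NODATA = -2**31


def int_cells_to_double(cells, nodata=INT_NODATA):
    """Convert integer cells to floats, mapping the NoData sentinel to NaN.

    A plain numeric cast would turn the sentinel into a large negative
    value that downstream statistics would treat as real data.
    """
    return [math.nan if c == nodata else float(c) for c in cells]
```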


  • Upgraded to shapeless 2.3.2
  • Fixed #36, #37


  • Ported to sbt 1.0.3
  • Added sbt-generated astraea.spark.rasterframes.RFBuildInfo
  • Fixed bug in computing aggMean when one or more tiles are null
  • Deprecated rfInit in favor of SparkSession.withRasterFrames or SQLContext.withRasterFrames extension methods
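
For the aggMean fix listed above, a null-tolerant aggregate mean might look like the sketch below. It is illustrative only: tiles are modeled as plain lists of floats, None stands in for a null tile, and NaN stands in for NoData cells; none of this reflects the actual aggregate implementation.

```python
import math


def agg_mean(tiles):
    """Mean over all data cells of all tiles, skipping null tiles and NaN cells."""
    total = 0.0
    count = 0
    for tile in tiles:
        if tile is None:
            # A null tile contributes nothing instead of raising an error.
            continue
        for cell in tile:
            if math.isnan(cell):
                continue
            total += cell
            count += 1
    # With no data cells at all, the mean is undefined.
    return total / count if count else math.nan
```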


  • Upgraded to GeoTrellis 1.2.0-RC1
  • Added REPLesent-based tour of RasterFrames
  • Moved Giter8 template to separate repository s22s/raster-frames.g8 due to sbt limitations
  • Updated Getting Started to reference new Giter8 repo
  • Renamed SQL functions rf_stats and rf_histogram to rf_aggStats and rf_aggHistogram for consistency with the DataFrames API


  • Created faster implementation of aggregate statistics.
  • Fixed bug in deserialization of TileUDTs originating from ConstantTiles
  • Fixed bug in serialization of NoDataFilter within SparkML pipeline
  • Refactoring of UDF organization
  • Various documentation tweaks and updates
  • Added Giter8 template


  • TileUDFs are now encoded directly into Catalyst, without Kryo, resulting in a dramatic decrease in serialization time for small tiles (int8, <= 128²) and a substantial speedup for all other cell types except float32 (marginally slower). Memory footprint was not measured but is expected to have decreased.


  • aggStats and tileMean functions rewritten to compute simple statistics directly rather than using StreamingHistogram
  • tileHistogramDouble and tileStatsDouble were replaced by tileHistogram and tileStats
  • Added tileSum, tileMin and tileMax functions
  • Added aggMean, aggDataCells and aggNoDataCells aggregate functions.
  • Added localAggDataCells and localAggNoDataCells cell-local (tile generating) functions
  • Added tileToArray and arrayToTile
  • Overflow fix in LocalStatsAggregateFunction
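
The idea of computing simple statistics directly, rather than deriving them from a histogram, can be sketched as a single-pass accumulator that also tracks data and NoData cell counts. The Stats class and its fields are hypothetical; tiles are modeled as lists of floats with NaN standing in for NoData.

```python
import math
from dataclasses import dataclass


@dataclass
class Stats:
    """Running statistics accumulated one tile at a time."""
    data_cells: int = 0
    nodata_cells: int = 0
    total: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    def add_tile(self, tile):
        """Fold one tile into the running statistics and return self."""
        for cell in tile:
            if math.isnan(cell):
                self.nodata_cells += 1
            else:
                self.data_cells += 1
                self.total += cell
                self.minimum = min(self.minimum, cell)
                self.maximum = max(self.maximum, cell)
        return self

    @property
    def mean(self):
        """Mean of the data cells, or NaN when there are none."""
        return self.total / self.data_cells if self.data_cells else math.nan
```

Because each tile is folded in with a handful of arithmetic operations, no intermediate histogram needs to be built or merged.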