Release Notes
0.9.x
0.9.1
- Upgraded to Spark 2.4.7
- Added `pyspark.sql.DataFrame.display(num_rows:int, truncate:bool)` extension method when `rf_ipython` is imported.
- Added users’ manual section on IPython display enhancements.
- Added `method_name` parameter to the `rf_resample` method.
- BREAKING: In SQL, the function `rf_resample` now takes 3 arguments. You can use `rf_resample_nearest` with two arguments or refactor to `rf_resample(t, v, "nearest")`.
- Added resample method parameter to SQL and Python APIs. See updated docs.
- Upgraded many of the pyrasterframes dependencies, including: `descartes`, `fiona`, `folium`, `geopandas`, `matplotlib`, `numpy`, `pandas`, `rasterio`, `shapely`
- Changed `rasterframes.prefer-gdal` configuration parameter to default to `False`, as JVM GeoTIFF performs just as well for COGs as the GDAL one.
- Fixed #545.
0.9.0
- Upgraded to GeoTrellis 3.3.0. This includes a number of breaking changes enumerated as a part of the PR’s change log. These include:
  - Add `Int` type parameter to `Grid`
  - Add `Int` type parameter to `CellGrid`
  - Add `Int` type parameter to `GridBounds`… or `TileBounds`
  - Use `GridBounds.toGridType` to coerce from `Int` to `Long` type parameter
  - Update imports for layers, particularly `geotrellis.spark.tiling` to `geotrellis.layer`
  - Update imports for `geotrellis.spark.io` to `geotrellis.spark.store...`
  - Removed `FixedRasterExtent`
  - Removed `FixedDelegatingTile`
  - Removed `org.locationtech.rasterframes.util.Shims`
  - Change `Extent.jtsGeom` to `Extent.toPolygon`
  - Change `TileLayerMetadata.gridBounds` to `TileLayerMetadata.tileBounds`
  - Add `geotrellis-gdal` dependency
  - Remove any conversions between JTS geometry and old `geotrellis.vector` geometry
  - Changed `org.locationtech.rasterframes.encoders.StandardEncoders.crsEncoder` to `crsSparkEncoder`
  - Change `(cols, rows)` dimension destructuring to `Dimensions(cols, rows)`
  - Revisit use of `Tile` equality since it’s more strict
  - Update `reference.conf` to use `geotrellis.raster.gdal` namespace.
  - Replace all uses of `TileDimensions` with `geotrellis.raster.Dimensions[Int]`.
- Upgraded to `gdal-warp-bindings` 1.0.0.
- Upgraded to Spark 2.4.5
- Formally abandoned support for Python 2. Python 2 is dead. Long live Python 2.
- Introduction of type hints in Python API.
- Add functions for changing cell values based on either conditions or to achieve a distribution of values. (#449)
  - Add `rf_local_min`, `rf_local_max`, and `rf_local_clip` functions.
  - Add cell value scaling functions `rf_rescale` and `rf_standardize`.
  - Add `rf_where` function, similar in spirit to numpy’s `where`, or a cell-wise version of Spark SQL’s `when` and `otherwise`.
- Add `rf_sqrt` function to compute cell-wise square root.
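The cell-wise behavior of the conditional and clipping functions listed above can be illustrated with a minimal plain-Python sketch. It models tiles as nested lists and ignores NoData handling and cell types. The helper names `where_cells` and `clip_cells` are hypothetical and are not part of the RasterFrames API.

```python
from typing import List

Tile = List[List[float]]  # hypothetical stand-in for a raster tile


def where_cells(condition: Tile, x: Tile, y: Tile) -> Tile:
    """Pick x[r][c] where condition[r][c] is non-zero, else y[r][c]."""
    return [
        [xv if cv else yv for cv, xv, yv in zip(c_row, x_row, y_row)]
        for c_row, x_row, y_row in zip(condition, x, y)
    ]


def clip_cells(tile: Tile, lower: float, upper: float) -> Tile:
    """Limit every cell to the closed range [lower, upper]."""
    return [[min(max(v, lower), upper) for v in row] for row in tile]


cond = [[1, 0], [0, 1]]
a = [[10.0, 20.0], [30.0, 40.0]]
b = [[-1.0, -2.0], [-3.0, -4.0]]

picked = where_cells(cond, a, b)     # [[10.0, -2.0], [-3.0, 40.0]]
clipped = clip_cells(a, 15.0, 35.0)  # [[15.0, 20.0], [30.0, 35.0]]
```

This mirrors the selection rule of numpy’s `where`: the condition chooses, cell by cell, which of the two inputs contributes the output value.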
0.8.x
0.8.5
- Added `rf_z2_index` for constructing a Z2 index on types with bounds.
- Breaking: `rf_spatial_index` renamed `rf_xz2_index` to differentiate between XZ2 and Z2 variants.
- Added `withSpatialIndex` to RasterSourceDataSource to pre-partition tiles based on tile extents mapped to a Z2 space-filling curve
- Add `rf_mask_by_bit`, `rf_mask_by_bits` and `rf_local_extract_bits` to deal with bit packed quality masks. Updated the masking documentation to demonstrate the use of these functions.
- Added `toDF` extension method to `MultibandGeoTiff`
- Added `rf_agg_extent` and `rf_agg_reprojected_extent` to compute the aggregate extent of a column
- Added `rf_proj_raster` for constructing a `proj_raster` structure from individual CRS, Extent, and Tile columns.
- Added `rf_render_color_ramp_png` to compute PNG byte array for a single tile column, with specified color ramp.
- In `rf_ipython`, improved rendering of dataframe binary contents with PNG preamble.
- Throw an `IllegalArgumentException` when attempting to apply a mask to a `Tile` whose `CellType` has no NoData defined. (#409)
- Add `rf_agg_approx_quantiles` function to compute cell quantiles across an entire column.
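Bit-packed quality masks, which the bit-masking functions above address, store several flags in a single integer cell. The following is a minimal plain-Python sketch of the generic shift-and-mask technique for reading such flags. `extract_bits`, its parameter names, and the bit layout of the sample value are all assumptions made for illustration; this is not the RasterFrames implementation.

```python
def extract_bits(value: int, start_bit: int, num_bits: int = 1) -> int:
    """Return num_bits bits of value, starting at start_bit (0 = least significant)."""
    mask = (1 << num_bits) - 1
    return (value >> start_bit) & mask


# Hypothetical 8-bit QA value: bit 0 = fill flag, bits 2-3 = cloud confidence.
qa = 0b00001101

fill = extract_bits(qa, 0)      # 1
cloud = extract_bits(qa, 2, 2)  # 0b11 == 3
```

A mask can then be derived by comparing the extracted value against the flag values that should be treated as invalid.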
0.8.4
- Upgraded to Spark 2.4.4
- Add `rf_mask_by_values` and `rf_local_is_in` raster functions; added optional `inverse` argument to `rf_mask` functions. (#403, #384)
- Added forced truncation of WKT types in Markdown/HTML rendering. (#408)
- Add `rf_local_is_in` raster function. (#400)
- Added partitioning to catalogs before processing in RasterSourceDataSource (#397)
- Fixed bug where `rf_tile_dimensions` would cause unnecessary reading of tiles. (#394)
- Breaking (potentially): removed `GeoTiffCollectionRelation` due to usage limitation and overlap with `RasterSourceDataSource` functionality.
0.8.3
- Updated to GeoTrellis 2.3.3 and Proj4j 1.1.0.
- Fixed issues with `LazyLogger` and shading assemblies (#293)
- Updated `rf_crs` to accept string columns containing CRS specifications. (#366)
- Added `rf_spatial_index` function. (#368)
- Breaking (potentially): removed `pyrasterframes.create_spark_session` in lieu of `pyrasterframes.utils.create_rf_spark_session`
0.8.2
- Added ability to pass config options to convenience PySpark session constructor. (#361)
- Bumped Spark dependency to version 2.3.4. (#350)
- Fixed handling of aggregate extent and image size on GeoTIFF writing. (#362)
- Fixed issue with `RasterSourceDataSource` swallowing exceptions. (#267)
- Fixed SparkML memory pressure issue caused by unnecessary reevaluation, overallocation, and primitive boxing. (#343)
- Fixed Parquet serialization issue with `RasterRef`s (#338)
- Fixed `TileExploder`, `rf_agg_local_mean` and `TileColumnSupport` to support `proj_raster` struct (#287, #163, #333).
- Various documentation improvements.
- Breaking (potentially): Synchronized parameter naming in Python and Scala for `spark.read.raster` (#329).
0.8.1
- Added `rf_local_no_data`, `rf_local_data` and `rf_interpret_cell_type_as` raster functions.
- Added: `rf_rgb_composite` and `rf_render_png`.
- Added `toMarkdown` and `toHTML` extension methods for `DataFrame`, and registered them with the IPython formatter system when `rf_ipython` is imported.
- New documentation theme (thanks @jonas!).
- Fixed: Removed false return type guarantee in cases where an `Expression` accepts either `Tile` or `ProjectedRasterTile` (#295)
0.8.0
- Super-duper new Python-centric RasterFrames Users’ Manual!
- Upgraded to the following core dependencies: Spark 2.3.3, GeoTrellis 2.3.0, GeoMesa 2.2.1, JTS 1.16.0.
- Build `pyrasterframes` binary distribution for pip installation.
- Added support for rendering RasterFrame types in IPython/Jupyter.
- Added new tile functions `rf_round`, `rf_abs`, `rf_log`, `rf_log10`, `rf_log2`, `rf_log1p`, `rf_exp`, `rf_exp10`, `rf_exp2`, `rf_expm1`, `rf_resample`.
- Support Python-side Tile User-Defined Type backed by numpy `ndarray` or `ma.MaskedArray`.
- Support Python-side Shapely geometry User-Defined Type.
- SQL API support for `rf_assemble_tile` and `rf_array_to_tile`.
- Introduced at the source level the concept of a `RasterSource` and `RasterRef`, enabling lazy/delayed read of sub-scene tiles.
- Added `withKryoSerialization` extension methods on `SparkSession.Builder` and `SparkConf`.
- Added `rf_render_matrix` debugging function.
- Added `RasterFrameLayer.withExtent` extension method.
- Added `SinglebandGeoTiff.toDF` extension method.
- Added `DataFrame.rasterJoin` extension method for merging two dataframes with tiles in disparate CRSs.
- Added `rf_crs` for `ProjectedRasterTile` columns.
- Added `st_extent` (for `Geometry` types) and `rf_extent` (for `ProjectedRasterTile` and `RasterSource` columns).
- Added `st_geometry` (for `Extent` types) and `rf_geometry` (for `ProjectedRasterTile` and `RasterSource` columns).
- Reworked build scripts for RasterFrames Jupyter Notebook.
- Breaking: The type `RasterFrame` renamed `RasterFrameLayer` to reflect its intended purpose.
- Breaking: All `asRF` methods renamed to `asLayer`.
- Breaking: Root package changed from `astraea.spark.rasterframes` to `org.locationtech.rasterframes`.
- Breaking: Removed `envelope`, in lieu of `st_extent`, `rf_extent` or `st_envelope`
- Breaking: Renamed `rf_extent_geometry` to `st_geometry`
- Breaking: Renamed `rf_tile_dimensions` to `rf_dimensions`
- Breaking: Renamed `rf_reproject_geometry` to `st_reproject`
- Breaking: With the upgrade to JTS 1.16.0, all imports of `com.vividsolutions.jts` need to be changed to `org.locationtech.jts`.
- Deprecation: Tile column functions (in `RasterFunctions`) and SQL registered names have all been renamed to follow `snake_case` conventions, with an `rf_` prefix, matching SQL and Python. A temporary compatibility shim is included so that code built against 0.7.1 and earlier still works. These will be marked as deprecated.
- Breaking: In Scala and SQL, `..._scalar` functions (e.g. `local_add_scalar`) have been removed. Non-scalar forms now dynamically detect type of right hand side.
- Breaking: `tileToArray` has been replaced with `_tile_to_array_double` and `_tile_to_array_int`.
- Breaking: Renamed `bounds_geometry` to `rf_extent_geometry`.
- Breaking: renamed `agg_histogram` to `rf_agg_approx_histogram`, `local_agg_stats` to `rf_agg_local_stats`, `local_agg_max` to `rf_agg_local_max`, `local_agg_min` to `rf_agg_local_min`, `local_agg_mean` to `rf_agg_local_mean`, `local_agg_data_cells` to `rf_agg_local_data_cells`, `local_agg_no_data_cells` to `rf_agg_local_no_data_cells`.
- Breaking: `CellHistogram` no longer carries along approximate statistics, due to confusing behavior. Use `rf_agg_stats` instead.
- Introduced `LocalCellStatistics` class to wrap together results from `LocalStatsAggregate`.
- Breaking: `TileDimensions` moved from `astraea.spark.rasterframes` to `org.locationtech.rasterframes.model`.
- Breaking: Renamed `RasterFrame.withBounds` to `RasterFrameLayer.withGeometry` for consistency with DataSource schemas.
Known issues
- #188: Error on deserialization of a `Tile` with a `bool` cell type to the Python side; see issue description for work around.
0.7.x
0.7.1
- Fixed ColorRamp pipeline in MultibandRender
- Fixed Python wrapper for `explodeTiles`
0.7.0
- Now an incubating project under Eclipse Foundation LocationTech! GitHub repo moved to locationtech/rasterframes.
- PySpark support! See `pyrasterframes/python/README.rst` to get started.
- Exposed Spark JTS spatial operations in Python.
- Added RasterFrames-enabled Jupyter Notebook Docker Container package. See `deployment/README.md` for details.
- Updated to GeoMesa version 2.0.1.
- Added `convertCellType`, `normalizedDifference`, `mask` and `inverseMask` operations on tile columns.
- Added tile column + scalar operations: `localAddScalar`, `localSubtractScalar`, `localMultiplyScalar`, `localDivideScalar`
- Added `rasterize` and `reprojectGeometry` operations on geometry columns.
- Added support for writing GeoTIFFs from RasterFrames via `DataFrameWriter`.
- Added `spark.read.geotrellis.withNumPartitions(Int)` for setting the initial number of partitions to use when reading a layer.
- Added `spark.read.geotrellis.withTileSubdivisions(Int)` for evenly subdividing tiles before they become rows in a RasterFrame.
- Added `experimental` package for sandboxing new feature ideas.
- Added experimental GeoJSON DataSource with schema inference on feature properties.
- Added Scala, SQL, and Python tile-scalar arithmetic operations: `localAddScalar`, `localSubtractScalar`, `localMultiplyScalar`, `localDivideScalar`.
- Added Scala, SQL, and Python tile functions for logical comparisons, in both tile-tile and tile-scalar variants: `localLess`, `localLessEqual`, `localGreater`, `localGreaterEqual`, `localEqual`, and `localUnequal`.
- Added `SlippyExport` experimental feature for exporting the contents of a RasterFrame as a SlippyMap tile image directory structure and Leaflet/OpenMaps-enabled HTML file.
- Added experimental DataSource implementations for MODIS and Landsat 8 catalogs on AWS PDS.
- Change: Default interpolation for `toRaster` and `toMultibandRaster` has been changed from `Bilinear` to `NearestNeighbor`.
- Breaking: Renamed/moved `astraea.spark.rasterframes.functions.CellStatsAggregateFunction.Statistics` to `astraea.spark.rasterframes.stats.CellStatistics`.
- Breaking: `HistogramAggregateFunction` now generates the new type `astraea.spark.rasterframes.stats.CellHistogram`.
- Breaking: `box2D` renamed `envelope`.
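For readers unfamiliar with the normalized difference operation added in this release, here is a minimal plain-Python sketch of the usual cell-wise formula, `(a - b) / (a + b)`. Tiles are modeled as nested lists, the helper name `normalized_difference` is hypothetical, and returning `None` where the denominator is zero is an assumption of this sketch rather than documented RasterFrames behavior.

```python
from typing import List, Optional


def normalized_difference(
    a: List[List[float]], b: List[List[float]]
) -> List[List[Optional[float]]]:
    """Cell-wise (a - b) / (a + b); None where a + b is zero (assumed convention)."""
    return [
        [(av - bv) / (av + bv) if (av + bv) != 0 else None
         for av, bv in zip(a_row, b_row)]
        for a_row, b_row in zip(a, b)
    ]


# Hypothetical near-infrared and red band values for a 1x3 tile.
nir = [[0.75, 0.5, 0.0]]
red = [[0.25, 0.5, 0.0]]

ndvi = normalized_difference(nir, red)  # [[0.5, 0.0, None]]
```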
0.6.x
0.6.1
- Added support for reading striped GeoTiffs (#64).
- Moved extension methods associated with querying tagged columns to `DataFrameMethods` for supporting temporal and spatial columns on non-RasterFrame DataFrames.
- GeoTIFF and GeoTrellis DataSources automatically initialize RasterFrames.
- Added `RasterFrame.toMultibandRaster`.
- Added utility for rendering multiband tile as RGB composite PNG.
- Added `RasterFrame.withRFColumnRenamed` to lessen boilerplate in maintaining `RasterFrame` type tag.
0.6.0
- Upgraded to Spark 2.2.1. Added `VersionShims` to allow for Spark 2.1.x backwards compatibility.
- Introduced separate `rasterframes-datasource` library for hosting sources from which to read RasterFrames.
- Implemented basic (but sufficient) temporal and spatial filter predicate push-down feature for the GeoTrellis layer datasource.
- Added Catalyst expressions specifically for spatial relations, allowing for some polymorphism over JTS types.
- Added a GeoTrellis Catalog `DataSource` for inspecting available layers and associated metadata at a URI
- Added GeoTrellis Layer DataSource for reading GeoTrellis layers from any SPI-registered GeoTrellis backend (which includes HDFS, S3, Accumulo, HBase, Cassandra, etc.).
- Ability to save a RasterFrame as a GeoTrellis layer to any SPI-registered GeoTrellis backends. Multi-column RasterFrames are written as Multiband tiles.
- Added a GeoTiff DataSource for directly loading a (preferably Cloud Optimized) GeoTiff as a RasterFrame, each row containing tiles as they are internally organized.
- Fleshed out support for `MultibandTile` and `TileFeature` in datasource.
- Added typeclass for specifying merge operations on `TileFeature` data payload.
- Added `withTemporalComponent` convenience method for appending a temporal key column with constant value.
- Breaking: Renamed `withExtent` to `withBounds`, and now returns a JTS `Polygon`.
- Added `EnvelopeEncoder` for encoding JTS `Envelope` type.
- Refactored build into separate `core` and `docs`, paving way for `pyrasterframes` polyglot module.
- Added utility extension method `withPrefixedColumnNames` to `DataFrame`.
Known Issues
- Writing multi-column RasterFrames to GeoTrellis layers requires all tiles to be of the same cell type.
0.5.x
0.5.12
- Added `withSpatialIndex` to introduce a column assigning a z-curve index value based on the tile’s centroid in EPSG:4326.
- Added column-appending convenience methods: `withExtent`, `withCenter`, `withCenterLatLng`
- Documented example of creating a GeoTrellis layer from a RasterFrame.
- Added Spark 2.2.0 forward-compatibility.
- Upgraded to GeoTrellis 1.2.0-RC2.
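As background on the z-curve index mentioned above, the following is a minimal plain-Python sketch of a generic Z-order (Morton) index computed from a longitude/latitude point. The function names, the `bits` resolution parameter, and the normalization scheme are assumptions for illustration. The encoding RasterFrames actually uses may differ.

```python
def interleave_bits(x: int, y: int, bits: int) -> int:
    """Interleave the low `bits` bits of x (even positions) and y (odd positions)."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def z2_index(lon: float, lat: float, bits: int = 16) -> int:
    """Map a lon/lat point in EPSG:4326 to a Z-order curve index."""
    cells = 1 << bits
    # Normalize to integer grid coordinates, clamping the upper edge.
    x = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    y = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    return interleave_bits(x, y, bits)


origin_index = z2_index(-180.0, -90.0)         # 0
corner_index = z2_index(180.0, 90.0, bits=2)   # 0b1111 == 15
```

Because the bits of both coordinates are interleaved, points that are close in space tend to receive close index values, which is what makes such an index useful for partitioning.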
0.5.11
- Significant performance improvement in `explodeTiles` (1-2 orders of magnitude). See #38
- Fixed bugs in `NoData` handling when converting to `Double` tiles.
0.5.10
0.5.9
- Ported to sbt 1.0.3
- Added sbt-generated `astraea.spark.rasterframes.RFBuildInfo`
- Fixed bug in computing `aggMean` when one or more tiles are `null`
- Deprecated `rfIinit` in favor of `SparkSession.withRasterFrames` or `SQLContext.withRasterFrames` extension methods
0.5.8
- Upgraded to GeoTrellis 1.2.0-RC1
- Added `REPLsent`-based tour of RasterFrames
- Moved Giter8 template to separate repository `s22s/raster-frames.g8` due to sbt limitations
- Updated Getting Started to reference new Giter8 repo
- Changed SQL function names `rf_stats` and `rf_histogram` to `rf_aggStats` and `rf_aggHistogram` for consistency with DataFrames API
0.5.7
- Created faster implementation of aggregate statistics.
- Fixed bug in deserialization of `TileUDT`s originating from `ConstantTile`s
- Fixed bug in serialization of `NoDataFilter` within SparkML pipeline
- Refactoring of UDF organization
- Various documentation tweaks and updates
- Added Giter8 template
0.5.6
- `TileUDF`s are encoded directly into Catalyst, without Kryo, resulting in an insane decrease in serialization time for small tiles (`int8`, <= 128²), and a pretty awesome speedup for all other cell types other than `float32` (marginal slowing). While not measured, memory footprint is expected to have gone down.
0.5.5
- `aggStats` and `tileMean` functions rewritten to compute simple statistics directly rather than using `StreamingHistogram`
- `tileHistogramDouble` and `tileStatsDouble` were replaced by `tileHistogram` and `tileStats`
- Added `tileSum`, `tileMin` and `tileMax` functions
- Added `aggMean`, `aggDataCells` and `aggNoDataCells` aggregate functions.
- Added `localAggDataCells` and `localAggNoDataCells` cell-local (tile generating) functions
- Added `tileToArray` and `arrayToTile`
- Overflow fix in `LocalStatsAggregateFunction`