Release Notes
0.9.x
0.9.1
- Upgraded to Spark 2.4.7
- Added pyspark.sql.DataFrame.display(num_rows:int, truncate:bool) extension method when rf_ipython is imported.
- Added users’ manual section on IPython display enhancements.
- Added method_name parameter to the rf_resample method.
- BREAKING: In SQL, the function rf_resample now takes 3 arguments. You can use rf_resample_nearest with two arguments or refactor to rf_resample(t, v, "nearest").
- Added resample method parameter to SQL and Python APIs. See updated docs.
- Upgraded many of the pyrasterframes dependencies, including: descartes, fiona, folium, geopandas, matplotlib, numpy, pandas, rasterio, shapely
- Changed rasterframes.prefer-gdal configuration parameter to default to False, as JVM GeoTIFF performs just as well for COGs as the GDAL one.
- Fixed #545.
0.9.0
- Upgraded to GeoTrellis 3.3.0. This includes a number of breaking changes enumerated as a part of the PR’s change log. These include:
  - Add Int type parameter to Grid
  - Add Int type parameter to CellGrid
  - Add Int type parameter to GridBounds… or TileBounds
  - Use GridBounds.toGridType to coerce from Int to Long type parameter
  - Update imports for layers, particularly geotrellis.spark.tiling to geotrellis.layer
  - Update imports for geotrellis.spark.io to geotrellis.spark.store...
  - Removed FixedRasterExtent
  - Removed FixedDelegatingTile
  - Removed org.locationtech.rasterframes.util.Shims
  - Change Extent.jtsGeom to Extent.toPolygon
  - Change TileLayerMetadata.gridBounds to TileLayerMetadata.tileBounds
  - Add geotrellis-gdal dependency
  - Remove any conversions between JTS geometry and old geotrellis.vector geometry
  - Changed org.locationtech.rasterframes.encoders.StandardEncoders.crsEncoder to crsSparkEncoder
  - Change (cols, rows) dimension destructuring to Dimensions(cols, rows)
  - Revisit use of Tile equality since it’s more strict
  - Update reference.conf to use geotrellis.raster.gdal namespace.
  - Replace all uses of TileDimensions with geotrellis.raster.Dimensions[Int].
- Upgraded to gdal-warp-bindings 1.0.0.
- Upgraded to Spark 2.4.5
- Formally abandoned support for Python 2. Python 2 is dead. Long live Python 2.
- Introduction of type hints in Python API.
- Add functions for changing cell values based on either conditions or to achieve a distribution of values. (#449)
- Add rf_local_min, rf_local_max, and rf_local_clip functions.
- Add cell value scaling functions rf_rescale and rf_standardize.
- Add rf_where function, similar in spirit to numpy’s where, or a cell-wise version of Spark SQL’s when and otherwise.
- Add rf_sqrt function to compute cell-wise square root.
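The cell-wise selection described for rf_where, and the clamping suggested by the name rf_local_clip, can be pictured with plain nested lists. This is only a sketch of the semantics, not the RasterFrames implementation: the helper names where_cells and clip_cells are hypothetical, the inclusive clamping is an assumption, and NoData handling is ignored.

```python
from typing import List, Sequence


def where_cells(condition: Sequence[Sequence[bool]],
                x: Sequence[Sequence[float]],
                y: Sequence[Sequence[float]]) -> List[List[float]]:
    """Cell-wise select: take x where condition holds, otherwise y."""
    return [
        [xv if cv else yv for cv, xv, yv in zip(crow, xrow, yrow)]
        for crow, xrow, yrow in zip(condition, x, y)
    ]


def clip_cells(tile: Sequence[Sequence[float]],
               low: float, high: float) -> List[List[float]]:
    """Clamp every cell into [low, high] (assumed rf_local_clip behavior)."""
    return [[min(max(v, low), high) for v in row] for row in tile]


# A 2x2 "tile" held as nested lists; a real Tile also carries a cell type and NoData.
tile = [[1.0, 8.0], [5.0, 12.0]]
too_big = [[v > 6.0 for v in row] for row in tile]
ceiling = [[6.0, 6.0], [6.0, 6.0]]

capped = where_cells(too_big, ceiling, tile)   # cells above 6.0 replaced by 6.0
clipped = clip_cells(tile, 2.0, 10.0)          # cells limited to the range 2.0..10.0
print(capped)
print(clipped)
```

In the library these operations would run over tile columns in a DataFrame; the lists here only stand in for a single tile's cells.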
0.8.x
0.8.5
- Added rf_z2_index for constructing a Z2 index on types with bounds.
- Breaking: rf_spatial_index renamed rf_xz2_index to differentiate between XZ2 and Z2 variants.
- Added withSpatialIndex to RasterSourceDataSource to pre-partition tiles based on tile extents mapped to a Z2 space-filling curve
- Add rf_mask_by_bit, rf_mask_by_bits and rf_local_extract_bits to deal with bit packed quality masks. Updated the masking documentation to demonstrate the use of these functions.
- Added toDF extension method to MultibandGeoTiff
- Added rf_agg_extent and rf_agg_reprojected_extent to compute the aggregate extent of a column
- Added rf_proj_raster for constructing a proj_raster structure from individual CRS, Extent, and Tile columns.
- Added rf_render_color_ramp_png to compute PNG byte array for a single tile column, with specified color ramp.
- In rf_ipython, improved rendering of dataframe binary contents with PNG preamble.
- Throw an IllegalArgumentException when attempting to apply a mask to a Tile whose CellType has no NoData defined. (#409)
- Add rf_agg_approx_quantiles function to compute cell quantiles across an entire column.
0.8.4
- Upgraded to Spark 2.4.4
- Add rf_mask_by_values and rf_local_is_in raster functions; added optional inverse argument to rf_mask functions. (#403, #384)
- Added forced truncation of WKT types in Markdown/HTML rendering. (#408)
- Add rf_local_is_in raster function. (#400)
- Added partitioning to catalogs before processing in RasterSourceDataSource (#397)
- Fixed bug where rf_tile_dimensions would cause unnecessary reading of tiles. (#394)
- Breaking (potentially): removed GeoTiffCollectionRelation due to usage limitation and overlap with RasterSourceDataSource functionality.
0.8.3
- Updated to GeoTrellis 2.3.3 and Proj4j 1.1.0.
- Fixed issues with LazyLogger and shading assemblies (#293)
- Updated rf_crs to accept string columns containing CRS specifications. (#366)
- Added rf_spatial_index function. (#368)
- Breaking (potentially): removed pyrasterframes.create_spark_session in favor of pyrasterframes.utils.create_rf_spark_session
0.8.2
- Added ability to pass config options to convenience PySpark session constructor. (#361)
- Bumped Spark dependency to version 2.3.4. (#350)
- Fixed handling of aggregate extent and image size on GeoTIFF writing. (#362)
- Fixed issue with RasterSourceDataSource swallowing exceptions. (#267)
- Fixed SparkML memory pressure issue caused by unnecessary reevaluation, overallocation, and primitive boxing. (#343)
- Fixed Parquet serialization issue with RasterRefs (#338)
- Fixed TileExploder, rf_agg_local_mean and TileColumnSupport to support proj_raster struct (#287, #163, #333).
- Various documentation improvements.
- Breaking (potentially): Synchronized parameter naming in Python and Scala for spark.read.raster (#329).
0.8.1
- Added rf_local_no_data, rf_local_data and rf_interpret_cell_type_as raster functions.
- Added: rf_rgb_composite and rf_render_png.
- Added toMarkdown and toHTML extension methods for DataFrame, and registered them with the IPython formatter system when rf_ipython is imported.
- New documentation theme (thanks @jonas!).
- Fixed: Removed false return type guarantee in cases where an Expression accepts either Tile or ProjectedRasterTile (#295)
0.8.0
- Super-duper new Python-centric RasterFrames Users’ Manual!
- Upgraded to the following core dependencies: Spark 2.3.3, GeoTrellis 2.3.0, GeoMesa 2.2.1, JTS 1.16.0.
- Build pyrasterframes binary distribution for pip installation.
- Added support for rendering RasterFrame types in IPython/Jupyter.
- Added new tile functions rf_round, rf_abs, rf_log, rf_log10, rf_log2, rf_log1p, rf_exp, rf_exp10, rf_exp2, rf_expm1, rf_resample.
- Support Python-side Tile User-Defined Type backed by numpy ndarray or ma.MaskedArray.
- Support Python-side Shapely geometry User-Defined Type.
- SQL API support for rf_assemble_tile and rf_array_to_tile.
- Introduced at the source level the concept of a RasterSource and RasterRef, enabling lazy/delayed read of sub-scene tiles.
- Added withKryoSerialization extension methods on SparkSession.Builder and SparkConf.
- Added rf_render_matrix debugging function.
- Added RasterFrameLayer.withExtent extension method.
- Added SinglebandGeoTiff.toDF extension method.
- Added DataFrame.rasterJoin extension method for merging two dataframes with tiles in disparate CRSs.
- Added rf_crs for ProjectedRasterTile columns.
- Added st_extent (for Geometry types) and rf_extent (for ProjectedRasterTile and RasterSource columns).
- Added st_geometry (for Extent types) and rf_geometry (for ProjectedRasterTile and RasterSource columns).
- Reworked build scripts for RasterFrames Jupyter Notebook.
- Breaking: The type RasterFrame renamed to RasterFrameLayer to reflect its intended purpose.
- Breaking: All asRF methods renamed to asLayer.
- Breaking: Root package changed from astraea.spark.rasterframes to org.locationtech.rasterframes.
- Breaking: Removed envelope, in favor of st_extent, rf_extent or st_envelope
- Breaking: Renamed rf_extent_geometry to st_geometry
- Breaking: Renamed rf_tile_dimensions to rf_dimensions
- Breaking: Renamed rf_reproject_geometry to st_reproject
- Breaking: With the upgrade to JTS 1.16.0, all imports of com.vividsolutions.jts need to be changed to org.locationtech.jts.
- Deprecation: Tile column functions (in RasterFunctions) and SQL registered names have all been renamed to follow snake_case conventions, with an rf_ prefix, matching SQL and Python. A temporary compatibility shim is included so that code built against 0.7.1 and earlier still works. These will be marked as deprecated.
- Breaking: In Scala and SQL, ..._scalar functions (e.g. local_add_scalar) have been removed. Non-scalar forms now dynamically detect type of right hand side.
- Breaking: tileToArray has been replaced with _tile_to_array_double and _tile_to_array_int.
- Breaking: Renamed bounds_geometry to rf_extent_geometry.
- Breaking: renamed agg_histogram to rf_agg_approx_histogram, local_agg_stats to rf_agg_local_stats, local_agg_max to rf_agg_local_max, local_agg_min to rf_agg_local_min, local_agg_mean to rf_agg_local_mean, local_agg_data_cells to rf_agg_local_data_cells, local_agg_no_data_cells to rf_agg_local_no_data_cells.
- Breaking: CellHistogram no longer carries along approximate statistics, due to confusing behavior. Use rf_agg_stats instead.
- Introduced LocalCellStatistics class to wrap together results from LocalStatsAggregate.
- Breaking: TileDimensions moved from astraea.spark.rasterframes to org.locationtech.rasterframes.model.
- Breaking: Renamed RasterFrame.withBounds to RasterFrameLayer.withGeometry for consistency with DataSource schemas.
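The JTS import change listed above is mechanical, so it can be scripted. The following is a minimal sketch, assuming source files are read as text; the function name migrate_jts_imports is hypothetical, and it only handles the com.vividsolutions.jts to org.locationtech.jts rename, not the other breaking changes.

```python
import re

NEW_PACKAGE = "org.locationtech.jts"

# Match the old package only as a whole dotted prefix, so that an unrelated
# name such as "com.vividsolutions.jtsx" is left untouched.
_OLD_PACKAGE = re.compile(r"\bcom\.vividsolutions\.jts\b")


def migrate_jts_imports(source: str) -> str:
    """Rewrite every reference to the old JTS package in a source string."""
    return _OLD_PACKAGE.sub(NEW_PACKAGE, source)


before = (
    "import com.vividsolutions.jts.geom.{Geometry, Polygon}\n"
    "val unrelated = \"com.vividsolutions.jtsx\"\n"
)
after = migrate_jts_imports(before)
print(after)
```

Review the resulting diff before committing, since a textual rewrite also touches comments and string literals that mention the old package.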
Known issues
- #188: Error on deserialization of a Tile with a bool cell type to the Python side; see issue description for workaround.
0.7.x
0.7.1
- Fixed ColorRamp pipeline in MultibandRender
- Fixed Python wrapper for explodeTiles
0.7.0
- Now an incubating project under Eclipse Foundation LocationTech! GitHub repo moved to locationtech/rasterframes.
- PySpark support! See pyrasterframes/python/README.rst to get started.
- Exposed Spark JTS spatial operations in Python.
- Added RasterFrames-enabled Jupyter Notebook Docker Container package. See deployment/README.md for details.
- Updated to GeoMesa version 2.0.1.
- Added convertCellType, normalizedDifference, mask and inverseMask operations on tile columns.
- Added tile column + scalar operations: localAddScalar, localSubtractScalar, localMultiplyScalar, localDivideScalar
- Added rasterize and reprojectGeometry operations on geometry columns.
- Added support for writing GeoTIFFs from RasterFrames via DataFrameWriter.
- Added spark.read.geotrellis.withNumPartitions(Int) for setting the initial number of partitions to use when reading a layer.
- Added spark.read.geotrellis.withTileSubdivisions(Int) for evenly subdividing tiles before they become rows in a RasterFrame.
- Added experimental package for sandboxing new feature ideas.
- Added experimental GeoJSON DataSource with schema inference on feature properties.
- Added Scala, SQL, and Python tile-scalar arithmetic operations: localAddScalar, localSubtractScalar, localMultiplyScalar, localDivideScalar.
- Added Scala, SQL, and Python tile functions for logical comparisons, in both tile-tile and tile-scalar variants: localLess, localLessEqual, localGreater, localGreaterEqual, localEqual, and localUnequal.
- Added SlippyExport experimental feature for exporting the contents of a RasterFrame as a SlippyMap tile image directory structure and Leaflet/OpenMaps-enabled HTML file.
- Added experimental DataSource implementations for MODIS and Landsat 8 catalogs on AWS PDS.
- Change: Default interpolation for toRaster and toMultibandRaster has been changed from Bilinear to NearestNeighbor.
- Breaking: Renamed/moved astraea.spark.rasterframes.functions.CellStatsAggregateFunction.Statistics to astraea.spark.rasterframes.stats.CellStatistics.
- Breaking: HistogramAggregateFunction now generates the new type astraea.spark.rasterframes.stats.CellHistogram.
- Breaking: box2D renamed envelope.
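As a reminder of what a normalized difference computes, here is a sketch over plain nested lists using the usual (a - b) / (a + b) formula. The function below is illustrative only; returning None where a + b is zero is an assumption made for this example and not a statement about how normalizedDifference treats such cells.

```python
from typing import List, Optional, Sequence


def normalized_difference(a: Sequence[Sequence[float]],
                          b: Sequence[Sequence[float]]) -> List[List[Optional[float]]]:
    """Cell-wise (a - b) / (a + b); None where the denominator is zero."""
    result = []
    for a_row, b_row in zip(a, b):
        out_row = []
        for x, y in zip(a_row, b_row):
            total = x + y
            out_row.append(None if total == 0 else (x - y) / total)
        result.append(out_row)
    return result


# Made-up reflectance values for a near-infrared and a red band.
nir = [[0.6, 0.5], [0.0, 0.3]]
red = [[0.2, 0.5], [0.0, 0.1]]
ndvi = normalized_difference(nir, red)
print(ndvi)
```

With near-infrared and red bands as inputs this is the familiar NDVI; other band pairs give other indices with the same arithmetic.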
0.6.x
0.6.1
- Added support for reading striped GeoTiffs (#64).
- Moved extension methods associated with querying tagged columns to DataFrameMethods for supporting temporal and spatial columns on non-RasterFrame DataFrames.
- GeoTIFF and GeoTrellis DataSources automatically initialize RasterFrames.
- Added RasterFrame.toMultibandRaster.
- Added utility for rendering multiband tile as RGB composite PNG.
- Added RasterFrame.withRFColumnRenamed to lessen boilerplate in maintaining RasterFrame type tag.
0.6.0
- Upgraded to Spark 2.2.1. Added VersionShims to allow for Spark 2.1.x backwards compatibility.
- Introduced separate rasterframes-datasource library for hosting sources from which to read RasterFrames.
- Implemented basic (but sufficient) temporal and spatial filter predicate push-down feature for the GeoTrellis layer datasource.
- Added Catalyst expressions specifically for spatial relations, allowing for some polymorphism over JTS types.
- Added a GeoTrellis Catalog DataSource for inspecting available layers and associated metadata at a URI
- Added GeoTrellis Layer DataSource for reading GeoTrellis layers from any SPI-registered GeoTrellis backend (which includes HDFS, S3, Accumulo, HBase, Cassandra, etc.).
- Ability to save a RasterFrame as a GeoTrellis layer to any SPI-registered GeoTrellis backend. Multi-column RasterFrames are written as Multiband tiles.
- Added a GeoTiff DataSource for directly loading a (preferably Cloud Optimized) GeoTiff as a RasterFrame, each row containing tiles as they are internally organized.
- Fleshed out MultibandTile and TileFeature support in datasource.
- Added typeclass for specifying merge operations on TileFeature data payload.
- Added withTemporalComponent convenience method for appending a temporal key column with constant value.
- Breaking: Renamed withExtent to withBounds, and now returns a JTS Polygon.
- Added EnvelopeEncoder for encoding JTS Envelope type.
- Refactored build into separate core and docs, paving way for pyrasterframes polyglot module.
- Added utility extension method withPrefixedColumnNames to DataFrame.
Known Issues
- Writing multi-column RasterFrames to GeoTrellis layers requires all tiles to be of the same cell type.
0.5.x
0.5.12
- Added withSpatialIndex to introduce a column assigning a z-curve index value based on the tile’s centroid in EPSG:4326.
- Added column-appending convenience methods: withExtent, withCenter, withCenterLatLng
- Documented example of creating a GeoTrellis layer from a RasterFrame.
- Added Spark 2.2.0 forward-compatibility.
- Upgraded to GeoTrellis 1.2.0-RC2.
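To illustrate what a z-curve index on a centroid means, the sketch below quantizes a longitude/latitude pair and interleaves the bits (a Morton code). It is a generic illustration, not the code behind withSpatialIndex: the function names, the 16-bit resolution, and the bit order are all assumptions, and the library's index values will differ.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y; x fills even positions, y odd ones."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def z_curve_index(lon: float, lat: float, bits: int = 16) -> int:
    """Map an EPSG:4326 point (degrees) to a position on a Z-order curve."""
    max_cell = (1 << bits) - 1
    # Normalize each coordinate to [0, 1], then quantize onto an integer grid.
    xi = int((lon + 180.0) / 360.0 * max_cell)
    yi = int((lat + 90.0) / 180.0 * max_cell)
    # Clamp so slightly out-of-range inputs still land on the grid.
    xi = min(max(xi, 0), max_cell)
    yi = min(max(yi, 0), max_cell)
    return interleave_bits(xi, yi, bits)


# Two nearby centroids compared with a distant one.
a = z_curve_index(-77.03, 38.90)
b = z_curve_index(-77.02, 38.91)
c = z_curve_index(139.69, 35.68)
print(a, b, c)
```

Points that are close on the map tend to share high-order bits, which is what makes such an index useful for partitioning and sorting tiles.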
0.5.11
- Significant performance improvement in explodeTiles (1-2 orders of magnitude). See #38
- Fixed bugs in NoData handling when converting to Double tiles.
0.5.10
0.5.9
- Ported to sbt 1.0.3
- Added sbt-generated astraea.spark.rasterframes.RFBuildInfo
- Fixed bug in computing aggMean when one or more tiles are null
- Deprecated rfInit in favor of SparkSession.withRasterFrames or SQLContext.withRasterFrames extension methods
0.5.8
- Upgraded to GeoTrellis 1.2.0-RC1
- Added REPLesent-based tour of RasterFrames
- Moved Giter8 template to separate repository s22s/raster-frames.g8 due to sbt limitations
- Updated Getting Started to reference new Giter8 repo
- Changed SQL function names rf_stats and rf_histogram to rf_aggStats and rf_aggHistogram for consistency with DataFrames API
0.5.7
- Created faster implementation of aggregate statistics.
- Fixed bug in deserialization of TileUDTs originating from ConstantTiles
- Fixed bug in serialization of NoDataFilter within SparkML pipeline
- Refactoring of UDF organization
- Various documentation tweaks and updates
- Added Giter8 template
0.5.6
- TileUDFs are encoded directly into Catalyst, without Kryo, resulting in an insane decrease in serialization time for small tiles (int8, <= 128²), and pretty awesome speedup for all other cell types except float32 (marginal slowing). While not measured, memory footprint is expected to have gone down.
0.5.5
- aggStats and tileMean functions rewritten to compute simple statistics directly rather than using StreamingHistogram
- tileHistogramDouble and tileStatsDouble were replaced by tileHistogram and tileStats
- Added tileSum, tileMin and tileMax functions
- Added aggMean, aggDataCells and aggNoDataCells aggregate functions.
- Added localAggDataCells and localAggNoDataCells cell-local (tile generating) functions
- Added tileToArray and arrayToTile
- Overflow fix in LocalStatsAggregateFunction
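Computing simple statistics directly, as opposed to deriving them from a histogram, amounts to a single pass that keeps a few running totals. The sketch below shows that general idea over nested lists. It is not the aggStats implementation: the class and function names are hypothetical, and treating None and NaN as NoData is an assumption made for the example.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass
class SimpleStats:
    data_cells: int
    no_data_cells: int
    minimum: float
    maximum: float
    mean: float
    variance: float


def agg_stats(tiles: Iterable[Sequence[Sequence[Optional[float]]]]) -> SimpleStats:
    """One pass over all cells, keeping running count, sum, sum of squares, min and max."""
    data_cells = no_data_cells = 0
    total = total_sq = 0.0
    minimum, maximum = math.inf, -math.inf
    for tile in tiles:
        for row in tile:
            for v in row:
                if v is None or math.isnan(v):
                    no_data_cells += 1
                    continue
                data_cells += 1
                total += v
                total_sq += v * v
                minimum = min(minimum, v)
                maximum = max(maximum, v)
    if data_cells == 0:
        nan = math.nan
        return SimpleStats(0, no_data_cells, nan, nan, nan, nan)
    mean = total / data_cells
    # Population variance from running sums; simple, but can lose precision
    # when values are large relative to their spread.
    variance = total_sq / data_cells - mean * mean
    return SimpleStats(data_cells, no_data_cells, minimum, maximum, mean, variance)


tiles = [
    [[1.0, 2.0], [None, 3.0]],
    [[4.0, math.nan]],
]
stats = agg_stats(tiles)
print(stats)
```

Because every quantity is a running total, partial results from different partitions can be merged by adding counts and sums and taking the min and max.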