arctern-0.1.0

@neza2017 neza2017 released this 26 Apr 04:46
· 307 commits to master since this release
ce66e70

Arctern 0.1.0 Release Notes

This release mainly focuses on developing the geospatial library and integrating the analytic engine. So far, Arctern has introduced 38 geospatial analytic functions and five geospatial rendering functions. It provides Python bindings and PySpark SQL UDF integration for these functions. A set of RESTful APIs is also provided for accessing Arctern's backend.

Arctern 0.1.0 offers GPU acceleration for time-consuming function calls, involving six geospatial analytic functions and five geospatial rendering functions.

The CPU-based and GPU-based implementations share the same interface despite their underlying implementation differences.

These release notes cover the important aspects of Arctern 0.1.0, including APIs, experimental features, and limitations.

Pandas & Spark APIs

In this release, most geospatial analytic functions are CPU-based and built on GDAL with a number of improvements, while six functions (ST_Point, ST_Area, ST_Envelope, ST_Length, ST_Distance, ST_Within) adopt GPU-accelerated implementations to improve computational performance. Compared with its counterparts, Arctern 0.1.0 shows promising results in computing power and speed.

In upcoming releases, we plan to add more functions to expand Arctern's analytic capabilities and to optimize its performance.

Constructor Functions

  • ST_Point: Builds a Point based on the given horizontal and vertical coordinates.
  • ST_PolygonFromEnvelope: Constructs a rectangular Geometry based on the given parameters.
  • ST_GeomFromGeoJSON: Constructs a Geometry from the given GeoJSON string.
  • ST_PointFromText: Converts the given Point from WKT format to WKB format. (Spark only)
  • ST_PolygonFromText: Converts the given Polygon from WKT format to WKB format. (Spark only)
  • ST_LineStringFromText: Converts the given LineString from WKT format to WKB format. (Spark only)
  • ST_GeomFromText: Converts the given Geometry from WKT format to WKB format.
  • ST_GeomFromWKT: Converts the given Geometry from WKT format to WKB format. (Spark only)
  • ST_AsText: Converts the given Geometry from WKB format to WKT format.
  • ST_AsGeoJSON: Converts the given Geometry from WKB format to GeoJSON format.
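
Several of the constructor functions above convert between WKT (text) and WKB (binary). To make the two formats concrete, here is a minimal pure-Python sketch for the simplest case, a 2D Point. It is not Arctern's implementation and does not call Arctern; the helper names point_wkt_to_wkb and point_wkb_to_wkt are hypothetical, and the sketch assumes little-endian WKB without an SRID.

```python
import struct


def point_wkt_to_wkb(wkt: str) -> bytes:
    """Encode a 2D 'POINT (x y)' WKT string as little-endian WKB."""
    text = wkt.strip()
    if not text.upper().startswith("POINT"):
        raise ValueError("only POINT geometries are handled in this sketch")
    # Take the text between the parentheses and split it into x and y.
    inner = text[text.index("(") + 1 : text.rindex(")")]
    x, y = (float(v) for v in inner.split())
    # WKB layout: 1-byte order flag (1 = little-endian),
    # uint32 geometry type (1 = Point), then two float64 coordinates.
    return struct.pack("<BIdd", 1, 1, x, y)


def point_wkb_to_wkt(wkb: bytes) -> str:
    """Decode the little-endian Point WKB produced above back to WKT."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", wkb)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("only little-endian Point WKB is handled in this sketch")
    return f"POINT ({x:g} {y:g})"


wkb = point_wkt_to_wkb("POINT (1 2)")
print(wkb.hex())              # 21 bytes: flag, type, x, y
print(point_wkb_to_wkt(wkb))  # round-trips back to WKT
```

The binary form is fixed-width and cheap to parse, which is one plausible reason a library would pass WKB between functions and reserve WKT for input and display.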

Accessor Functions

  • ST_IsValid: Checks if the given Geometry is valid.
  • ST_IsSimple: Checks if the given Geometry is simple, which means it has no anomalous points, such as self-intersections or self-tangencies.
  • ST_GeometryType: Returns a string representing the type of each Geometry in the input.
  • ST_NPoints: Counts the number of vertices/end points in a given Geometry.
  • ST_Envelope: Calculates the smallest rectangle containing the given Geometry.

Processing Functions

  • ST_Buffer: Returns a Geometry that covers all points whose distance from the given Geometry is not greater than the given distance.
  • ST_PrecisionReduce: Reduces the coordinate precision of the given Geometry based on the given number of significant digits.
  • ST_Intersection: Calculates the intersection of the two given Geometries.
  • ST_MakeValid: Creates a valid representation of the given Geometry without removing any vertices.
  • ST_SimplifyPreserveTopology: Uses polylines to approximate curves in the given Geometry through the Douglas-Peucker algorithm.
  • ST_Centroid: Calculates the centroid of the given Geometry.
  • ST_ConvexHull: Calculates the smallest convex Geometry that contains the given Geometry.
  • ST_Transform: Maps the coordinates of the given Geometry from the "src_rs" spatial reference system (SRID) to the "dst_rs" spatial reference system.
  • ST_CurveToLine: Converts curves in the given Geometry to approximate linear representations, such as converting CIRCULAR STRING to LINESTRING, CURVEPOLYGON to POLYGON, and MULTISURFACE to MULTIPOLYGON.
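
The Douglas-Peucker algorithm mentioned for ST_SimplifyPreserveTopology can be sketched in a few lines. The code below is the plain textbook algorithm on a list of (x, y) tuples, written in pure Python for illustration only; it does not include the extra topology-preserving checks that the Arctern function's name implies, and the function names are hypothetical.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify a polyline, dropping vertices within `tolerance` of the chord."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the chord joining the end points.
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest vertex and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


line = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
print(douglas_peucker(line, 0.1))
```

With a tolerance of 0.1, the nearly collinear vertex (1, 0.05) is dropped while the spike at (3, 3) is kept.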

Measurement Functions

  • ST_DistanceSphere: Calculates the spherical distance between two given Points on the earth's surface based on their latitude and longitude coordinates.
  • ST_Distance: Calculates the distance between the two given Geometries.
  • ST_Area: Calculates the area of the given Geometry.
  • ST_Length: Calculates the length of the given linear Geometry.
  • ST_HausdorffDistance: Returns the Hausdorff distance between the two given Geometries. The Hausdorff distance is used to measure the similarity between two Geometries.
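
As an illustration of what a spherical distance such as ST_DistanceSphere computes, the following sketch uses the haversine formula. It is a pure-Python approximation under stated assumptions, not Arctern's implementation: the Earth is treated as a perfect sphere, the radius constant is an assumed mean radius, and the function name distance_sphere is hypothetical.

```python
import math

# Assumed mean Earth radius in meters; Arctern may use a different constant.
EARTH_RADIUS_M = 6_371_008.8


def distance_sphere(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


# A quarter of the equator: from (0, 0) to (90, 0).
print(distance_sphere(0, 0, 90, 0))
```

By contrast, ST_Distance as described above works on the Geometries' own coordinates, so the two functions are not interchangeable for longitude/latitude data.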

Relationship Functions

  • ST_Equals: Checks if the two given Geometries are equivalent, which means they represent the same Geometry.
  • ST_Touches: Checks if the two given Geometries touch, which means they have common points on their boundaries but their interiors do not intersect.
  • ST_Overlaps: Checks if the two given Geometries overlap each other, which means they intersect and neither of them completely contains the other.
  • ST_Crosses: Checks if the two given Geometries cross each other, which means they share some but not all of their interior points. The intersection of the two Geometries cannot be empty, and its dimension is smaller than the largest dimension of the input Geometries.
  • ST_Contains: Checks if the Geometry geo1 contains the Geometry geo2, which means geo2 has no point outside geo1 and at least one point inside geo1.
  • ST_Intersects: Checks if the two given Geometries intersect, which means they share at least one point.
  • ST_Within: Checks if the Geometry geo1 is inside the Geometry geo2, which means geo1 has no point outside geo2 and at least one point inside geo2.
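
The differences between these predicates are easiest to see in a tiny case. The sketch below tests a single point against an axis-aligned rectangle given as (min_x, min_y, max_x, max_y), and shows how "intersects", "within", and "touches" diverge on the boundary. It is a pure-Python illustration with hypothetical function names, assumes a rectangle with non-zero width and height, and says nothing about how Arctern evaluates general Geometries.

```python
def point_intersects_rect(point, rect):
    """True if the point shares any location with the rectangle (boundary included)."""
    x, y = point
    min_x, min_y, max_x, max_y = rect
    return min_x <= x <= max_x and min_y <= y <= max_y


def point_within_rect(point, rect):
    """True if the point lies in the rectangle's interior (boundary excluded)."""
    x, y = point
    min_x, min_y, max_x, max_y = rect
    return min_x < x < max_x and min_y < y < max_y


def point_touches_rect(point, rect):
    """True if the point lies exactly on the rectangle's boundary."""
    return point_intersects_rect(point, rect) and not point_within_rect(point, rect)


rect = (0, 0, 10, 10)
for p in [(5, 5), (0, 5), (11, 5)]:
    print(p, point_intersects_rect(p, rect),
          point_within_rect(p, rect), point_touches_rect(p, rect))
```

A point on the edge intersects and touches the rectangle but is not within it, because it has no location in the rectangle's interior.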

Aggregation Functions

  • ST_Union_Aggr: Returns a Geometry representing the union of the given set of Geometries.
  • ST_Envelope_Aggr: Calculates the minimum rectangle that contains the given set of Geometries.
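
An envelope aggregation reduces many Geometries to one bounding rectangle. The sketch below shows the idea in pure Python, with each Geometry represented simply as a list of (x, y) vertices; this representation and the function name envelope_aggr are assumptions for illustration, and the code is not Arctern's implementation.

```python
def envelope_aggr(geometries):
    """Return (min_x, min_y, max_x, max_y) covering every vertex of every geometry."""
    xs = [x for geom in geometries for x, _ in geom]
    ys = [y for geom in geometries for _, y in geom]
    if not xs:
        raise ValueError("at least one vertex is required")
    return min(xs), min(ys), max(xs), max(ys)


def envelope_to_wkt(envelope):
    """Format the envelope as a closed rectangular WKT Polygon."""
    min_x, min_y, max_x, max_y = envelope
    ring = [(min_x, min_y), (min_x, max_y), (max_x, max_y),
            (max_x, min_y), (min_x, min_y)]
    return "POLYGON ((" + ", ".join(f"{x:g} {y:g}" for x, y in ring) + "))"


geoms = [
    [(0, 0), (2, 1)],          # a line
    [(5, -1)],                 # a point
    [(1, 3), (4, 3), (4, 6)],  # vertices of a polygon
]
print(envelope_to_wkt(envelope_aggr(geoms)))
```

Because the minimum and maximum are associative, this kind of aggregation can be computed per partition and then merged, which is the usual pattern for distributed aggregates.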

Rendering Functions

This release supports the following five rendering functions, each of which has both a CPU-based and a GPU-based implementation:

  • pointmap: Draws a point map for WKB-formatted Points.
  • weighted_pointmap: Draws a weighted point map for WKB-formatted Points.
  • heatmap: Draws a heat map for WKB-formatted Points.
  • choroplethmap: Draws a choropleth map for WKB-formatted Points that form the contours of Polygons.
  • icon_viz: Draws an icon map for WKB-formatted Points.

Each function returns an image in PNG format. You can overlap these images to create stacked multi-layer effects.

PySpark SQL Integration

This release provides integration with PySpark SQL. All 38 geospatial analytic functions mentioned above can be called as SQL UDFs (including nested UDF calls). For more details, see Arctern 0.1.0 PySpark APIs.

Limitations

Due to the limitations of PySpark's UDF framework, the ST_Union_Aggr and ST_Envelope_Aggr functions may perform poorly on large data sets. This will be addressed by the DataFrame/Series interface planned for 0.2.0.

RESTful APIs

This release supports only PySpark as the backend for Arctern's RESTful APIs. The currently supported RESTful APIs are listed below. For more details, see Arctern 0.1.0 RESTful APIs.

  • POST /scope
  • DELETE /scope/
  • POST /loadfile
  • POST /savefile
  • GET /table/schema?scope=scope1&session=spark&table=table1
  • POST /query
  • POST /pointmap
  • POST /weighted_pointmap
  • POST /heatmap
  • POST /choroplethmap
  • POST /command
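
As a rough illustration of calling one of these endpoints, the sketch below builds the GET /table/schema request shown in the list using only the Python standard library. The base URL (host and port) is a hypothetical placeholder, the helper name is made up for this example, and the request is only constructed, not sent; request bodies for the POST endpoints are not shown because they are not described here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical address of a running Arctern RESTful server; adjust as needed.
BASE_URL = "http://localhost:8080"


def table_schema_request(scope, session, table, base_url=BASE_URL):
    """Build (but do not send) a GET /table/schema request."""
    query = urlencode({"scope": scope, "session": session, "table": table})
    return Request(f"{base_url}/table/schema?{query}", method="GET")


req = table_schema_request("scope1", "spark", "table1")
print(req.get_method(), req.full_url)
# To actually send it against a live server:
#   from urllib.request import urlopen
#   with urlopen(req) as resp:
#       print(resp.read().decode())
```

Using urlencode rather than string concatenation keeps scope, session, and table names safely escaped if they contain special characters.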