| Title: | Bindings for Apache SedonaDB |
|---|---|
| Description: | Provides bindings for Apache SedonaDB, a lightweight query engine optimized for spatial workflows. |
| Authors: | Dewey Dunnington [aut, cre] |
| Maintainer: | Dewey Dunnington <[email protected]> |
| License: | Apache License (>= 2) |
| Version: | 0.2.0 |
| Built: | 2025-12-06 04:00:12 UTC |
| Source: | https://github.com/apache/sedona-db |
Convert an object to a DataFrame
as_sedonadb_dataframe(x, ..., schema = NULL)

| Argument | Description |
|---|---|
| x | An object to convert |
| ... | Extra arguments passed to/from methods |
| schema | The requested schema |

A sedonadb_dataframe
as_sedonadb_dataframe(data.frame(x = 1:3))
Use sd_compute() to collect and return the result as a DataFrame;
use sd_collect() to collect and return the result as an R data.frame.
sd_compute(.data)
sd_collect(.data, ptype = NULL)

| Argument | Description |
|---|---|
| .data | A sedonadb_dataframe |
| ptype | The target R object. See nanoarrow::convert_array_stream. |

sd_compute() returns a sedonadb_dataframe; sd_collect() returns
a data.frame (or subclass according to ptype).
sd_sql("SELECT 1 as one") |> sd_compute()
sd_sql("SELECT 1 as one") |> sd_collect()
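The difference between the two return types can be checked directly. The following is a minimal sketch, assuming the sedonadb package is installed and that the S3 class name matches the documented return value, sedonadb_dataframe (the class string is an assumption based on the text above).

```r
library(sedonadb)

# Build a lazy query; nothing is executed yet
query <- sd_sql("SELECT 1 as one")

# sd_compute() executes the query and keeps the result as a SedonaDB DataFrame
# (class name assumed to be "sedonadb_dataframe")
computed <- sd_compute(query)
inherits(computed, "sedonadb_dataframe")

# sd_collect() executes the query and converts the result to an R data.frame
collected <- sd_collect(query)
is.data.frame(collected)
```

sd_compute() is likely the better fit when the result feeds further SedonaDB operations, while sd_collect() suits handing the result to other R code.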
Performs a runtime configuration of PROJ, which can be used in place of a build-time linked version of PROJ or to add in support if PROJ was not linked at build time.
sd_configure_proj(
  preset = NULL,
  shared_library = NULL,
  database_path = NULL,
  search_path = NULL
)

| Argument | Description |
|---|---|
| preset | One of: |
| shared_library | An absolute or relative path to a shared library valid for the platform. |
| database_path | A path to proj.db |
| search_path | A path to the data files required by PROJ for some transforms. |

NULL, invisibly
sd_configure_proj("auto")
Count rows in a DataFrame
sd_count(.data)

| Argument | Description |
|---|---|
| .data | A sedonadb_dataframe |

The number of rows after executing the query
sd_sql("SELECT 1 as one") |> sd_count()
Remove a view created with sd_to_view() from the context.
sd_drop_view(table_ref)
sd_view(table_ref)

| Argument | Description |
|---|---|
| table_ref | The name of the view reference |

The context, invisibly
sd_sql("SELECT 1 as one") |> sd_to_view("foofy")
sd_view("foofy")
sd_drop_view("foofy")
try(sd_view("foofy"))
This is used to implement print() for the sedonadb_dataframe or can
be used to explicitly preview if options(sedonadb.interactive = FALSE).
sd_preview(.data, n = NULL, ascii = NULL, width = NULL)

| Argument | Description |
|---|---|
| .data | A sedonadb_dataframe |
| n | The number of rows to preview. Use |
| ascii | Use |
| width | The character width of the output. Defaults to |

.data, invisibly
sd_sql("SELECT 1 as one") |> sd_preview()
The query will only be executed when requested.
sd_read_parquet(path)

| Argument | Description |
|---|---|
| path | One or more paths or URIs to Parquet files |

A sedonadb_dataframe
path <- system.file("files/natural-earth_cities_geo.parquet", package = "sedonadb")
sd_read_parquet(path) |> head(5) |> sd_preview()
Several types of user-defined functions can be registered into a session
context. Currently, the only implemented variety is an external pointer
to a Rust FFI_ScalarUDF, an example of which is available from the
DataFusion Python documentation.
sd_register_udf(udf)

| Argument | Description |
|---|---|
| udf | An object of class 'datafusion_scalar_udf' |

NULL, invisibly
The query will only be executed when requested.
sd_sql(sql)

| Argument | Description |
|---|---|
| sql | A SQL string to execute |

A sedonadb_dataframe
sd_sql("SELECT ST_Point(0, 1) as geom") |> sd_preview()
This is useful for creating a view that can be referenced in a SQL
statement. Use sd_drop_view() to remove it.
sd_to_view(.data, table_ref, overwrite = FALSE)

| Argument | Description |
|---|---|
| .data | A sedonadb_dataframe |
| table_ref | The name of the view reference |
| overwrite | Use TRUE to overwrite a view with the same name (if it exists) |

.data, invisibly
sd_sql("SELECT 1 as one") |> sd_to_view("foofy")
sd_sql("SELECT * FROM foofy")
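The overwrite argument described above can be sketched as follows. This is an illustrative example that uses only functions documented on this page and assumes the sedonadb package is installed. Passing overwrite = TRUE on both calls keeps the snippet repeatable even if a view named "foofy" is already registered.

```r
library(sedonadb)

# Register a view, then replace it under the same name
sd_sql("SELECT 1 as one") |> sd_to_view("foofy", overwrite = TRUE)
sd_sql("SELECT 2 as one") |> sd_to_view("foofy", overwrite = TRUE)

# Query the view by name; it should now resolve to the second query
result <- sd_sql("SELECT one FROM foofy") |> sd_collect()

# Remove the view from the context once it is no longer needed
sd_drop_view("foofy")
```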
Write this DataFrame to one or more (Geo)Parquet files. For input that contains geometry columns, GeoParquet metadata is written such that suitable readers can recreate Geometry/Geography types when reading the output and potentially read fewer row groups when only a subset of the file is needed for a given query.
sd_write_parquet(
  .data,
  path,
  partition_by = character(0),
  sort_by = character(0),
  single_file_output = NULL,
  geoparquet_version = "1.0",
  overwrite_bbox_columns = FALSE
)

| Argument | Description |
|---|---|
| .data | A sedonadb_dataframe |
| path | A filename or directory to which Parquet file(s) should be written |
| partition_by | A character vector of column names to partition by. If non-empty, applies hive-style partitioning to the output. |
| sort_by | A character vector of column names to sort by. Currently only ascending sort is supported. |
| single_file_output | Use TRUE or FALSE to force writing a single Parquet file vs. writing one file per partition to a directory. By default, a single file is written if |
| geoparquet_version | GeoParquet metadata version to write if the output contains one or more geometry columns. The default ("1.0") is the most widely supported and will result in geometry columns being recognized in many readers; however, it includes statistics only at the file level. Use "1.1" to compute an additional bounding box column for every geometry column in the output: some readers can use these columns to prune row groups when files contain an effective spatial ordering. The extra columns will appear just before their geometry column and will be named "geom_col_name_bbox" for all geometry columns except "geometry", whose bounding box column name is just "bbox". |
| overwrite_bbox_columns | Use TRUE to overwrite any bounding box columns that already exist in the input. This is useful in a read -> modify -> write scenario to ensure these columns are up to date. If FALSE (the default), an error will be raised if a bbox column already exists. |

The input, invisibly
tmp_parquet <- tempfile(fileext = ".parquet")
sd_sql("SELECT ST_SetSRID(ST_Point(1, 2), 4326) as geom") |>
  sd_write_parquet(tmp_parquet)
sd_read_parquet(tmp_parquet)
unlink(tmp_parquet)
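The geoparquet_version argument can be exercised in the same round trip. The sketch below is illustrative only and assumes the sedonadb package is installed. It requests GeoParquet "1.1" metadata and then counts rows with sd_count() rather than collecting the geometry into R.

```r
library(sedonadb)

tmp_parquet <- tempfile(fileext = ".parquet")

# Request GeoParquet 1.1 metadata; per the argument description, this computes
# a bounding box column for each geometry column in the output
sd_sql("SELECT ST_SetSRID(ST_Point(1, 2), 4326) as geom") |>
  sd_write_parquet(tmp_parquet, geoparquet_version = "1.1")

# Read the file back lazily and count its rows
n_rows <- sd_read_parquet(tmp_parquet) |> sd_count()

# Remove the temporary file
unlink(tmp_parquet)
```

If the written file is later modified and rewritten with "1.1" again, overwrite_bbox_columns = TRUE is the documented way to refresh the existing bounding box columns instead of raising an error.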
SedonaDB ADBC Driver
sedonadb_adbc()
An adbcdrivermanager::adbc_driver() of class
'sedonadb_driver_sedonadb'
library(adbcdrivermanager)
con <- sedonadb_adbc() |>
  adbc_database_init() |>
  adbc_connection_init()
con |>
  read_adbc("SELECT ST_Point(0, 1) as geometry") |>
  as.data.frame()