Parallel GeoPandas with Dask
Status
EXPERIMENTAL: This project is in an early state.
If you would like to see this project in a more stable state, then you might consider pitching in with developer time (contributions are very welcome!) or with financial support from you or your company.
This is a new project that builds off the exploration done in https://github.com/mrocklin/dask-geopandas
Documentation
See the documentation on https://dask-geopandas.readthedocs.io/en/latest/
Installation
This package depends on GeoPandas, Dask and PyGEOS.
One way to install all required dependencies is to use the conda package manager to create a new environment:
conda create -n geo_env
conda activate geo_env
conda config --env --add channels conda-forge
conda config --env --set channel_priority strict
conda install dask-geopandas
Example
Given a GeoPandas dataframe:
import geopandas
df = geopandas.read_file('...')
We can repartition it into a Dask-GeoPandas dataframe:
import dask_geopandas
ddf = dask_geopandas.from_geopandas(df, npartitions=4)
The familiar spatial attributes and methods of GeoPandas are also available and will be computed in parallel:
ddf.geometry.area.compute()
ddf.within(polygon)
Hashes for dask_geopandas-0.1.0a7-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b33aa112b927bf0720e7a46ef8bcef2d8409f71f9e3d3616c8f186643cd6eb38
MD5 | d73262a6f865de6696903de3aa046839
BLAKE2b-256 | 68cb24bee8624a56aabbb29101ef5849da6fcfe107fd68beac6f4726d2506ecd