Dask Integration#

The pyinterp.dask module provides helpers that compute the Statistics & Binning containers in parallel on Dask arrays. Each helper processes blocks independently using the same C++ kernels exposed by the in-memory API, then merges the per-block results with the += operator implemented by the underlying holders.
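The block-then-merge pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions: RunningStats and blockwise_stats are hypothetical names, not part of pyinterp, and the pairwise merge formula for mean and variance is a standard technique swapped in here, not necessarily what the C++ kernels do.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical mergeable holder (NOT a pyinterp class)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    @classmethod
    def from_block(cls, block):
        """Summarize one block without looking at any other block."""
        n = len(block)
        if n == 0:
            return cls()
        mean = sum(block) / n
        m2 = sum((v - mean) ** 2 for v in block)
        return cls(n, mean, m2)

    def __iadd__(self, other):
        """Merge another partial result into this one (pairwise update)."""
        if other.count == 0:
            return self
        n = self.count + other.count
        delta = other.mean - self.mean
        self.m2 += other.m2 + delta * delta * self.count * other.count / n
        self.mean += delta * other.count / n
        self.count = n
        return self

    @property
    def variance(self):
        """Population variance of everything merged so far."""
        return self.m2 / self.count if self.count else float("nan")


def blockwise_stats(values, block_size):
    """Split values into blocks, summarize each, then merge with +=."""
    blocks = [values[i:i + block_size]
              for i in range(0, len(values), block_size)]
    # Each partial depends on a single block, so this step could run in parallel.
    partials = [RunningStats.from_block(b) for b in blocks]
    total = RunningStats()
    for partial in partials:
        total += partial
    return total


stats = blockwise_stats([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], block_size=3)
```

Because the merge is associative, the order in which the partial results are combined does not change the final statistics beyond floating-point rounding, which is what makes this pattern a natural fit for chunked arrays.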

Dask is an optional dependency; install it with pip install "dask[array]" (the quotes keep shells such as zsh from interpreting the square brackets).
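Since Dask is optional, calling code may want to check for it before taking the parallel path. The sketch below uses only the standard library; optional_module_available is a hypothetical helper name, not something pyinterp provides.

```python
import importlib.util


def optional_module_available(name):
    """Return True if the module `name` can be found, False otherwise."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names (e.g. "dask.array") when the parent is missing.
        return False


# Decide once whether the Dask-based helpers can be used.
HAS_DASK_ARRAY = optional_module_available("dask.array")
```

A caller could then fall back to the in-memory API when HAS_DASK_ARRAY is False.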

Overview#

The functions in this module are thin wrappers that mirror the eager classes of pyinterp. The listings below group the Dask helpers by the kind of in-memory container they produce:

Descriptive statistics & quantiles#

descriptive_statistics(values[, weights, ...])

Compute descriptive statistics on a dask array.

tdigest(values[, weights, axis, ...])

Compute quantile estimates on a dask array using T-Digest.

Binning containers#

binning1d(x, z, binning[, weights])

Accumulate values into 1D bins from a dask array.

binning2d(x, y, z, binning[, simple])

Accumulate values into 2D bins from dask arrays.
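Bin accumulators merge cleanly across blocks because per-bin counts and sums are additive. The following is a minimal sketch of that idea in one dimension; BinSums and binned_in_blocks are hypothetical names for illustration only and do not reflect the pyinterp binning classes or their signatures.

```python
from bisect import bisect_right


class BinSums:
    """Hypothetical 1D bin accumulator (NOT pyinterp's binning container)."""

    def __init__(self, edges):
        self.edges = list(edges)  # ascending edges, nbins + 1 entries
        nbins = len(self.edges) - 1
        self.count = [0] * nbins
        self.total = [0.0] * nbins

    def push(self, x, z):
        """Accumulate z values into the bin that contains each x."""
        for xi, zi in zip(x, z):
            # Bins are half-open [edge_k, edge_k+1); out-of-range x is dropped.
            if self.edges[0] <= xi < self.edges[-1]:
                k = bisect_right(self.edges, xi) - 1
                self.count[k] += 1
                self.total[k] += zi

    def __iadd__(self, other):
        """Merge another accumulator that uses the same edges."""
        self.count = [a + b for a, b in zip(self.count, other.count)]
        self.total = [a + b for a, b in zip(self.total, other.total)]
        return self

    def mean(self):
        """Per-bin mean, NaN for empty bins."""
        return [t / c if c else float("nan")
                for t, c in zip(self.total, self.count)]


def binned_in_blocks(x, z, edges, block_size):
    """Bin each block separately, then merge the partial accumulators."""
    merged = BinSums(edges)
    for start in range(0, len(x), block_size):
        partial = BinSums(edges)
        partial.push(x[start:start + block_size], z[start:start + block_size])
        merged += partial
    return merged


result = binned_in_blocks(
    x=[0.5, 1.5, 2.5, 0.1, 1.9, 3.5],
    z=[1.0, 2.0, 3.0, 3.0, 4.0, 9.0],
    edges=[0, 1, 2, 3],
    block_size=3,
)
```

Statistics that are not simple sums, such as the per-bin variance, need a merge rule of their own rather than element-wise addition.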

Histograms#

histogram2d(x, y, z, histogram)

Accumulate values into a 2D histogram from dask arrays.

See also#

  • Statistics & Binning: eager (in-memory) versions of the containers returned by the helpers above.