
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/stats/ex_binning.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_stats_ex_binning.py>`
        to download the full example code or to run this example in your
        browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_stats_ex_binning.py:

.. _example_binning:

Binning
=======

Binning groups continuous values into a smaller number of bins. It is
particularly useful when you have irregularly distributed data and want to
analyze it on a regular grid.

In this example, we use pyinterp's 2D binning functionality to compute drifter
velocity statistics in the Black Sea over a 9-year period.

.. GENERATED FROM PYTHON SOURCE LINES 13-21

.. code-block:: Python

    import cartopy.crs
    import matplotlib.pyplot
    import numpy

    import pyinterp
    import pyinterp.backends.xarray
    import pyinterp.tests

.. GENERATED FROM PYTHON SOURCE LINES 22-27

Loading the Data
----------------

First, we load the drifter data, which includes longitude, latitude, and the
velocity components (u and v).

.. GENERATED FROM PYTHON SOURCE LINES 27-29

.. code-block:: Python

    ds = pyinterp.tests.load_aoml()

.. GENERATED FROM PYTHON SOURCE LINES 30-31

We then calculate the velocity magnitude from the u and v components, stored
in the dataset as ``ud`` and ``vd``.

.. GENERATED FROM PYTHON SOURCE LINES 31-33

.. code-block:: Python

    norm = (ds.ud**2 + ds.vd**2)**0.5

.. GENERATED FROM PYTHON SOURCE LINES 34-39

Defining the Grid
-----------------

Next, we define the 2D grid on which we will bin the data. The grid is defined
by two axes: one for longitude and one for latitude.

.. GENERATED FROM PYTHON SOURCE LINES 39-45

.. code-block:: Python

    binning = pyinterp.Binning2D(
        pyinterp.Axis(numpy.arange(27, 42, 0.3, dtype=numpy.float64),
                      is_circle=True),
        pyinterp.Axis(numpy.arange(40, 47, 0.3, dtype=numpy.float64)))
    print(binning)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Axis:
      x:
        min_value: 27
        max_value: 41.7
        step     : 0.3
        is_circle: false
      y:
        min_value: 40
        max_value: 46.9
        step     : 0.3
        is_circle: false

.. GENERATED FROM PYTHON SOURCE LINES 46-52

Simple Binning
--------------

With simple binning, each data point is assigned to the bin that contains its
coordinates. We push the data into the bins and then compute the mean of the
values in each bin.

.. GENERATED FROM PYTHON SOURCE LINES 52-56

.. code-block:: Python

    binning.clear()
    binning.push(ds.lon, ds.lat, norm, True)  # True selects simple binning
    simple_mean = binning.variable('mean')

.. GENERATED FROM PYTHON SOURCE LINES 57-69

.. note::

   For datasets larger than the available RAM, you can use Dask for parallel
   computation. The :py:meth:`push_delayed <pyinterp.Binning2D.push_delayed>`
   method returns a Dask graph, which can be computed to get the result.

   .. code:: python

       binning = binning.push_delayed(lon, lat, data).compute()

You can also compute other statistical variables, such as the variance,
minimum, and maximum, using the
:py:meth:`variable <pyinterp.Binning2D.variable>` method.

.. GENERATED FROM PYTHON SOURCE LINES 71-77

Linear Binning
--------------

Linear binning is a more advanced technique where each data point contributes
to the four nearest bins, weighted by its distance to the center of each bin.
This generally produces a smoother result than simple binning.

.. GENERATED FROM PYTHON SOURCE LINES 77-81

.. code-block:: Python

    binning.clear()
    binning.push(ds.lon, ds.lat, norm, False)  # False selects linear binning
    linear_mean = binning.variable('mean')

.. GENERATED FROM PYTHON SOURCE LINES 82-86

Visualizing the Results
-----------------------

Finally, we visualize the results of both simple and linear binning.

.. GENERATED FROM PYTHON SOURCE LINES 86-116

.. code-block:: Python

    fig = matplotlib.pyplot.figure(figsize=(10, 8))
    fig.subplots_adjust(left=0.05,
                        right=0.95,
                        top=0.95,
                        bottom=0.05,
                        hspace=0.25)
    ax1 = fig.add_subplot(211, projection=cartopy.crs.PlateCarree())
    lon, lat = numpy.meshgrid(binning.x, binning.y, indexing='ij')
    pcm = ax1.pcolormesh(lon,
                         lat,
                         simple_mean,
                         cmap='jet',
                         shading='auto',
                         vmin=0,
                         vmax=1,
                         transform=cartopy.crs.PlateCarree())
    ax1.set_extent([27, 42, 40, 47], crs=cartopy.crs.PlateCarree())
    ax1.coastlines()
    ax1.set_title('Simple Binning')

    ax2 = fig.add_subplot(212, projection=cartopy.crs.PlateCarree())
    pcm = ax2.pcolormesh(lon,
                         lat,
                         linear_mean,
                         cmap='jet',
                         shading='auto',
                         vmin=0,
                         vmax=1,
                         transform=cartopy.crs.PlateCarree())
    ax2.set_extent([27, 42, 40, 47], crs=cartopy.crs.PlateCarree())
    ax2.coastlines()
    ax2.set_title('Linear Binning')
    fig.colorbar(pcm, ax=[ax1, ax2], shrink=0.8)

.. image-sg:: /auto_examples/stats/images/sphx_glr_ex_binning_001.png
   :alt: Simple Binning, Linear Binning
   :srcset: /auto_examples/stats/images/sphx_glr_ex_binning_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 117-125

Histogram2D
-----------

The :py:class:`Histogram2D <pyinterp.Histogram2D>` class is similar to the
:py:class:`Binning2D <pyinterp.Binning2D>` class, but it calculates the
histogram of the data in each bin instead of the statistics.

Let's calculate the 2D histogram of the drifter data.

.. GENERATED FROM PYTHON SOURCE LINES 125-131

.. code-block:: Python

    hist = pyinterp.Histogram2D(
        pyinterp.Axis(numpy.arange(27, 42, 0.3, dtype=numpy.float64),
                      is_circle=True),
        pyinterp.Axis(numpy.arange(40, 47, 0.3, dtype=numpy.float64)))
    hist.push(ds.lon, ds.lat, norm)

.. GENERATED FROM PYTHON SOURCE LINES 132-133

We can then visualize the histogram.

.. GENERATED FROM PYTHON SOURCE LINES 133-146

.. code-block:: Python

    fig = matplotlib.pyplot.figure(figsize=(10, 4))
    fig.subplots_adjust(left=0.05,
                        right=0.95,
                        top=0.95,
                        bottom=0.05,
                        hspace=0.25)
    ax1 = fig.add_subplot(111, projection=cartopy.crs.PlateCarree())
    pcm = ax1.pcolormesh(lon,
                         lat,
                         hist.variable(),
                         cmap='jet',
                         shading='auto',
                         transform=cartopy.crs.PlateCarree())
    ax1.set_extent([27, 42, 40, 47], crs=cartopy.crs.PlateCarree())
    ax1.coastlines()
    ax1.set_title('2D Histogram')
    fig.colorbar(pcm, ax=ax1, shrink=0.8)

.. image-sg:: /auto_examples/stats/images/sphx_glr_ex_binning_002.png
   :alt: 2D Histogram
   :srcset: /auto_examples/stats/images/sphx_glr_ex_binning_002.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.613 seconds)

.. _sphx_glr_download_auto_examples_stats_ex_binning.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/CNES/pangeo-pyinterp/master?urlpath=lab/tree/notebooks/auto_examples/stats/ex_binning.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: ex_binning.ipynb <ex_binning.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: ex_binning.py <ex_binning.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: ex_binning.zip <ex_binning.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
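
The difference between simple and linear binning described in this example can
be illustrated with a minimal one-dimensional NumPy sketch. It is not
pyinterp's implementation: the helper ``bin_mean_1d`` is hypothetical, it
assumes a regular grid of bin centers, and it uses two neighbouring bins where
the 2D case uses four.

.. code-block:: Python

    import numpy as np


    def bin_mean_1d(x, values, centers, simple=True):
        """Mean of ``values`` per bin (hypothetical helper, regular grid)."""
        x = np.asarray(x, dtype=float)
        values = np.asarray(values, dtype=float)
        centers = np.asarray(centers, dtype=float)
        step = centers[1] - centers[0]
        pos = (x - centers[0]) / step  # fractional index of each sample
        total = np.zeros(centers.size)
        weight = np.zeros(centers.size)
        if simple:
            # Simple binning: each sample goes to the nearest bin center
            # with a weight of one.
            idx = np.rint(pos).astype(int)
            ok = (idx >= 0) & (idx < centers.size)
            np.add.at(total, idx[ok], values[ok])
            np.add.at(weight, idx[ok], 1.0)
        else:
            # Linear binning: each sample is split between its two
            # neighbouring centers with weights that sum to one.
            left = np.floor(pos).astype(int)
            frac = pos - left
            for idx, w in ((left, 1.0 - frac), (left + 1, frac)):
                ok = (idx >= 0) & (idx < centers.size)
                np.add.at(total, idx[ok], w[ok] * values[ok])
                np.add.at(weight, idx[ok], w[ok])
        with np.errstate(invalid='ignore', divide='ignore'):
            return total / weight  # NaN where a bin received no weight


    centers = np.arange(4.0)  # bin centers 0, 1, 2, 3
    x = np.array([0.25, 1.75])
    values = np.array([4.0, 8.0])
    print(bin_mean_1d(x, values, centers, simple=True))
    print(bin_mean_1d(x, values, centers, simple=False))

With these two samples, simple binning fills only bins 0 and 2, while linear
binning also gives bin 1 a value (6.0, the weighted mean of both samples). This
is the smoothing effect mentioned in the linear binning section.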