Search and massive download

In this example, we will search for items using pygeodes, filter them using geopandas dataframes, then use a download queue to download the selected items and monitor the progress of the download.

Imports

Let’s start by importing the Geodes and Config classes from pygeodes:

[1]:
from pygeodes import Geodes, Config

Configuration

We load the configuration from a config.json file located in the current working directory:

[2]:
conf = Config.from_file("config.json")
geodes = Geodes(conf=conf)

Searching products

We search for products on the T31TCK tile whose acquisition end date (temporal:endDate) is on or after 2023-01-01:

[3]:
from pygeodes.utils.datetime_utils import complete_datetime_from_str

query = {
    "spaceborne:tile": {"eq": "T31TCK"},
    "temporal:endDate": {"gte": complete_datetime_from_str("2023-01-01")},
}
items, dataframe = geodes.search_items(query=query)
Found 230 items matching your query

230 item(s) found for query : {'spaceborne:tile': {'eq': 'T31TCK'}, 'temporal:endDate': {'gte': '2023-01-01T00:00:00.000000Z'}}
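The logged query shows that the date string "2023-01-01" was expanded to a full UTC timestamp before being sent. As a minimal sketch of that expansion, the helper below reproduces the same string format using only the standard library. The name to_utc_timestamp is hypothetical and is not part of pygeodes; it only illustrates what the expanded value looks like, not how complete_datetime_from_str is implemented.

```python
from datetime import datetime, timezone


def to_utc_timestamp(date_str: str) -> str:
    """Hypothetical stand-in: expand 'YYYY-MM-DD' to a midnight UTC timestamp.

    This is NOT the pygeodes implementation, only an illustration of the
    string format that appears in the logged query.
    """
    parsed = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    # %f always renders six digits of microseconds, e.g. "000000"
    return parsed.strftime("%Y-%m-%dT%H:%M:%S.%fZ")


# Same query structure as in the cell above, built with the stand-in helper
query = {
    "spaceborne:tile": {"eq": "T31TCK"},
    "temporal:endDate": {"gte": to_utc_timestamp("2023-01-01")},
}
print(query)
```

Building the timestamp explicitly like this can help when checking why a date boundary includes or excludes a given product.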


Exploring results

The search returns a list of items and a dataframe. We can work with the dataframe, for instance:

[4]:
dataframe
[4]:
temporal:endDate collection spaceborne:tile id item geometry
0 2024-03-08T10:57:59.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:05a48805-adbc-3990-a096-f... Item (URN:FEATURE:DATA:gdh:05a48805-adbc-3990-... POLYGON ((0.98016 45.13397, 0.45686 45.12551, ...
1 2024-03-05T10:48:19.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:243e16e2-ac59-3b62-adc5-8... Item (URN:FEATURE:DATA:gdh:243e16e2-ac59-3b62-... POLYGON ((0.45686 45.12551, 0.49965 44.13800, ...
2 2024-03-03T10:59:41.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:e6658cda-5f8a-3d19-b9a3-c... Item (URN:FEATURE:DATA:gdh:e6658cda-5f8a-3d19-... POLYGON ((0.94824 45.13346, 0.45686 45.12551, ...
3 2024-03-25T10:46:39.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:7ad8089d-ac8d-33d9-b8c0-e... Item (URN:FEATURE:DATA:gdh:7ad8089d-ac8d-33d9-... POLYGON ((0.45686 45.12551, 0.49965 44.13800, ...
4 2024-03-20T10:47:41.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:4ba686ff-6a4b-3e67-80b6-8... Item (URN:FEATURE:DATA:gdh:4ba686ff-6a4b-3e67-... POLYGON ((0.45686 45.12551, 0.49965 44.13800, ...
... ... ... ... ... ... ...
225 2023-02-24T10:50:31.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:3993b481-0d71-3896-9235-3... Item (URN:FEATURE:DATA:gdh:3993b481-0d71-3896-... POLYGON ((0.49965 44.13800, 1.87192 44.15980, ...
226 2023-03-21T10:46:49.025Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:8378ae65-ea65-331a-b414-c... Item (URN:FEATURE:DATA:gdh:8378ae65-ea65-331a-... POLYGON ((0.49965 44.13800, 1.87192 44.15980, ...
227 2023-03-19T10:57:51.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:1ac38aab-397a-3093-a9ad-b... Item (URN:FEATURE:DATA:gdh:1ac38aab-397a-3093-... POLYGON ((0.49965 44.13800, 0.55302 44.13884, ...
228 2023-02-22T10:59:49.025Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:339b2fdd-a973-3c34-9c6e-b... Item (URN:FEATURE:DATA:gdh:339b2fdd-a973-3c34-... POLYGON ((0.49965 44.13800, 0.55960 44.13895, ...
229 2023-03-06T10:49:21.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:51042660-e4b6-39da-aec7-0... Item (URN:FEATURE:DATA:gdh:51042660-e4b6-39da-... POLYGON ((0.49965 44.13800, 1.87192 44.15980, ...

230 rows × 6 columns

Adding columns

We want to filter on cloud cover, so we first need to add that column to the dataframe. Let’s list the keys available on an item:

[5]:
items[0].list_available_keys()
[5]:
{'accessService:endpointDescription',
 'accessService:endpointURL',
 'dataType',
 'datetime',
 'id',
 'latest',
 'spaceborne:absoluteOrbitID',
 'spaceborne:cloudCover',
 'spaceborne:continentsID',
 'spaceborne:keywords',
 'spaceborne:orbitDirection',
 'spaceborne:orbitID',
 'spaceborne:political.continents',
 'spaceborne:productLevel',
 'spaceborne:productTimeliness',
 'spaceborne:productType',
 'spaceborne:references',
 'spaceborne:satellitePlatform',
 'spaceborne:satelliteSensor',
 'spaceborne:sensorMode',
 'spaceborne:tile',
 'spatial:bbox',
 'temporal:endDate',
 'temporal:startDate',
 'versionInfo'}
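When the set of keys is long, it can be easier to search it than to read it. The sketch below uses plain Python on a set of key names copied from the output above (a subset, for brevity); it does not call pygeodes, and the variable names are only illustrative.

```python
# A subset of the key names returned above, copied by hand for illustration
available_keys = {
    "datetime",
    "id",
    "spaceborne:cloudCover",
    "spaceborne:orbitDirection",
    "spaceborne:productLevel",
    "spaceborne:tile",
    "spatial:bbox",
    "temporal:endDate",
    "temporal:startDate",
}

# Case-insensitive substring search over the key names
cloud_keys = sorted(key for key in available_keys if "cloud" in key.lower())
print(cloud_keys)
```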

The spaceborne:cloudCover key is available, so we add it to the dataframe:

[6]:
from pygeodes.utils.formatting import format_items

dataframe_new = format_items(dataframe, {"spaceborne:cloudCover"})

Filtering our results

Now that the cloud cover is in our dataframe, we can filter on it, keeping only the items whose cloud cover is below 30:

[7]:
dataframe_filtered = dataframe_new[dataframe_new["spaceborne:cloudCover"] < 30]
[8]:
dataframe_filtered
[8]:
temporal:endDate collection spaceborne:tile id item geometry spaceborne:cloudCover
6 2024-05-09T10:50:31.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:bee1e3b7-84b7-332d-a06f-5... Item (URN:FEATURE:DATA:gdh:bee1e3b7-84b7-332d-... POLYGON ((0.45686 45.12551, 0.49965 44.13800, ... 0.000000
7 2024-02-17T11:00:19.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:5c1379a6-702e-3f27-9cc0-2... Item (URN:FEATURE:DATA:gdh:5c1379a6-702e-3f27-... POLYGON ((0.94737 45.13344, 0.45686 45.12551, ... 3.435898
8 2024-01-20T10:53:41.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:084acd27-8d81-3dd1-adac-6... Item (URN:FEATURE:DATA:gdh:084acd27-8d81-3dd1-... POLYGON ((0.45686 45.12551, 0.49965 44.13800, ... 0.553047
9 2024-04-12T10:56:21.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:c3aa6a8a-c476-38d8-98a6-1... Item (URN:FEATURE:DATA:gdh:c3aa6a8a-c476-38d8-... POLYGON ((0.95063 45.13349, 0.45686 45.12551, ... 9.792046
14 2023-12-14T11:04:41.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:feabfb13-c300-3e2b-af2f-1... Item (URN:FEATURE:DATA:gdh:feabfb13-c300-3e2b-... POLYGON ((0.96381 45.13371, 0.45686 45.12551, ... 0.000000
... ... ... ... ... ... ... ...
212 2023-05-18T10:56:21.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:6145129d-6c0d-3281-8888-a... Item (URN:FEATURE:DATA:gdh:6145129d-6c0d-3281-... POLYGON ((0.49965 44.13800, 0.54921 44.13878, ... 25.458779
216 2023-05-28T10:56:21.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:77280154-2e39-387c-998e-9... Item (URN:FEATURE:DATA:gdh:77280154-2e39-387c-... POLYGON ((0.49965 44.13800, 0.54324 44.13869, ... 0.000000
217 2023-05-30T10:46:29.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:3aa7fac0-a6f1-3bb4-b4ec-a... Item (URN:FEATURE:DATA:gdh:3aa7fac0-a6f1-3bb4-... POLYGON ((0.49965 44.13800, 1.87192 44.15980, ... 29.690137
219 2023-02-02T11:01:59.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:a5133774-bbf3-33de-a735-2... Item (URN:FEATURE:DATA:gdh:a5133774-bbf3-33de-... POLYGON ((0.49965 44.13800, 0.56934 44.13910, ... 0.445192
222 2023-02-04T10:52:41.024Z PEPS_S2_L1C T31TCK URN:FEATURE:DATA:gdh:bf33c962-58ea-3907-b3a8-1... Item (URN:FEATURE:DATA:gdh:bf33c962-58ea-3907-... POLYGON ((0.49965 44.13800, 1.87192 44.15980, ... 0.292335

73 rows × 7 columns
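Because the result is an ordinary dataframe, several conditions can be combined in a single boolean mask. The sketch below assumes only pandas and uses a small stand-in dataframe whose index, dates and cloud cover values are copied from six of the rows shown above; the stricter threshold of 10 and the 2024-01-01 cut-off are arbitrary choices for illustration.

```python
import pandas as pd

# Stand-in for dataframe_filtered: values copied from rows displayed above
sample = pd.DataFrame(
    {
        "temporal:endDate": [
            "2024-05-09T10:50:31.024Z",
            "2024-02-17T11:00:19.024Z",
            "2024-04-12T10:56:21.024Z",
            "2023-05-18T10:56:21.024Z",
            "2023-05-28T10:56:21.024Z",
            "2023-05-30T10:46:29.024Z",
        ],
        "spaceborne:cloudCover": [0.0, 3.435898, 9.792046, 25.458779, 0.0, 29.690137],
    },
    index=[6, 7, 9, 212, 216, 217],
)

# The dates are stored as strings, so parse them before comparing
end_dates = pd.to_datetime(sample["temporal:endDate"], utc=True)

# Combine both conditions element-wise with & (each side in parentheses)
mask = (sample["spaceborne:cloudCover"] < 10) & (
    end_dates >= pd.Timestamp("2024-01-01", tz="UTC")
)
subset = sample[mask]
print(subset)
```

Row 216 has zero cloud cover but is dropped by the date condition, which shows that both parts of the mask are applied.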

Plotting

We can plot our results on an interactive map:

[9]:
dataframe_filtered.explore()

Downloading our items

We can download our results using a download queue together with the Profile system:

[10]:
from pygeodes.utils.profile import DownloadQueue, Profile

We reset the Profile so that it tracks only the downloads from this queue, then build the queue from the filtered items:

[11]:
Profile.reset()
items = dataframe_filtered["item"].values
queue = DownloadQueue(items)

In a separate cell, we run the queue:

[ ]:
queue.run()