Search and massive download#
In this example, we search for items using pygeodes, filter them using geopandas dataframes, and use a download queue to download them and monitor the progress of the download.
Imports#
Let’s start by importing Geodes and Config from pygeodes
[1]:
from pygeodes import Geodes, Config
Configuration#
We load the configuration from a config.json file located in the current working directory
[2]:
conf = Config.from_file("config.json")
geodes = Geodes(conf=conf)
Searching products#
We search for products in the T31TCK tile whose acquisition end date (temporal:endDate) is on or after 2023-01-01
[3]:
from pygeodes.utils.datetime_utils import complete_datetime_from_str
query = {
"spaceborne:tile": {"eq": "T31TCK"},
"temporal:endDate": {"gte": complete_datetime_from_str("2023-01-01")},
}
items, dataframe = geodes.search_items(query=query)
Found 230 items matching your query
230 item(s) found for query : {'spaceborne:tile': {'eq': 'T31TCK'}, 'temporal:endDate': {'gte': '2023-01-01T00:00:00.000000Z'}}
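The query echoed in the output above shows the date expanded to the full timestamp 2023-01-01T00:00:00.000000Z. As a rough illustration of that string format only, here is a standard-library sketch that produces the same value. It is not how pygeodes implements complete_datetime_from_str, and the name to_full_utc_timestamp is hypothetical.

```python
from datetime import datetime, timezone


def to_full_utc_timestamp(date_str: str) -> str:
    """Hypothetical helper: expand 'YYYY-MM-DD' to a midnight UTC timestamp string."""
    # Parse the bare date, then mark it explicitly as UTC.
    day = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    # %f renders microseconds on six digits, matching the format seen in the query.
    return day.strftime("%Y-%m-%dT%H:%M:%S.%fZ")


print(to_full_utc_timestamp("2023-01-01"))
```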
Exploring results#
We get a list of items and a dataframe. We can work with the dataframe, for instance:
[4]:
dataframe
[4]:
index | temporal:endDate | collection | spaceborne:tile | id | item | geometry |
---|---|---|---|---|---|---|
0 | 2024-03-08T10:57:59.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:05a48805-adbc-3990-a096-f... | Item (URN:FEATURE:DATA:gdh:05a48805-adbc-3990-... | POLYGON ((0.98016 45.13397, 0.45686 45.12551, ... |
1 | 2024-03-05T10:48:19.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:243e16e2-ac59-3b62-adc5-8... | Item (URN:FEATURE:DATA:gdh:243e16e2-ac59-3b62-... | POLYGON ((0.45686 45.12551, 0.49965 44.13800, ... |
2 | 2024-03-03T10:59:41.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:e6658cda-5f8a-3d19-b9a3-c... | Item (URN:FEATURE:DATA:gdh:e6658cda-5f8a-3d19-... | POLYGON ((0.94824 45.13346, 0.45686 45.12551, ... |
3 | 2024-03-25T10:46:39.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:7ad8089d-ac8d-33d9-b8c0-e... | Item (URN:FEATURE:DATA:gdh:7ad8089d-ac8d-33d9-... | POLYGON ((0.45686 45.12551, 0.49965 44.13800, ... |
4 | 2024-03-20T10:47:41.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:4ba686ff-6a4b-3e67-80b6-8... | Item (URN:FEATURE:DATA:gdh:4ba686ff-6a4b-3e67-... | POLYGON ((0.45686 45.12551, 0.49965 44.13800, ... |
... | ... | ... | ... | ... | ... | ... |
225 | 2023-02-24T10:50:31.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:3993b481-0d71-3896-9235-3... | Item (URN:FEATURE:DATA:gdh:3993b481-0d71-3896-... | POLYGON ((0.49965 44.13800, 1.87192 44.15980, ... |
226 | 2023-03-21T10:46:49.025Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:8378ae65-ea65-331a-b414-c... | Item (URN:FEATURE:DATA:gdh:8378ae65-ea65-331a-... | POLYGON ((0.49965 44.13800, 1.87192 44.15980, ... |
227 | 2023-03-19T10:57:51.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:1ac38aab-397a-3093-a9ad-b... | Item (URN:FEATURE:DATA:gdh:1ac38aab-397a-3093-... | POLYGON ((0.49965 44.13800, 0.55302 44.13884, ... |
228 | 2023-02-22T10:59:49.025Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:339b2fdd-a973-3c34-9c6e-b... | Item (URN:FEATURE:DATA:gdh:339b2fdd-a973-3c34-... | POLYGON ((0.49965 44.13800, 0.55960 44.13895, ... |
229 | 2023-03-06T10:49:21.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:51042660-e4b6-39da-aec7-0... | Item (URN:FEATURE:DATA:gdh:51042660-e4b6-39da-... | POLYGON ((0.49965 44.13800, 1.87192 44.15980, ... |
230 rows × 6 columns
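The rows above are not in chronological order. As one example of working with the dataframe, the sketch below sorts by end date. It uses a small plain pandas dataframe as a stand-in for the search results and assumes the temporal:endDate column holds ISO 8601 strings as displayed; the end column name is made up for the example.

```python
import pandas as pd

# Toy stand-in for the search results; only the date column is reproduced here.
toy = pd.DataFrame(
    {
        "temporal:endDate": [
            "2024-03-08T10:57:59.024Z",
            "2023-02-24T10:50:31.024Z",
            "2024-03-25T10:46:39.024Z",
        ]
    }
)

# Parse the strings into timezone-aware timestamps, then sort on them.
toy["end"] = pd.to_datetime(toy["temporal:endDate"], utc=True)
by_date = toy.sort_values("end")

print(by_date["temporal:endDate"].tolist())
```

Sorting on parsed timestamps rather than on the raw strings keeps the ordering correct even if the string formats were to differ between rows.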
Adding columns#
We want to filter on cloud cover, which is not yet a column of the dataframe, so we first list the keys available on an item.
[5]:
items[0].list_available_keys()
[5]:
{'accessService:endpointDescription',
'accessService:endpointURL',
'dataType',
'datetime',
'id',
'latest',
'spaceborne:absoluteOrbitID',
'spaceborne:cloudCover',
'spaceborne:continentsID',
'spaceborne:keywords',
'spaceborne:orbitDirection',
'spaceborne:orbitID',
'spaceborne:political.continents',
'spaceborne:productLevel',
'spaceborne:productTimeliness',
'spaceborne:productType',
'spaceborne:references',
'spaceborne:satellitePlatform',
'spaceborne:satelliteSensor',
'spaceborne:sensorMode',
'spaceborne:tile',
'spatial:bbox',
'temporal:endDate',
'temporal:startDate',
'versionInfo'}
We find we can use spaceborne:cloudCover, so we add it to the dataframe:
[6]:
from pygeodes.utils.formatting import format_items
dataframe_new = format_items(dataframe, {"spaceborne:cloudCover"})
Filtering our results#
Now that the cloud cover is in our dataframe, we can filter on it.
[7]:
dataframe_filtered = dataframe_new[dataframe_new["spaceborne:cloudCover"] < 30]
[8]:
dataframe_filtered
[8]:
index | temporal:endDate | collection | spaceborne:tile | id | item | geometry | spaceborne:cloudCover |
---|---|---|---|---|---|---|---|
6 | 2024-05-09T10:50:31.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:bee1e3b7-84b7-332d-a06f-5... | Item (URN:FEATURE:DATA:gdh:bee1e3b7-84b7-332d-... | POLYGON ((0.45686 45.12551, 0.49965 44.13800, ... | 0.000000 |
7 | 2024-02-17T11:00:19.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:5c1379a6-702e-3f27-9cc0-2... | Item (URN:FEATURE:DATA:gdh:5c1379a6-702e-3f27-... | POLYGON ((0.94737 45.13344, 0.45686 45.12551, ... | 3.435898 |
8 | 2024-01-20T10:53:41.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:084acd27-8d81-3dd1-adac-6... | Item (URN:FEATURE:DATA:gdh:084acd27-8d81-3dd1-... | POLYGON ((0.45686 45.12551, 0.49965 44.13800, ... | 0.553047 |
9 | 2024-04-12T10:56:21.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:c3aa6a8a-c476-38d8-98a6-1... | Item (URN:FEATURE:DATA:gdh:c3aa6a8a-c476-38d8-... | POLYGON ((0.95063 45.13349, 0.45686 45.12551, ... | 9.792046 |
14 | 2023-12-14T11:04:41.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:feabfb13-c300-3e2b-af2f-1... | Item (URN:FEATURE:DATA:gdh:feabfb13-c300-3e2b-... | POLYGON ((0.96381 45.13371, 0.45686 45.12551, ... | 0.000000 |
... | ... | ... | ... | ... | ... | ... | ... |
212 | 2023-05-18T10:56:21.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:6145129d-6c0d-3281-8888-a... | Item (URN:FEATURE:DATA:gdh:6145129d-6c0d-3281-... | POLYGON ((0.49965 44.13800, 0.54921 44.13878, ... | 25.458779 |
216 | 2023-05-28T10:56:21.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:77280154-2e39-387c-998e-9... | Item (URN:FEATURE:DATA:gdh:77280154-2e39-387c-... | POLYGON ((0.49965 44.13800, 0.54324 44.13869, ... | 0.000000 |
217 | 2023-05-30T10:46:29.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:3aa7fac0-a6f1-3bb4-b4ec-a... | Item (URN:FEATURE:DATA:gdh:3aa7fac0-a6f1-3bb4-... | POLYGON ((0.49965 44.13800, 1.87192 44.15980, ... | 29.690137 |
219 | 2023-02-02T11:01:59.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:a5133774-bbf3-33de-a735-2... | Item (URN:FEATURE:DATA:gdh:a5133774-bbf3-33de-... | POLYGON ((0.49965 44.13800, 0.56934 44.13910, ... | 0.445192 |
222 | 2023-02-04T10:52:41.024Z | PEPS_S2_L1C | T31TCK | URN:FEATURE:DATA:gdh:bf33c962-58ea-3907-b3a8-1... | Item (URN:FEATURE:DATA:gdh:bf33c962-58ea-3907-... | POLYGON ((0.49965 44.13800, 1.87192 44.15980, ... | 0.292335 |
73 rows × 7 columns
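The filter above is a boolean mask applied to the dataframe, which is why the filtered table keeps the original row labels (6, 7, 8, 9, 14, ...) instead of being renumbered. The sketch below shows the same mechanism on a toy plain pandas dataframe, with a second condition combined in. The values are illustrative, not real products.

```python
import pandas as pd

# Toy stand-in for the search results; values are illustrative only.
toy = pd.DataFrame(
    {
        "id": ["a", "b", "c", "d"],
        "temporal:endDate": [
            "2024-03-08T10:57:59.024Z",
            "2023-05-30T10:46:29.024Z",
            "2023-02-02T11:01:59.024Z",
            "2024-01-20T10:53:41.024Z",
        ],
        "spaceborne:cloudCover": [85.0, 29.7, 0.4, 10.0],
    }
)

# Each comparison yields a boolean Series aligned on the dataframe index.
low_cloud = toy["spaceborne:cloudCover"] < 30
in_2023 = toy["temporal:endDate"].str.startswith("2023")

# Combine masks with & (element-wise AND); the original index labels are kept.
selected = toy[low_cloud & in_2023].sort_values("spaceborne:cloudCover")

print(selected["id"].tolist())
print(list(selected.index))
```

If consecutive row numbers are preferred after filtering, pandas' reset_index(drop=True) renumbers the rows.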
Plotting#
We can plot our results on an interactive map:
[9]:
dataframe_filtered.explore()
[9]:
Downloading our items#
We can download our results with a download queue and track the downloads using the Profile system
[10]:
from pygeodes.utils.profile import DownloadQueue, Profile
We reset our Profile to be sure to track only the downloads from the queue, then build the queue from the items of the filtered dataframe
[11]:
Profile.reset()
items = dataframe_filtered["item"].values
queue = DownloadQueue(items)
In a separate cell, we run our queue
[ ]:
queue.run()