Download from S3

In this example, we’ll configure pygeodes to download images via S3 rather than HTTPS. This is useful if you process images on the CNES SI-S, where retrieving products through S3 is more efficient than downloading them via HTTPS.

Search products

Let’s search for products to download, filtering by tile (grid code), cloud cover and date. Note that in this example we use complete_datetime_from_str to convert a plain date string to the datetime format that geodes expects.

[1]:
from pygeodes import Geodes
from pygeodes.utils.datetime_utils import complete_datetime_from_str

geodes = Geodes()

date = complete_datetime_from_str("2025-11-01")

items, dataframe = geodes.search_items(
    collections=["PEPS_S2_L1C"],
    query={
        "grid:code": {"contains": "T52SCE"},
        "eo:cloud_cover": {"lte": 5},
        "end_datetime": {"gte": date},
    },
)
dataframe
/home/qt/robertm/Documents/pygeodes_user/pygeodes_env_user/lib/python3.10/site-packages/urllib3/connectionpool.py:1097: InsecureRequestWarning: Unverified HTTPS request is being made to host 'geodes-portal.cnes.fr'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
Found 7 items matching your query, returning 7 as get_all parameter is set to False
7 item(s) found for query : {'grid:code': {'contains': 'T52SCE'}, 'eo:cloud_cover': {'lte': 5}, 'end_datetime': {'gte': '2025-11-01T00:00:00.000000Z'}}

[1]:
   grid:code  id                                                 end_datetime              eo:cloud_cover  collection   item                                               geometry
0  T52SCE     URN:FEATURE:DATA:gdh:ce50df7f-11c2-328b-8f5f-7...  2025-11-22T02:19:19.024Z  0.045448        PEPS_S2_L1C  Item (URN:FEATURE:DATA:gdh:ce50df7f-11c2-328b-...  POLYGON ((126.7776 36.12428, 126.80483 35.135,...
1  T52SCE     URN:FEATURE:DATA:gdh:c95dc577-f604-3a70-98d4-6...  2025-12-02T02:19:49.024Z  0.788916        PEPS_S2_L1C  Item (URN:FEATURE:DATA:gdh:c95dc577-f604-3a70-...  POLYGON ((126.7776 36.12428, 126.80483 35.135,...
2  T52SCE     URN:FEATURE:DATA:gdh:37fd0133-1fe0-36b9-bc03-d...  2025-12-12T02:20:09.024Z  0.109648        PEPS_S2_L1C  Item (URN:FEATURE:DATA:gdh:37fd0133-1fe0-36b9-...  POLYGON ((126.7776 36.12428, 126.80483 35.135,...
3  T52SCE     URN:FEATURE:DATA:gdh:ddce8bfd-d6ff-3d39-acb0-a...  2025-11-07T02:19:11.025Z  4.056854        PEPS_S2_L1C  Item (URN:FEATURE:DATA:gdh:ddce8bfd-d6ff-3d39-...  POLYGON ((126.7776 36.12428, 126.80483 35.135,...
4  T52SCE     URN:FEATURE:DATA:gdh:d68b66fe-403c-36c6-a575-9...  2025-11-29T02:21:31.024Z  0.000000        PEPS_S2_L1C  Item (URN:FEATURE:DATA:gdh:d68b66fe-403c-36c6-...  POLYGON ((126.7776 36.12428, 126.80483 35.135,...
5  T52SCE     URN:FEATURE:DATA:gdh:40e7c95e-abcc-3883-9833-2...  2025-11-26T02:11:21.024Z  0.119523        PEPS_S2_L1C  Item (URN:FEATURE:DATA:gdh:40e7c95e-abcc-3883-...  POLYGON ((127.46914 35.14374, 128.00972 35.150...
6  T52SCE     URN:FEATURE:DATA:gdh:5daceb82-d26e-3cd7-b517-b...  2025-11-29T02:09:39.024Z  0.000000        PEPS_S2_L1C  Item (URN:FEATURE:DATA:gdh:5daceb82-d26e-3cd7-...  POLYGON ((127.48292 35.14392, 128.00972 35.150...

To learn more about how to search items, see the docs.
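
The search above passes a bare date through complete_datetime_from_str, and the log shows the resulting filter value as '2025-11-01T00:00:00.000000Z'. As a rough illustration of that kind of conversion, here is a minimal sketch using only the standard library. The function name `complete_datetime` is hypothetical, and this is not pygeodes’ actual implementation; it only assumes that the target format is a UTC timestamp with microseconds and a trailing "Z", as seen in the log.

```python
from datetime import datetime, timezone


def complete_datetime(date_str: str) -> str:
    """Hypothetical stand-in: expand an ISO date into a full UTC timestamp.

    This is NOT the pygeodes helper, only an illustration of the idea.
    """
    parsed = datetime.fromisoformat(date_str)
    if parsed.tzinfo is None:
        # Assumption: a date without a timezone is interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    parsed = parsed.astimezone(timezone.utc)
    # Microsecond precision followed by "Z", matching the format in the log.
    return parsed.strftime("%Y-%m-%dT%H:%M:%S.%f") + "Z"


print(complete_datetime("2025-11-01"))
```

In the notebook itself, keep using complete_datetime_from_str, which is the helper the library provides for this purpose.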

Configuration

Next, let’s configure the geodes client. Because we want to download via S3, we need to provide our AWS S3 credentials. To learn more about how to get CNES S3 credentials, see the docs. We also set a “download_dir” to avoid cluttering the current working directory.

[10]:
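from pygeodes import Config

# Placeholder values: replace them with your own CNES S3 credentials.
conf = Config(
    download_dir="./downloads_s3",
    aws_access_key_id="my_access_key_id",
    aws_secret_access_key="my_secret_access_key_id",
    aws_session_token="my_session_token",
)
# Apply the configuration to the existing client.
geodes.set_conf(conf)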

All these parameters can also be set in the JSON configuration file, in the following format:

{
    "aws_access_key_id": "my_access_key_id",
    "aws_secret_access_key": "my_secret_access_key_id",
    "aws_session_token": "my_session_token",
    "download_dir": "/tmp/downloads"
}
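
To avoid typing credentials into a notebook or a file by hand, the JSON file can be generated from environment variables. The sketch below is one possible way to do this with the standard library only. The environment variable names follow the usual AWS convention and the file name `pygeodes-config.json` is hypothetical; neither comes from the pygeodes documentation, so adapt both to your setup.

```python
import json
import tempfile
from pathlib import Path

# Configuration keys (as in the JSON above) mapped to the environment
# variables ASSUMED to hold their values (usual AWS naming convention).
ENV_VARS = {
    "aws_access_key_id": "AWS_ACCESS_KEY_ID",
    "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
    "aws_session_token": "AWS_SESSION_TOKEN",
}


def build_settings(env, download_dir):
    """Collect the S3 settings from a mapping such as os.environ."""
    missing = [name for name in ENV_VARS.values() if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    settings = {key: env[name] for key, name in ENV_VARS.items()}
    settings["download_dir"] = download_dir
    return settings


def write_settings(settings, path):
    """Write the settings as JSON and restrict access to the current user."""
    path = Path(path)
    path.write_text(json.dumps(settings, indent=4), encoding="utf-8")
    path.chmod(0o600)
    return path


# Demonstration with placeholder values instead of real credentials.
fake_env = {
    "AWS_ACCESS_KEY_ID": "my_access_key_id",
    "AWS_SECRET_ACCESS_KEY": "my_secret_access_key_id",
    "AWS_SESSION_TOKEN": "my_session_token",
}
demo_path = write_settings(
    build_settings(fake_env, "/tmp/downloads"),
    Path(tempfile.mkdtemp()) / "pygeodes-config.json",  # hypothetical name
)
print(demo_path.read_text(encoding="utf-8"))
```

With real credentials, you would pass `os.environ` instead of `fake_env` and write the file to wherever your configuration is expected.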

Download from S3

Because we provided S3 credentials in the configuration, pygeodes automatically uses an S3 client instead of the geodes server to download the products (for further details, see the docs):

[11]:
import time

# Record the wall-clock time before and after the download.
start_time = time.time()

geodes.download_item_archives(items)

end_time = time.time()
print(f"Execution time: {end_time - start_time} seconds")
Downloading 7 items from S3: 1it [00:03,  3.19s/it]
Download from s3 completed at /work/scratch/data/robertm/public/tutos/downloads/S2B_MSIL1C_20251122T021919_N0511_R003_T52SCE_20251122T032926.zip
Downloading 7 items from S3: 2it [00:08,  4.15s/it]
Download from s3 completed at /work/scratch/data/robertm/public/tutos/downloads/S2B_MSIL1C_20251202T021949_N0511_R003_T52SCE_20251202T033210.zip
Downloading 7 items from S3: 3it [00:14,  4.99s/it]
Download from s3 completed at /work/scratch/data/robertm/public/tutos/downloads/S2B_MSIL1C_20251212T022009_N0511_R003_T52SCE_20251212T032952.zip
Downloading 7 items from S3: 4it [00:21,  6.13s/it]
Download from s3 completed at /work/scratch/data/robertm/public/tutos/downloads/S2C_MSIL1C_20251107T021911_N0511_R003_T52SCE_20251107T033338.zip
Downloading 7 items from S3: 5it [00:31,  7.33s/it]
Download from s3 completed at /work/scratch/data/robertm/public/tutos/downloads/S2A_MSIL1C_20251129T022131_N0511_R003_T52SCE_20251129T050905.zip
Downloading 7 items from S3: 6it [00:39,  7.75s/it]
Download from s3 completed at /work/scratch/data/robertm/public/tutos/downloads/S2A_MSIL1C_20251126T021121_N0511_R103_T52SCE_20251126T045920.zip
Downloading 7 items from S3: 7it [00:46,  6.66s/it]
Download from s3 completed at /work/scratch/data/robertm/public/tutos/downloads/S2B_MSIL1C_20251129T020939_N0511_R103_T52SCE_20251129T031932.zip
Execution time: 46.622666358947754 seconds

Here, the download of the 7 archives completes in about 47 seconds.
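
The timing pattern used above (record the time, run the download, record it again) can be wrapped in a small reusable helper so that the S3 and HTTPS runs are measured the same way. The following is a minimal sketch, not part of pygeodes: the `timed` context manager is a hypothetical name, and `time.sleep` stands in for the real download call so that the example runs on its own.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label):
    """Measure the wall-clock time of the enclosed block and print it."""
    result = {"label": label, "seconds": None}
    start = time.perf_counter()
    try:
        yield result
    finally:
        # Runs even if the enclosed block raises, so a failed download
        # still reports how long it ran.
        result["seconds"] = time.perf_counter() - start
        print(f"{label}: {result['seconds']:.2f} s")


# Stand-in workload; in the notebook this would be
# geodes.download_item_archives(items).
with timed("stand-in download") as timing:
    time.sleep(0.05)
```

`time.perf_counter` is used here instead of `time.time` because it is a monotonic clock intended for measuring durations; for downloads lasting tens of seconds, both give practically the same result.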

Compare with HTTPS downloading

To download via HTTPS, we remove the S3 credentials from the configuration and authenticate with a GEODES API key instead.

[15]:
conf = Config(
    download_dir="./downloads_https",
    api_key="my_api_key",
)
geodes.set_conf(conf)

start_time = time.time()
geodes.download_item_archives(items)
end_time = time.time()
print(f"Execution time: {end_time - start_time} seconds")
Downloading 7 items from geodes

Download completed at /work/scratch/data/robertm/public/tutos/downloads_https/S2B_MSIL1C_20251122T021919_N0511_R003_T52SCE_20251122T032926.zip

Download completed at /work/scratch/data/robertm/public/tutos/downloads_https/S2B_MSIL1C_20251202T021949_N0511_R003_T52SCE_20251202T033210.zip

Download completed at /work/scratch/data/robertm/public/tutos/downloads_https/S2B_MSIL1C_20251212T022009_N0511_R003_T52SCE_20251212T032952.zip

Download completed at /work/scratch/data/robertm/public/tutos/downloads_https/S2C_MSIL1C_20251107T021911_N0511_R003_T52SCE_20251107T033338.zip

Download completed at /work/scratch/data/robertm/public/tutos/downloads_https/S2A_MSIL1C_20251129T022131_N0511_R003_T52SCE_20251129T050905.zip

Download completed at /work/scratch/data/robertm/public/tutos/downloads_https/S2A_MSIL1C_20251126T021121_N0511_R103_T52SCE_20251126T045920.zip

Download completed at /work/scratch/data/robertm/public/tutos/downloads_https/S2B_MSIL1C_20251129T020939_N0511_R103_T52SCE_20251129T031932.zip
Execution time: 233.44927263259888 seconds

In comparison, the same download over HTTPS takes about 233 seconds (almost 4 minutes), roughly five times longer than via S3.
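
To make the comparison concrete, the two execution times printed above can be turned into a speed-up factor and a per-archive average. This is simple arithmetic on the measured values from this single run; the figures will vary with network load and product size.

```python
# Values copied from the two "execution time" lines printed in this notebook.
N_ITEMS = 7
S3_SECONDS = 46.622666358947754
HTTPS_SECONDS = 233.44927263259888

speedup = HTTPS_SECONDS / S3_SECONDS
s3_per_item = S3_SECONDS / N_ITEMS
https_per_item = HTTPS_SECONDS / N_ITEMS

print(f"S3:    {S3_SECONDS:6.1f} s total, {s3_per_item:5.1f} s per archive")
print(f"HTTPS: {HTTPS_SECONDS:6.1f} s total, {https_per_item:5.1f} s per archive")
print(f"Speed-up of S3 over HTTPS: {speedup:.1f}x")
```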

Please note that this performance gain only holds if the data stays on the CNES information system. Access to the CNES S3 server is limited to the CNES network. Therefore, if you subsequently move the data to another infrastructure, you must also factor in the copy time, which is equivalent to downloading over HTTPS. If the data needs to be processed on an infrastructure other than the CNES information system, we recommend downloading via HTTPS.
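
The recommendation above can be checked with a small back-of-the-envelope model. The sketch below is only an illustration: the function names are hypothetical, and it takes the statement above literally by assuming that copying the data off the CNES network costs as much time as an HTTPS download.

```python
def total_seconds(method, s3_seconds, https_seconds, process_on_cnes):
    """Estimate the time until the data is available where it is processed."""
    if method == "https":
        # HTTPS delivers the data directly to the processing infrastructure.
        return https_seconds
    if method == "s3":
        # Assumption taken from the text: moving data off the CNES network
        # takes about as long as an HTTPS download.
        copy_seconds = 0.0 if process_on_cnes else https_seconds
        return s3_seconds + copy_seconds
    raise ValueError(f"unknown method: {method!r}")


def recommended_method(s3_seconds, https_seconds, process_on_cnes):
    """Return the method with the lowest estimated total time."""
    return min(
        ("s3", "https"),
        key=lambda m: total_seconds(m, s3_seconds, https_seconds, process_on_cnes),
    )


# Timings measured in this notebook (rounded).
S3_SECONDS = 46.6
HTTPS_SECONDS = 233.4

for on_cnes in (True, False):
    choice = recommended_method(S3_SECONDS, HTTPS_SECONDS, on_cnes)
    print(f"processing on CNES infrastructure = {on_cnes}: use {choice}")
```

Under this assumption, S3 followed by a copy always takes longer than a direct HTTPS download, which is why HTTPS is the better choice when processing outside the CNES information system.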