fcollections.implementations

Contents

fcollections.implementations#

Module Attributes

AVISO_L2_LR_SSH_LAYOUT

Layout on Aviso FTP, Aviso TDS for the L2_LR_SSH product

AVISO_L3_LR_SSH_LAYOUT_V3

Layout on Aviso FTP, Aviso TDS for the L3_LR_SSH product

AVISO_L3_LR_SSH_LAYOUT_V2

Layout on Aviso FTP, Aviso TDS for the L3_LR_SSH product

AVISO_L3_LR_WINDWAVE_LAYOUT

Layout on Aviso FTP, Aviso TDS for the L3_LR_WindWave product

AVISO_L4_SWOT_LAYOUT

Layout on Aviso FTP, Aviso TDS for the L4 Sea Level Anomaly experimental product including KaRIn measurements

CMEMS_SSHA_L3_LAYOUT

Layout on CMEMS for the Level 3 SSHA nadir products

CMEMS_L4_SSHA_LAYOUT

Layout on CMEMS for the Level 4 SSHA gridded products

CMEMS_OC_LAYOUT

Layout on CMEMS for the Level 3 and 4 ocean colour products

CMEMS_SWH_LAYOUT

Layout on CMEMS for the WAVE_GLO_PHY_SWH_L3_NRT_014_001 product

CMEMS_SST_LAYOUT

Layout on CMEMS for the SST_GLO_SST_L3S_NRT_OBSERVATIONS_010_010 product

Functions

build_version_parser()

Build file name convention to parse CRID versions.

Classes

BasicNetcdfFilesDatabaseDAC(path[, fs])

Database mapping to select and read Dynamic atmospheric correction Netcdf files in a local file system.

BasicNetcdfFilesDatabaseGriddedSLA(path[, ...])

Database mapping to select and read gridded SLA Netcdf files in a local file system.

BasicNetcdfFilesDatabaseSwotLRL2(path[, fs, ...])

Database mapping to select and read Swot LR L2 Netcdf files in a local file system.

BasicNetcdfFilesDatabaseL2Nadir(path[, fs, ...])

Database mapping to select and read L2 nadir Netcdf files in a local file system.

BasicNetcdfFilesDatabaseSwotLRL3(path[, fs, ...])

Database mapping to select and read Swot LR L3 Netcdf files in a local file system.

BasicNetcdfFilesDatabaseSwotLRWW(path[, fs, ...])

Database mapping to explore and read the L3_LR_WIND_WAVE product.

BasicNetcdfFilesDatabaseL3Nadir(path[, fs, ...])

Database mapping to select and read L3 nadir Netcdf files in a local file system.

BasicNetcdfFilesDatabaseMUR(path[, fs, ...])

Database mapping to select and read GHRSST Level 4 MUR Global Foundation Sea Surface Temperature Analysis product Netcdf files in a local file system.

BasicNetcdfFilesDatabaseOC(path[, fs, ...])

Database mapping to select and read ocean color Netcdf files in a local file system.

BasicNetcdfFilesDatabaseOHC(path[, fs, ...])

Database mapping to select and read ocean heat content Netcdf files in a local file system.

BasicNetcdfFilesDatabaseSST(path[, fs, ...])

Database mapping to select and read sea surface temperature Netcdf files in a local file system.

NetcdfFilesDatabaseSwotLRL2(path[, fs, ...])

NetcdfFilesDatabaseSwotLRL3(path[, fs, ...])

NetcdfFilesDatabaseGriddedSLA(path[, fs, ...])

NetcdfFilesDatabaseSST(path[, fs, ...])

NetcdfFilesDatabaseDAC(path[, fs])

NetcdfFilesDatabaseOC(path[, fs, ...])

NetcdfFilesDatabaseSWH(path[, fs, ...])

BasicNetcdfFilesDatabaseSWH(path[, fs, ...])

Database mapping to select and read significant wave height Netcdf files in a local file system.

NetcdfFilesDatabaseOHC(path[, fs, ...])

NetcdfFilesDatabaseS1AOWI(path[, fs, ...])

Database mapping to select and read S1A Ocean surface wind product Netcdf files in a local file system.

NetcdfFilesDatabaseMUR(path[, fs, ...])

NetcdfFilesDatabaseERA5(path[, fs, ...])

Database mapping to select and read ERA5 reanalysis product Netcdf files in a local file system.

NetcdfFilesDatabaseL2Nadir(path[, fs, ...])

NetcdfFilesDatabaseL3Nadir(path[, fs, ...])

NetcdfFilesDatabaseSwotLRWW(path[, fs, ...])

FileNameConventionERA5()

FileNameConventionOC()

Ocean Color datafiles parser.

FileNameConventionGriddedSLA()

Gridded SLA datafiles parser.

FileNameConventionGriddedSLAInternal()

FileNameConventionSST()

Sea Surface Temperature datafiles parser.

FileNameConventionDAC()

FileNameConventionSwotL2()

Swot LR L2 datafiles parser.

FileNameConventionSwotL3()

Swot LR L3 datafiles parser.

FileNameConventionSwotL3WW()

Swot L3_LR_WIND_WAVE product file names convention.

FileNameConventionOHC()

FileNameConventionS1AOWI()

FileNameConventionMUR()

FileNameConventionSWH()

FileNameConventionL2Nadir()

L2 Nadir datafiles parser.

FileNameConventionL3Nadir()

L3 Nadir datafiles parser.

SwotReaderL2LRSSH([xarray_options])

Reader for SWOT KaRIn L2_LR_SSH products.

SwotReaderL3LRSSH([xarray_options])

Reader for SWOT KaRIn L3_LR_SSH products.

SwotReaderL3WW([xarray_options])

Reader for the SWOT L3_LR_WIND_WAVE product.

Delay(*values)

Delay definition for L3 and L4 sea level products.

ProductLevel(*values)

Product level.

Origin(*values)

Dataset origin.

Group(*values)

Dataset group.

ProductClass(*values)

Dataset product class.

DataType(*values)

Dataset type.

Thematic(*values)

Dataset thematic.

Area(*values)

Dataset area of interest.

Variable(*values)

Dataset variable group.

Typology(*values)

Dataset typology.

Sensors(*values)

Aggregation of sensors for multiple CMEMS products.

Temporality(*values)

Temporality of the L3_LR_SSH product.

ProductSubset(*values)

Swot product subset enum.

SwotPhases(*values)

Swot mission phases definitions.

StackLevel(*values)

Stack level for swath half orbits on reference grid.

Timeliness(*values)

Timeliness of the SWOT L2_LR_SSH products.

L2Version([temporality, baseline, ...])

Represents a L2 Version of half orbits and enables version comparison.

L2VersionField(name[, ignore_product_counter])

AcquisitionMode(*values)

S1AOWIProductType(*values)

S1AOWISlicePostProcessing(*values)

fcollections.implementations.AVISO_L2_LR_SSH_LAYOUT: Layout = <fcollections.core._listing.Layout object>#

Layout on Aviso FTP, Aviso TDS for the L2_LR_SSH product

fcollections.implementations.AVISO_L3_LR_SSH_LAYOUT_V2: Layout = <fcollections.core._listing.Layout object>#

Layout on Aviso FTP, Aviso TDS for the L3_LR_SSH product

fcollections.implementations.AVISO_L3_LR_SSH_LAYOUT_V3: Layout = <fcollections.core._listing.Layout object>#

Layout on Aviso FTP, Aviso TDS for the L3_LR_SSH product

fcollections.implementations.AVISO_L3_LR_WINDWAVE_LAYOUT: Layout = <fcollections.core._listing.Layout object>#

Layout on Aviso FTP, Aviso TDS for the L3_LR_WindWave product

fcollections.implementations.AVISO_L4_SWOT_LAYOUT: Layout = <fcollections.core._listing.Layout object>#

Layout on Aviso FTP, Aviso TDS for the L4 Sea Level Anomaly experimental product including KaRIn measurements

class fcollections.implementations.AcquisitionMode(*values)[source]#

Bases: Enum

EW = 2#
IW = 1#
SM = 4#
WV = 3#
class fcollections.implementations.Area(*values)[source]#

Bases: Enum

Dataset area of interest.

ANT = 3#

Antarctic.

ARC = 2#

Arctic.

ATL = 1#

Atlantic.

BAL = 4#

Baltic.

BLK = 5#

Black sea.

EUR = 6#

Europe.

GLO = 7#

Global.

IBI = 8#

Iberian sea.

MED = 9#

Mediterranean.

NWS = 10#

North west shelf.

class fcollections.implementations.BasicNetcdfFilesDatabaseDAC(path: Path, fs: fsspec.AbstractFileSystem = fs_loc.LocalFileSystem())[source]#

Bases: FilesDatabase, DiscreteTimesMixin

Database mapping to select and read Dynamic atmospheric correction Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and enable efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised.

  • predicates – Additional complex filters to run on the record parsed by the filename. ex. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
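The DateTime filtering rule described for time can be sketched in plain Python. This is a hypothetical helper illustrating the documented behavior, not the library's implementation: a parsed value is kept when it equals a reference datetime, or is included in a reference period.

```python
from datetime import datetime

def matches_time(file_time, reference):
    """Keep file_time if it equals a reference datetime, or is included
    in a reference (start, end) period -- illustrative stand-in for the
    DateTime-field rule described above."""
    if isinstance(reference, tuple):  # period: inclusion test
        start, end = reference
        return start <= file_time <= end
    return file_time == reference     # single datetime: equality test

times = [datetime(2023, 1, d) for d in (1, 2, 3)]
kept = [t for t in times
        if matches_time(t, (datetime(2023, 1, 1), datetime(2023, 1, 2)))]
# keeps the first two datetimes; January 3rd falls outside the period
```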

listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
metadata_injection: dict[str, tuple[str, ...]] | None = {'time': ('time',)}#

Configures how metadata from the files listing can be injected in a dataset returned from the read.

The keys are the columns of the file metadata table; the values are tuples of dimensions for insertion.

query(*, selected_variables: list[str] | None = None, time: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

  • A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
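Because query returns None when nothing matches, callers should guard before using the result. A minimal usage pattern with a stand-in database object (the stub class and names here are hypothetical; only the query signature mirrors the documentation):

```python
def run_query(db, **filters):
    """Query and guard against an empty result: query returns None
    when nothing matches the given filters."""
    ds = db.query(selected_variables=None, **filters)
    if ds is None:
        return "no files matched"
    return ds

class StubDatabase:
    """Stand-in mimicking the documented query signature."""
    def query(self, *, selected_variables=None, **filters):
        return None  # pretend nothing matched the filters

result = run_query(StubDatabase(), time="2023-01-01")
```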

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = ['time']#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.BasicNetcdfFilesDatabaseGriddedSLA(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read gridded SLA Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and enable efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, delay: Delay, time: Period, production_date: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised.

  • predicates – Additional complex filters to run on the record parsed by the filename. ex. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
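The predicates argument above receives callables that see the record parsed from each file name; the example predicate in the documentation composes with others as a simple conjunction. A pure-Python sketch with a made-up record layout:

```python
# Each record is a tuple parsed from a file name; in this hypothetical
# layout, index 1 holds some integer field of the name convention.
records = [("file_a.nc", 1), ("file_b.nc", 2), ("file_c.nc", 4)]

# The documented example predicate: keep records whose field is in a set.
predicates = [lambda record: record[1] in [1, 4, 5]]

# A record is listed only if every predicate accepts it.
kept = [r for r in records if all(p(r) for p in predicates)]
```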

listing_parameters = {'delay': <Parameter "delay: fcollections.implementations._definitions._constants.Delay">, 'production_date': <Parameter "production_date: numpy.datetime64">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, delay: Delay, time: Period, production_date: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
query(*, selected_variables: list[str] | None = None, delay: Delay, time: Period, production_date: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

  • A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
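The Enum-field rule used by delay accepts either the enum member itself or its string name. A sketch of that matching logic with a hypothetical Delay enum mirroring the documented values ['NRT', 'DT'] (illustrative only, not the library's class):

```python
from enum import Enum

class Delay(Enum):
    NRT = "NRT"  # near-real-time
    DT = "DT"    # delayed-time

def matches_enum(value, reference):
    """Keep value if it equals the reference enum member, which may be
    given as the member itself or as its string name."""
    if isinstance(reference, str):
        reference = Delay[reference]  # resolve the name to a member
    return value is reference

kept = [d for d in Delay if matches_enum(d, "NRT")]
```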

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.BasicNetcdfFilesDatabaseL2Nadir(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read L2 nadir Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and enable efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised.

  • predicates – Additional complex filters to run on the record parsed by the filename. ex. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
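The Integer-field rule used by cycle_number and pass_number accepts a list, a slice, or a single int as reference. A sketch of the matching logic, assuming Python's usual half-open slice semantics (a hypothetical helper, not the library's implementation):

```python
def matches_int(value, reference):
    """Keep value if it is in a reference list, inside a reference
    slice, or equal to a reference int."""
    if isinstance(reference, list):
        return value in reference
    if isinstance(reference, slice):
        lo_ok = reference.start is None or value >= reference.start
        hi_ok = reference.stop is None or value < reference.stop
        return lo_ok and hi_ok
    return value == reference

pass_numbers = [1, 2, 3, 10]
kept = [p for p in pass_numbers if matches_int(p, slice(1, 4))]
```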

listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

Raises:
query(*, selected_variables: list[str] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

Returns:

  • A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.BasicNetcdfFilesDatabaseL3Nadir(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read L3 nadir Netcdf files in a local file system.

deduplicator: Deduplicator | None = Deduplicator(unique=('time',), auto_pick_last=('production_date',))#

Deduplicate the file metadata table of a unique subset (after unmixing).

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and enable efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised.

  • predicates – Additional complex filters to run on the record parsed by the filename. ex. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'delay': <Parameter "delay: fcollections.implementations._definitions._constants.Delay">, 'product_level': <Parameter "product_level: fcollections.implementations._definitions._constants.ProductLevel">, 'production_date': <Parameter "production_date: numpy.datetime64">, 'resolution': <Parameter "resolution: list[int] | slice | int">, 'sensor': <Parameter "sensor: fcollections.implementations._definitions._cmems.Sensors">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#

Map a function over the datasets extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with different production dates. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
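
The map contract above can be sketched with plain-Python stand-ins (dicts instead of xarray datasets; all names here are hypothetical, not the library's implementation):

```python
# Stand-ins for the (dataset, metadata) pairs read from selected files;
# real code would receive xarray.Dataset objects.
datasets = [
    ({"ssha": [0.1, 0.2]}, {"sensor": "S3A"}),
    ({"ssha": [0.3]}, {"sensor": "J3"}),
]

def map_over(func, pairs):
    """Apply ``func(dataset, metadata)`` to each pair, collecting results."""
    return [func(ds, meta) for ds, meta in pairs]

# Example callable: count samples per file.
counts = map_over(lambda ds, meta: (meta["sensor"], len(ds["ssha"])), datasets)
print(counts)  # [('S3A', 2), ('J3', 1)]
```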

query(*, selected_variables: list[str] | None = None, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with different production dates. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

Returns:

  A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table
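
The return contract above (a combined dataset, or None when nothing matches) can be sketched in plain Python; this is an illustrative stand-in, not the library's reader:

```python
def query(records, sensor):
    """Select matching records and concatenate their data.

    Returns the combined data, or None when nothing matches,
    mirroring the documented return contract.
    """
    matching = [r for r in records if r["sensor"] == sensor]
    if not matching:
        return None
    combined = []
    for r in matching:
        combined.extend(r["data"])
    return combined

records = [{"sensor": "S3A", "data": [1, 2]}, {"sensor": "S3A", "data": [3]}]
print(query(records, "S3A"))  # [1, 2, 3]
print(query(records, "J3"))   # None
```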

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

unmixer: SubsetsUnmixer | None = SubsetsUnmixer(partition_keys=['sensor', 'resolution'], auto_pick_last=())#

Specify how to interpret the file metadata table to unmix subsets.

variables_info(*, sensor: Sensors, resolution: list[int] | slice | int)#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, a ValueError is raised.

Parameters:
  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table
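
The unmixing behaviour described above (partition the records by the subset keys and raise ValueError unless exactly one subset remains) can be sketched as follows. This is a plain-Python stand-in for the SubsetsUnmixer logic, with hypothetical record fields:

```python
def unique_subset(records, partition_keys=("sensor", "resolution")):
    """Group records by the partition keys and require a single subset,
    raising ValueError otherwise (mirroring the documented behaviour)."""
    subsets = {tuple(r[k] for k in partition_keys) for r in records}
    if len(subsets) != 1:
        raise ValueError(
            f"cannot extract one unique subset, found {len(subsets)}: "
            f"{sorted(subsets)}"
        )
    return subsets.pop()

records = [
    {"sensor": "S3A", "resolution": 1, "path": "a.nc"},
    {"sensor": "S3A", "resolution": 1, "path": "b.nc"},
]
print(unique_subset(records))  # ('S3A', 1)

# A mixed table (two sensors) cannot be reduced to one subset.
mixed = records + [{"sensor": "J3", "resolution": 20, "path": "c.nc"}]
try:
    unique_subset(mixed)
except ValueError as err:
    print(err)
```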

class fcollections.implementations.BasicNetcdfFilesDatabaseMUR(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read GHRSST Level 4 MUR Global Foundation Sea Surface Temperature Analysis product Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and have an efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – If the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. When duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick is also performed according to the SubsetsUnmixer instance of this class. If the auto pick cannot isolate a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#

Map a function over the datasets extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]

query(*, selected_variables: list[str] | None = None, time: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]

Returns:

  A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, a ValueError is raised.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

class fcollections.implementations.BasicNetcdfFilesDatabaseOC(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read ocean color Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and have an efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – If the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. When duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick is also performed according to the SubsetsUnmixer instance of this class. If the auto pick cannot isolate a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format

  • origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]

  • group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]

  • pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]

  • area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]

  • thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]

  • variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]

  • type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]

  • level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • temporal_resolution – As an ISO8601 duration field, it can be tested against an ISODuration object or its string representation (PT1S, …)

  • typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]

  • version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
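The temporal_resolution filter above compares an ISO8601 duration token from the file name with the reference value. A minimal plain-Python sketch of that string-level matching follows; it assumes a small subset of ISO8601 and is not the library's ISODuration class:

```python
import re

# Minimal ISO8601 duration check (subset: days, hours, minutes, seconds),
# enough to validate tokens such as 'P1D' or 'PT1S' before comparing them.
ISO_DURATION = re.compile(r"^P(?:\d+D)?(?:T(?:\d+H)?(?:\d+M)?(?:\d+S)?)?$")

def matches_duration(token, reference):
    """True when a file-name duration token equals the reference string."""
    if not (ISO_DURATION.match(token) and ISO_DURATION.match(reference)):
        raise ValueError("not an ISO8601 duration")
    return token == reference

print(matches_duration("P1D", "P1D"))   # True
print(matches_duration("PT1S", "P1D"))  # False
```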

listing_parameters = {'area': <Parameter "area: fcollections.implementations._definitions._cmems.Area">, 'group': <Parameter "group: fcollections.implementations._definitions._cmems.Group">, 'level': <Parameter "level: str">, 'origin': <Parameter "origin: fcollections.implementations._definitions._cmems.Origin">, 'pc': <Parameter "pc: fcollections.implementations._definitions._cmems.ProductClass">, 'sensor': <Parameter "sensor: fcollections.implementations._definitions._cmems.Sensors">, 'spatial_resolution': <Parameter "spatial_resolution: str">, 'temporal_resolution': <Parameter "temporal_resolution: fcollections.time.ISODuration">, 'thematic': <Parameter "thematic: fcollections.implementations._definitions._cmems.Thematic">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'type': <Parameter "type: fcollections.implementations._definitions._cmems.DataType">, 'typology': <Parameter "typology: fcollections.implementations._definitions._cmems.Typology">, 'variable': <Parameter "variable: fcollections.implementations._definitions._cmems.Variable">, 'version': <Parameter "version: str">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#

Map a function over the datasets extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format

  • origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]

  • group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]

  • pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]

  • area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]

  • thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]

  • variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]

  • type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]

  • level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • temporal_resolution – As an ISO8601 duration field, it can be tested against an ISODuration object or its string representation (PT1S, …)

  • typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]

  • version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

query(*, selected_variables: list[str] | None = None, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format

  • origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]

  • group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]

  • pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]

  • area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]

  • thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]

  • variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]

  • type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]

  • level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)

  • typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]

  • version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table
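The enum/string filtering rule described for these parameters can be sketched in plain Python. The `Area` stand-in and record layout below are illustrative assumptions, not the library's actual definitions:

```python
from enum import Enum

# Illustrative stand-in for one of the library's enums (e.g. Area);
# not the actual fcollections definition.
class Area(Enum):
    GLO = "GLO"
    MED = "MED"

def keep(parsed_value: str, reference) -> bool:
    # A value parsed from the file name is kept only when it equals the
    # reference, given either as an enum member or its equivalent string.
    ref = reference.value if isinstance(reference, Enum) else reference
    return parsed_value == ref

records = [{"area": "GLO"}, {"area": "MED"}, {"area": "GLO"}]
kept = [r for r in records if keep(r["area"], Area.GLO)]
# The string form selects the same records as the enum member.
assert kept == [r for r in records if keep(r["area"], "GLO")]
```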

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

class fcollections.implementations.BasicNetcdfFilesDatabaseOHC(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read ocean heat content Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and have an efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot isolate a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
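The predicate and time filters can be sketched in plain Python. The record tuples below are hypothetical, but the predicate follows the lambda example from the parameter description:

```python
from datetime import datetime

# Hypothetical records parsed from file names: (name, field_1, time).
records = [
    ("file_a.nc", 1, datetime(2023, 1, 1)),
    ("file_b.nc", 2, datetime(2023, 1, 2)),
    ("file_c.nc", 4, datetime(2023, 1, 2)),
]

# A predicate receives the whole parsed record, as in the documented
# example: lambda record: record[1] in [1, 4, 5].
predicate = lambda record: record[1] in [1, 4, 5]

# A reference datetime keeps only records whose time equals it.
reference = datetime(2023, 1, 2)
kept = [r for r in records if predicate(r) and r[2] == reference]
```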

listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

query(*, selected_variables: list[str] | None = None, time: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table
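The homogeneity check behind variables_info() can be sketched as follows; the key name and records are illustrative, not the library's internals:

```python
# Group parsed records by a partitioning key and require exactly one
# homogeneous subset, raising ValueError otherwise.
def unique_subset(records: list, key: str) -> str:
    groups = {r[key] for r in records}
    if len(groups) != 1:
        raise ValueError(f"{len(groups)} subsets found for key {key!r}")
    return groups.pop()

pure = [{"subset": "Basic"}, {"subset": "Basic"}]
mixed = [{"subset": "Basic"}, {"subset": "Expert"}]

assert unique_subset(pure, "subset") == "Basic"
try:
    unique_subset(mixed, "subset")
    raised = False
except ValueError:
    # Mixed Basic/Expert records cannot form one homogeneous subset.
    raised = True
```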

class fcollections.implementations.BasicNetcdfFilesDatabaseSST(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read sea surface temperature Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and have an efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot isolate a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

query(*, selected_variables: list[str] | None = None, time: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

class fcollections.implementations.BasicNetcdfFilesDatabaseSWH(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read significant wave height Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and have an efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, sensorf: Sensors, time: Period, production_date: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot isolate a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of string following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'production_date': <Parameter "production_date: numpy.datetime64">, 'sensorf': <Parameter "sensorf: fcollections.implementations._definitions._cmems.Sensors">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, sensorf: Sensors, time: Period, production_date: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of string following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

query(*, selected_variables: list[str] | None = None, sensorf: Sensors, time: Period, production_date: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of string following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table
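The Period filtering rule used by time above (keep a file when its period intersects the reference Period, or contains a reference datetime) can be sketched with plain datetimes. The tuple representation is an assumption, not the library's Period class:

```python
from datetime import datetime

def intersects(a: tuple, b: tuple) -> bool:
    # Two closed periods intersect when each starts before the other ends.
    return a[0] <= b[1] and b[0] <= a[1]

file_period = (datetime(2023, 6, 1), datetime(2023, 6, 30))
reference = (datetime(2023, 6, 15), datetime(2023, 7, 15))

assert intersects(file_period, reference)  # overlapping periods: kept
assert not intersects(file_period, (datetime(2023, 8, 1), datetime(2023, 8, 31)))
# A single reference datetime keeps files whose period contains it.
assert file_period[0] <= datetime(2023, 6, 20) <= file_period[1]
```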

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

class fcollections.implementations.BasicNetcdfFilesDatabaseSwotLRL2(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read Swot LR L2 Netcdf files in a local file system.

deduplicator: Deduplicator | None = Deduplicator(unique=('cycle_number', 'pass_number'), auto_pick_last=('version',))#

Deduplicate the file metadata table of a unique subset (after unmixing).
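A minimal sketch of this deduplication rule, with unique keys (cycle_number, pass_number) and auto-picking the last version. The dictionaries and the lexicographic version ordering are illustrative assumptions:

```python
# Among files sharing the unique keys, keep only the last version.
files = [
    {"cycle_number": 1, "pass_number": 10, "version": "PIB0_01"},
    {"cycle_number": 1, "pass_number": 10, "version": "PIC0_01"},
    {"cycle_number": 1, "pass_number": 11, "version": "PIB0_01"},
]

latest = {}
for f in files:
    key = (f["cycle_number"], f["pass_number"])
    # Lexicographic comparison stands in for the real version ordering.
    if key not in latest or f["version"] > latest[key]["version"]:
        latest[key] = f
```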

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and have an efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, subset: ProductSubset, version: L2Version)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot isolate a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of string following the [%Y-%m-%dT%H:%M:%S] formatting

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed into the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (ex. PIC0). The product counter is a number that is incremented when a half orbit is regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some missing attributes set to None. In this case, the check will be performed on these attributes only.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
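The CRID decomposition described for version can be illustrated with a small regular expression. The pattern is an assumption inferred from the ex. PIC0 shape, not the output of build_version_parser():

```python
import re

# Hypothetical CRID pattern: a leading P, the timeliness letter (I/G),
# a baseline letter and the minor version digits (e.g. PIC0).
CRID = re.compile(r"P(?P<timeliness>[IG])(?P<baseline>[A-Z])(?P<minor>\d+)$")

match = CRID.match("PIC0")
assert match is not None
assert match.group("timeliness") == "I"
assert match.group("baseline") == "C"
assert match.group("minor") == "0"
```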

listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'level': <Parameter "level: fcollections.implementations._definitions._constants.ProductLevel">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: fcollections.implementations._l2_lr_ssh.L2Version">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: L2Version)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – List of variables to select in the dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting this to False in conjunction with right_swath disables swath reading for Expert and Basic datasets

  • right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting this to False in conjunction with left_swath disables swath reading for Expert and Basic datasets

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of string following the [%Y-%m-%dT%H:%M:%S] formatting

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed into the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (e.g. PIC0). The product counter is a number that is incremented when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As an L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some missing attributes set to None. In this case, the check will be performed on these attributes only.
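
The CRID decomposition described above can be sketched with a tiny parser. This is a hypothetical helper, not part of fcollections: it assumes the CRID always starts with a fixed "P" prefix followed by the timeliness, baseline and minor version, as in "PIC0".

```python
import re

# Hypothetical sketch of a CRID such as "PIC0": a fixed "P" prefix (assumption),
# the timeliness (I/G), the baseline (A/B/C...) and the minor version (a number).
CRID_RE = re.compile(r"^P(?P<timeliness>[IG])(?P<baseline>[A-Z])(?P<minor>\d+)$")

def parse_crid(crid: str) -> dict[str, str]:
    """Decompose a CRID into its timeliness, baseline and minor version."""
    match = CRID_RE.match(crid)
    if match is None:
        raise ValueError(f"not a recognized CRID: {crid!r}")
    return match.groupdict()

print(parse_crid("PIC0"))  # {'timeliness': 'I', 'baseline': 'C', 'minor': '0'}
```

A partially set L2Version (e.g. only the baseline) would then match any CRID sharing that attribute, per the partial-matching rule above.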

Raises:
query(*, subset: ProductSubset, selected_variables: list[str] | None = None, stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: L2Version)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval wraps around the discontinuity, it will be split into two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting both this and right_swath to False disables swath reading for Expert and Basic datasets

  • right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting both this and left_swath to False disables swath reading for Expert and Basic datasets

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or a tuple of strings following the %Y-%m-%dT%H:%M:%S format

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed into the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (e.g. PIC0). The product counter is a number that is incremented when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As an L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some missing attributes set to None. In this case, the check will be performed on these attributes only.

Returns:

  A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table
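
The time filtering semantics above (a file is kept if its period intersects the reference Period, or contains the reference datetime) can be illustrated with plain (start, end) tuples. This is a sketch, not the fcollections Period class:

```python
from datetime import datetime

def intersects(period, other):
    """True if two (start, end) periods overlap."""
    return period[0] <= other[1] and other[0] <= period[1]

def contains(period, instant):
    """True if a datetime falls within a (start, end) period."""
    return period[0] <= instant <= period[1]

# A file covering April 1st intersects a query period straddling midnight...
file_period = (datetime(2023, 4, 1), datetime(2023, 4, 2))
query_period = (datetime(2023, 3, 31, 12), datetime(2023, 4, 1, 12))
print(intersects(file_period, query_period))  # True

# ...but does not contain a datetime outside its bounds.
print(contains(file_period, datetime(2023, 4, 5)))  # False
```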

reader: IFilesReader | None = <fcollections.implementations._readers.SwotReaderL2LRSSH object>#

Files reader.

reading_parameters = {'left_swath': <Parameter "left_swath: 'bool' = True">, 'preprocessor': <Parameter "preprocessor: 'tp.Callable[[xr.Dataset], xr.Dataset] | None' = None">, 'right_swath': <Parameter "right_swath: 'bool' = False">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'stack': <Parameter "stack: 'StackLevel | str' = <StackLevel.NOSTACK: 1>">, 'subset': <Parameter "subset: 'ProductSubset'">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

unmixer: SubsetsUnmixer | None = SubsetsUnmixer(partition_keys=['level', 'subset'], auto_pick_last=())#

Specify how to interpret the file metadata table to unmix subsets.

variables_info(*, level: ProductLevel, subset: ProductSubset)#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Parameters:
  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table

class fcollections.implementations.BasicNetcdfFilesDatabaseSwotLRL3(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read Swot LR L3 Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and have an efficient file system scanning. The pre-configured layouts may not match the actual file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, subset: ProductSubset, version: str)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot yield a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or a tuple of strings following the %Y-%m-%dT%H:%M:%S format

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
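
The Integer-field rule used by cycle_number and pass_number (a reference value given as a list, a slice or an integer) can be sketched as follows. This is a stand-in assuming Python's usual exclusive slice stop, not the library's actual matcher:

```python
def keep_integer(value: int, reference) -> bool:
    """Mimic the Integer-field rule: keep `value` if it matches the
    reference list, slice, or integer (sketch; assumes slices have
    explicit start and stop, with an exclusive stop)."""
    if isinstance(reference, slice):
        step = reference.step or 1
        return value in range(reference.start, reference.stop, step)
    if isinstance(reference, list):
        return value in reference
    return value == reference

print(keep_integer(5, [1, 4, 5]))    # True: in the list
print(keep_integer(7, slice(1, 7)))  # False: stop is exclusive in this sketch
print(keep_integer(3, 3))            # True: equal to the integer
```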

listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'level': <Parameter "level: fcollections.implementations._definitions._constants.ProductLevel">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: str">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: str)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval wraps around the discontinuity, it will be split into two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

  • nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to False

  • swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to True

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or a tuple of strings following the %Y-%m-%dT%H:%M:%S format

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
query(*, subset: ProductSubset, selected_variables: list[str] | None = None, stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: str)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval wraps around the discontinuity, it will be split into two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

  • nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to False

  • swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to True

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or a tuple of strings following the %Y-%m-%dT%H:%M:%S format

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Returns:

  A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table
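
The bbox splitting behaviour documented above (an interval such as [170, -170] is split into two sub-boxes at +/-180) can be sketched as follows; this is a minimal illustration, not the library code:

```python
def split_bbox(lon_min, lat_min, lon_max, lat_max):
    """Split a bounding box whose longitude interval wraps around the
    discontinuity into two sub-boxes (sketch of the documented behaviour)."""
    if lon_min <= lon_max:
        # No wrap: the box is usable as-is.
        return [(lon_min, lat_min, lon_max, lat_max)]
    # Wrapping interval, e.g. [170, -170]: split at +/-180.
    return [
        (lon_min, lat_min, 180.0, lat_max),
        (-180.0, lat_min, lon_max, lat_max),
    ]

print(split_bbox(170, -10, -170, 10))
# [(170, -10, 180.0, 10), (-180.0, -10, -170, 10)]
```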

reader: IFilesReader | None = <fcollections.implementations._readers.SwotReaderL3LRSSH object>#

Files reader.

reading_parameters = {'nadir': <Parameter "nadir: 'bool' = False">, 'preprocessor': <Parameter "preprocessor: 'tp.Callable[[xr.Dataset], xr.Dataset] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'stack': <Parameter "stack: 'str | StackLevel' = <StackLevel.NOSTACK: 1>">, 'subset': <Parameter "subset: 'ProductSubset'">, 'swath': <Parameter "swath: 'bool' = True">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

unmixer: SubsetsUnmixer | None = SubsetsUnmixer(partition_keys=['version', 'subset'], auto_pick_last=('version',))#

Specify how to interpret the file metadata table to unmix subsets.

variables_info(*, subset: ProductSubset, version: str)#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Parameters:
  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset cannot be extracted from the files metadata table
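
The unmixing described here (partition the metadata by the partition keys, auto-pick the last value of some keys, and fail if more than one subset remains) can be sketched over plain dictionaries. This is a naive stand-in: it picks the "last" version lexicographically, which the real SubsetsUnmixer may not do.

```python
def unmix(records, partition_keys, auto_pick_last=()):
    """Partition records by keys; for keys in auto_pick_last, keep only the
    greatest value (naive lexicographic pick). Fail unless one subset remains."""
    for key in auto_pick_last:
        last = max(record[key] for record in records)
        records = [record for record in records if record[key] == last]
    subsets = {tuple(record[key] for key in partition_keys) for record in records}
    if len(subsets) != 1:
        raise ValueError(f"could not unmix subsets: {sorted(subsets)}")
    return records

records = [
    {"version": "1.0.2", "subset": "Basic"},
    {"version": "2.0.1", "subset": "Basic"},
]
# auto_pick_last=('version',) keeps only the latest version, as for SwotLRL3.
print(unmix(records, ["version", "subset"], auto_pick_last=("version",)))
# [{'version': '2.0.1', 'subset': 'Basic'}]
```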

class fcollections.implementations.BasicNetcdfFilesDatabaseSwotLRWW(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to explore and read the L3_LR_WIND_WAVE product.

See also

fcollections.implementations.AVISO_L3_LR_WINDWAVE_LAYOUT

Recommended layout for the database

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful to extract information and have an efficient file system scanning. The pre-configured layouts may not match the actual file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, subset: ProductSubset, version: str)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot yield a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or a tuple of strings following the %Y-%m-%dT%H:%M:%S format

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
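
The predicates mechanism (callables that know the record layout, as in the lambda example above) can be illustrated with plain tuples; the real IPredicate interface may differ:

```python
# Records as parsed from file names, here sketched as (cycle_number, pass_number).
records = [(1, 12), (1, 300), (2, 12), (3, 7)]

# Each predicate is knowledgeable about the record layout, as in the docs' example.
predicates = [
    lambda record: record[0] in [1, 4, 5],  # keep cycles 1, 4 and 5
    lambda record: record[1] < 100,         # keep low pass numbers
]

# A record is kept only if every predicate accepts it.
kept = [record for record in records if all(p(record) for p in predicates)]
print(kept)  # [(1, 12)]
```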

listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: str">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, tile: int | None = None, box: int | None = None, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, version: str)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval wraps around the discontinuity, it will be split into two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • tile – Tile size of the spectrum computation. Mandatory for the Extended subset

  • box – Box size of the spectrum computation. Mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As a Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As a Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

query(*, subset: ProductSubset, selected_variables: list[str] | None = None, tile: int | None = None, box: int | None = None, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, version: str)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval crosses the antimeridian, it will be split into two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • tile – Tile size of the spectrum computation. Mandatory for the Extended subset

  • box – Box size of the spectrum computation. Mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Returns:

A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
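The list/slice/integer filtering described for cycle_number and pass_number can be sketched in plain Python. The helper `matches_integer_field` is hypothetical and is not part of fcollections; it only illustrates the documented semantics:

```python
def matches_integer_field(reference, tested: int) -> bool:
    """Return True if the tested value passes the reference filter.

    The reference may be a list of accepted values, a slice interpreted
    as a half-open [start, stop) range, or a single integer.
    """
    if isinstance(reference, list):
        return tested in reference
    if isinstance(reference, slice):
        start = reference.start if reference.start is not None else float("-inf")
        stop = reference.stop if reference.stop is not None else float("inf")
        return start <= tested < stop
    return tested == reference

# A value is kept only if it is inside the list/slice or equal to the integer
assert matches_integer_field([1, 4, 5], 5)
assert matches_integer_field(slice(1, 100), 42)
assert not matches_integer_field(7, 5)
```

A query such as `query(cycle_number=slice(1, 100), pass_number=[1, 4, 5], ...)` applies this kind of test to each value parsed from the file names.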

reader: IFilesReader | None = <fcollections.implementations._readers.SwotReaderL3WW object>#

Files reader.

reading_parameters = {'box': <Parameter "box: 'int | None' = None">, 'preprocessor': <Parameter "preprocessor: 'tp.Callable[[xr.Dataset], xr.Dataset] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'subset': <Parameter "subset: 'ProductSubset'">, 'tile': <Parameter "tile: 'int | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

unmixer: SubsetsUnmixer | None = SubsetsUnmixer(partition_keys=['version', 'subset'], auto_pick_last=('version',))#

Specify how to interpret the file metadata table to unmix subsets.

variables_info(*, subset: ProductSubset, version: str)#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Parameters:
  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

fcollections.implementations.CMEMS_L4_SSHA_LAYOUT: Layout = <fcollections.core._listing.Layout object>#

Layout on CMEMS for the Level 4 SSHA gridded products

fcollections.implementations.CMEMS_OC_LAYOUT: Layout = <fcollections.core._listing.Layout object>#

Layout on CMEMS for the Level 3 and 4 ocean colour products

fcollections.implementations.CMEMS_SSHA_L3_LAYOUT: Layout = <fcollections.core._listing.Layout object>#

Layout on CMEMS for the Level 3 SSHA nadir products

fcollections.implementations.CMEMS_SST_LAYOUT: Layout = <fcollections.core._listing.Layout object>#

Layout on CMEMS for the SST_GLO_SST_L3S_NRT_OBSERVATIONS_010_010 product

fcollections.implementations.CMEMS_SWH_LAYOUT: Layout = <fcollections.core._listing.Layout object>#

Layout on CMEMS for the WAVE_GLO_PHY_SWH_L3_NRT_014_001 product

class fcollections.implementations.DataType(*values)[source]#

Bases: Enum

Dataset type.

ANFC = 4#

Analysis forecast.

HCST = 5#

Hindcast.

MY = 1#

Multi-year consistent time series.

MYINT = 2#

Interim data (about 1 month after the acquisition date).

MYNRT = 6#

Multi-year near real time.

NRT = 3#

Near real time products.

class fcollections.implementations.Delay(*values)[source]#

Bases: Enum

Delay definition for L3 and L4 sea level products.

DT = 2#

Delayed time.

NRT = 1#

Near real time.
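Enum fields such as Delay accept either the enum member itself or its equivalent string as a reference. A minimal sketch of that equivalence, assuming a string reference is matched against the member name (the `matches_enum_field` helper is hypothetical, not part of fcollections):

```python
from enum import Enum


class Delay(Enum):
    """Delay definition for L3 and L4 sea level products."""
    NRT = 1  # Near real time
    DT = 2   # Delayed time


def matches_enum_field(reference, tested: Delay) -> bool:
    """Accept the reference either as an enum member or as its name string."""
    if isinstance(reference, str):
        return tested.name == reference
    return tested is reference


# Both spellings of the reference select the same files
assert matches_enum_field("NRT", Delay.NRT)
assert matches_enum_field(Delay.DT, Delay.DT)
assert not matches_enum_field("DT", Delay.NRT)
```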

class fcollections.implementations.FileNameConventionDAC[source]#

Bases: FileNameConvention

class fcollections.implementations.FileNameConventionERA5[source]#

Bases: FileNameConvention

class fcollections.implementations.FileNameConventionGriddedSLA[source]#

Bases: FileNameConvention

Gridded SLA datafiles parser.

class fcollections.implementations.FileNameConventionGriddedSLAInternal[source]#

Bases: FileNameConvention

class fcollections.implementations.FileNameConventionL2Nadir[source]#

Bases: FileNameConvention

L2 Nadir datafiles parser.

class fcollections.implementations.FileNameConventionL3Nadir[source]#

Bases: FileNameConvention

L3 Nadir datafiles parser.

class fcollections.implementations.FileNameConventionMUR[source]#

Bases: FileNameConvention

class fcollections.implementations.FileNameConventionOC[source]#

Bases: FileNameConvention

Ocean Color datafiles parser.

class fcollections.implementations.FileNameConventionOHC[source]#

Bases: FileNameConvention

class fcollections.implementations.FileNameConventionS1AOWI[source]#

Bases: FileNameConvention

class fcollections.implementations.FileNameConventionSST[source]#

Bases: FileNameConvention

Sea Surface Temperature datafiles parser.

class fcollections.implementations.FileNameConventionSWH[source]#

Bases: FileNameConvention

class fcollections.implementations.FileNameConventionSwotL2[source]#

Bases: FileNameConvention

Swot LR L2 datafiles parser.

class fcollections.implementations.FileNameConventionSwotL3[source]#

Bases: FileNameConvention

Swot LR L3 datafiles parser.

class fcollections.implementations.FileNameConventionSwotL3WW[source]#

Bases: FileNameConvention

Swot L3_LR_WIND_WAVE product file names convention.

class fcollections.implementations.Group(*values)[source]#

Bases: Enum

Dataset group.

MOD = 2#

Model.

OBS = 1#

Observations.

class fcollections.implementations.L2Version(temporality: Timeliness | None = None, baseline: str | None = None, minor_version: int | None = None, product_counter: int | None = None, ignore_product_counter_in_eq_check: bool = False)[source]#

Bases: object

Represents an L2 version of half orbits and enables version comparison.

An L2 Version is parsed from a string in the format <CRID_version>_<product counter>.
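The split between the CRID part and the product counter can be sketched with a regular expression. Both the regex and the sample string "PIC0_01" are illustrative assumptions; the real CRID grammar handled by L2Version may differ:

```python
import re

# Hypothetical pattern for <CRID_version>_<product counter>
VERSION_RE = re.compile(r"^(?P<crid>[A-Za-z0-9]+)_(?P<counter>\d+)$")


def split_version(version: str):
    """Split a version string into its CRID part and integer product counter.

    Returns None when parsing fails, mirroring the library's choice of
    building a "null" version rather than raising.
    """
    match = VERSION_RE.match(version)
    if match is None:
        return None
    return match.group("crid"), int(match.group("counter"))


assert split_version("PIC0_01") == ("PIC0", 1)
assert split_version("not a version") is None
```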

baseline: str | None = None#
static from_bytes(version: bytes, ignore_product_counter_in_eq_check: bool = False) L2Version | None[source]#

Build a L2Version from bytes.

Parameters:
  • version – The CRID version from which we build the L2Version object.

  • ignore_product_counter_in_eq_check – Set L2Version.product_counter to None, as we do not want to check it in the comparison operations.

Returns:

The L2Version object.

Note

Even when an AttributeError occurs or the input value is None, an L2Version is built so that comparisons between np.arrays of L2Version do not fail with errors like: TypeError: ‘>’ not supported between instances of ‘NoneType’ and ‘NoneType’.

static from_bytes_array(versions: np_t.NDArray[bytes], ignore_product_counter_in_eq_check: bool = False) np_t.NDArray[object][source]#

Build a np.array of L2Version from an array of CRID versions as bytes.

Parameters:
  • versions – The array of CRID version from which we build the L2Version object.

  • ignore_product_counter_in_eq_check – Set each L2Version.product_counter to None, as we do not want to check it in the comparison operations.

Returns:

The array of L2Version objects.

static from_string(version: str, ignore_product_counter_in_eq_check: bool = False) L2Version | None[source]#

Build a L2Version from str.

Parameters:
  • version – The CRID version from which we build the L2Version object.

  • ignore_product_counter_in_eq_check – Set L2Version.product_counter to None, as we do not want to check it in the comparison operations.

Returns:

The L2Version object.

Note

Even when an AttributeError occurs or the input value is None, an L2Version is built so that comparisons between np.arrays of L2Version do not fail with errors like: TypeError: ‘>’ not supported between instances of ‘NoneType’ and ‘NoneType’.

static from_string_array(versions: np_t.NDArray[str], ignore_product_counter_in_eq_check: bool = False) np_t.NDArray[object][source]#

Build a np.array of L2Version from an array of CRID versions as str.

Parameters:
  • versions – The array of CRID version from which we build the L2Version object.

  • ignore_product_counter_in_eq_check – Set each L2Version.product_counter to None, as we do not want to check it in the comparison operations.

Returns:

The array of L2Version objects.

ignore_product_counter_in_eq_check: bool = False#
property is_null#

True if all attrs but ‘ignore_product_counter_in_eq_check’ are None.

minor_version: int | None = None#
product_counter: int | None = None#
temporality: Timeliness | None = None#
class fcollections.implementations.L2VersionField(name: str, ignore_product_counter: bool = False)[source]#

Bases: FileNameField

decode(input_string: str) L2Version[source]#

Decode an input string and generate a Generic[T] object.

Parameters:

input_string – The input string

Returns:

The decoded Generic[T] object

Raises:

DecodingError – If the input string decoding fails

encode(data: L2Version) str[source]#

Encode a Generic[T] object into a string.

Parameters:

data – The input Generic[T] object

Returns:

The encoded string

sanitize(reference: str | L2Version) L2Version[source]#

Cast to one of the types handled by this tester.

Parameters:

reference – The reference object to cast

Returns:

The input cast to the proper type

test(reference: L2Version, tested: L2Version) bool[source]#

Compare two objects of similar types.

Parameters:
  • reference – The reference object

  • tested – The tested object

Returns:

True if the test is successful, False otherwise

property test_description: str#

User-friendly description of the possible types for the reference.

property type: type[L2Version]#

Type of the tested field.

class fcollections.implementations.NetcdfFilesDatabaseDAC(path: Path, fs: fsspec.AbstractFileSystem = fs_loc.LocalFileSystem())[source]#

Bases: BasicNetcdfFilesDatabaseDAC

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, a deduplication operation is done, and if there are still duplicates, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the filenames, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
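The predicate mechanism above can be illustrated with plain tuples standing in for parsed filename records. The record layout used here (name, cycle, pass) is invented for the example:

```python
# Each record is a tuple parsed from a file name, e.g. (name, cycle, pass)
records = [
    ("file_a.nc", 1, 10),
    ("file_b.nc", 4, 20),
    ("file_c.nc", 7, 30),
]

# Predicates know the record layout: index 1 holds the cycle number here,
# matching the documented example lambda
predicates = [lambda record: record[1] in [1, 4, 5]]

# Keep only the records that satisfy every predicate
selected = [r for r in records if all(p(r) for p in predicates)]
assert [r[0] for r in selected] == ["file_a.nc", "file_b.nc"]
```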

listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude crosses the -180/180 longitude boundary, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude crosses the -180/180 longitude boundary, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
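The antimeridian handling described for the bbox parameter can be sketched as a standalone function. `split_bbox` is a hypothetical helper illustrating the documented behaviour, not fcollections' internal code:

```python
def split_bbox(bbox):
    """Split (lon_min, lat_min, lon_max, lat_max) at the antimeridian.

    Returns a single bbox when no crossing occurs, and two sub-boxes
    when lon_min > lon_max, i.e. the interval wraps around -180/180.
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    if lon_min <= lon_max:
        return [bbox]
    return [
        (lon_min, lat_min, 180.0, lat_max),
        (-180.0, lat_min, lon_max, lat_max),
    ]


# The documented example: [170, -170] selects [170, 180[ and [-180, -170]
assert split_bbox((170, -10, -170, 10)) == [
    (170, -10, 180.0, 10),
    (-180.0, -10, -170, 10),
]
assert split_bbox((-20, -10, 20, 10)) == [(-20, -10, 20, 10)]
```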

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseERA5(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read ERA5 reanalysis product Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#

Semantic describing how the files are organized.

Useful for extracting information and enabling efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, a deduplication operation is done, and if there are still duplicates, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the filenames, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

query(*, selected_variables: list[str] | None = None, time: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

A dataset containing the result of the query, or None if there is nothing matching the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.
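Sorting by sort_keys before reading can be sketched with dicts standing in for parsed filename records. The file names and the single 'time' key are illustrative; ISO 8601 timestamp strings sort correctly as plain strings:

```python
# Records as dicts of fields parsed from file names; sort_keys = ['time']
records = [
    {"path": "era5_2023.nc", "time": "2023-01-01T00:00:00"},
    {"path": "era5_2021.nc", "time": "2021-01-01T00:00:00"},
    {"path": "era5_2022.nc", "time": "2022-01-01T00:00:00"},
]

sort_keys = ["time"]
# ISO timestamps are lexicographically ordered, so string sort suffices
records.sort(key=lambda r: tuple(r[k] for k in sort_keys))

assert [r["path"] for r in records] == [
    "era5_2021.nc", "era5_2022.nc", "era5_2023.nc",
]
```

Ordering the records this way guarantees that a reader concatenating the files produces a monotonically increasing time axis.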

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseGriddedSLA(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseGriddedSLA

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, delay: Delay, time: Period, production_date: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, a deduplication operation is done, and if there are still duplicates, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the filenames, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'delay': <Parameter "delay: fcollections.implementations._definitions._constants.Delay">, 'production_date': <Parameter "production_date: numpy.datetime64">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, delay: Delay, time: Period, production_date: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude crosses the -180/180 longitude boundary, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, delay: Delay, time: Period, production_date: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude crosses the -180/180 longitude boundary, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period, datetime. The tested value from the file name will be filtered out if it is not included or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of string following with the numpy date formatting [%Y-%m-%dT%H:%M:%S])

Returns:

A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
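The antimeridian behaviour described for bbox above can be sketched in plain Python. This is a minimal illustration of the selection rule only; the lon_in_bbox helper is hypothetical and not part of fcollections:

```python
def lon_in_bbox(lon: float, lon_min: float, lon_max: float) -> bool:
    """Return True if a longitude (in [-180, 180[) falls inside the bbox
    longitude interval, handling a bbox that crosses the -180/180 meridian."""
    if lon_min <= lon_max:
        # Ordinary interval, e.g. [10, 20]
        return lon_min <= lon <= lon_max
    # Crossing interval, e.g. [170, -170] -> [170, 180[ union [-180, -170]
    return lon >= lon_min or lon <= lon_max
```

For instance, with the interval [170, -170], longitudes 175 and -175 are selected while 0 is not.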

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
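The Enum-field rule used by parameters such as delay (accepting either the enum member or its equivalent string) can be sketched as follows. The Delay enum here is a stand-in defined locally with the two values documented above, not an import from fcollections:

```python
from enum import Enum


class Delay(Enum):
    """Stand-in for the library's Delay enum with its documented values."""
    NRT = "NRT"
    DT = "DT"


def matches_enum(tested: Delay, reference) -> bool:
    """Keep a file whose tested enum value equals the reference,
    given either as an enum member or as its equivalent string."""
    if isinstance(reference, str):
        reference = Delay(reference)
    return tested == reference
```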

class fcollections.implementations.NetcdfFilesDatabaseL2Nadir(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseL2Nadir

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick is also performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, a deduplication operation is performed and, if there are still duplicates, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the filenames, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
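The Integer-field rule shared by cycle_number and pass_number (a reference given as a list, a slice or a single integer) can be sketched with the hypothetical helper below. It assumes Python's usual exclusive-stop slice convention, which may differ from the library's actual behaviour:

```python
def matches_integer(tested: int, reference) -> bool:
    """Keep a tested value that is inside the reference list/slice
    or equal to the reference integer (assumed exclusive slice stop)."""
    if isinstance(reference, slice):
        step = reference.step or 1
        return tested in range(reference.start, reference.stop, step)
    if isinstance(reference, list):
        return tested in reference
    return tested == reference
```

For example, cycle_number=slice(10, 20) would keep cycles 10 through 19, while cycle_number=[1, 4, 5] keeps only those three values.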
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. the longitude interval [170, -170] retrieves data in [170, 180[ and [-180, -170])

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format

Raises:
query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. the longitude interval [170, -170] retrieves data in [170, 180[ and [-180, -170])

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format

Returns:

A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseL3Nadir(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseL3Nadir

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick is also performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, a deduplication operation is performed and, if there are still duplicates, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the filenames, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, so there can be multiple files for the same period, each with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it is not included in the reference Period or not equal to the reference datetime. The reference value can be given as a string or tuple of strings following the numpy date format %Y-%m-%dT%H:%M:%S

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'delay': <Parameter "delay: fcollections.implementations._definitions._constants.Delay">, 'product_level': <Parameter "product_level: fcollections.implementations._definitions._constants.ProductLevel">, 'production_date': <Parameter "production_date: numpy.datetime64">, 'resolution': <Parameter "resolution: list[int] | slice | int">, 'sensor': <Parameter "sensor: fcollections.implementations._definitions._cmems.Sensors">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
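The Period-field rule for time (keep files whose period intersects a reference Period or contains a reference datetime) can be illustrated with a minimal stand-in Period class. This is a local sketch, not fcollections' own Period type:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Period:
    """Minimal stand-in: a closed time interval [start, end]."""
    start: datetime
    end: datetime

    def intersects(self, other: "Period") -> bool:
        return self.start <= other.end and other.start <= self.end

    def contains(self, instant: datetime) -> bool:
        return self.start <= instant <= self.end


def keep_file(file_period: Period, reference) -> bool:
    """Keep the file if its period intersects a reference Period
    or contains a reference datetime."""
    if isinstance(reference, Period):
        return file_period.intersects(reference)
    return file_period.contains(reference)
```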
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. the longitude interval [170, -170] retrieves data in [170, 180[ and [-180, -170])

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, so there can be multiple files for the same period, each with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it is not included in the reference Period or not equal to the reference datetime. The reference value can be given as a string or tuple of strings following the numpy date format %Y-%m-%dT%H:%M:%S

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

Raises:
query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. the longitude interval [170, -170] retrieves data in [170, 180[ and [-180, -170])

  • delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, so there can be multiple files for the same period, each with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it is not included in the reference Period or not equal to the reference datetime. The reference value can be given as a string or tuple of strings following the numpy date format %Y-%m-%dT%H:%M:%S

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

Returns:

A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
variables_info(*, sensor: Sensors, resolution: list[int] | slice | int)#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Parameters:
  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered using a reference value, which can be a list, a slice or an integer. The tested value from the file name is filtered out if it is outside the given list/slice or not equal to the integer value.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseMUR(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseMUR

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick is also performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, a deduplication operation is performed and, if there are still duplicates, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the filenames, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it is not included in the reference Period or not equal to the reference datetime. The reference value can be given as a string or tuple of strings following the numpy date format %Y-%m-%dT%H:%M:%S

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
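The DateTime-field rule (used by time here and by production_date above: keep a tested value included in a reference Period or equal to a reference datetime) can be sketched as follows. The (start, end) tuple is a local stand-in for fcollections' Period type:

```python
from datetime import datetime


def keep_datetime(tested: datetime, reference) -> bool:
    """Keep a tested datetime that lies inside a reference (start, end)
    period, or that equals a reference datetime exactly."""
    if isinstance(reference, tuple):
        # A (start, end) pair standing in for the library's Period
        start, end = reference
        return start <= tested <= end
    return tested == reference
```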
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. the longitude interval [170, -170] retrieves data in [170, 180[ and [-180, -170])

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it is not included in the reference Period or not equal to the reference datetime. The reference value can be given as a string or tuple of strings following the numpy date format %Y-%m-%dT%H:%M:%S

Raises:
query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. the longitude interval [170, -170] retrieves data in [170, 180[ and [-180, -170])

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it is not included in the reference Period or not equal to the reference datetime. The reference value can be given as a string or tuple of strings following the numpy date format %Y-%m-%dT%H:%M:%S

Returns:

A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseOC(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseOC

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick is also performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, a deduplication operation is performed and, if there are still duplicates, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the filenames, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name is filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format

  • origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]

  • group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]

  • pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]

  • area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]

  • thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]

  • variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]

  • type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]

  • level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)

  • typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]

  • version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
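The `predicates` mechanism described above can be illustrated with a minimal, self-contained sketch. The record layout here is hypothetical (real records depend on the file name convention of the database class); only the filtering pattern mirrors the documented example `lambda record: record[1] in [1, 4, 5]`:

```python
# Hypothetical parsed records: (file name, cycle number, pass number).
# Real records are produced by the database's file name convention.
records = [
    ("f1.nc", 1, 10),
    ("f2.nc", 2, 20),
    ("f3.nc", 4, 30),
]

# A predicate sees the whole parsed record and returns True to keep it.
keep_cycles = lambda record: record[1] in [1, 4, 5]

selected = [r for r in records if keep_cycles(r)]
print([r[0] for r in selected])  # → ['f1.nc', 'f3.nc']
```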

listing_parameters = {'area': <Parameter "area: fcollections.implementations._definitions._cmems.Area">, 'group': <Parameter "group: fcollections.implementations._definitions._cmems.Group">, 'level': <Parameter "level: str">, 'origin': <Parameter "origin: fcollections.implementations._definitions._cmems.Origin">, 'pc': <Parameter "pc: fcollections.implementations._definitions._cmems.ProductClass">, 'sensor': <Parameter "sensor: fcollections.implementations._definitions._cmems.Sensors">, 'spatial_resolution': <Parameter "spatial_resolution: str">, 'temporal_resolution': <Parameter "temporal_resolution: fcollections.time.ISODuration">, 'thematic': <Parameter "thematic: fcollections.implementations._definitions._cmems.Thematic">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'type': <Parameter "type: fcollections.implementations._definitions._cmems.DataType">, 'typology': <Parameter "typology: fcollections.implementations._definitions._cmems.Typology">, 'variable': <Parameter "variable: fcollections.implementations._definitions._cmems.Variable">, 'version': <Parameter "version: str">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#

Map a function over datasets extracted from the files.

Parameters:
  • func – Callable that works on an xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval crosses the -180/180 discontinuity, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]

  • group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]

  • pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]

  • area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]

  • thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]

  • variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]

  • type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]

  • level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)

  • typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]

  • version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval crosses the -180/180 discontinuity, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]

  • group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]

  • pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]

  • area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]

  • thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]

  • variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]

  • type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]

  • level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

  • temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)

  • typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]

  • version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
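The antimeridian handling of the bbox argument can be sketched with a small stand-in function. This is not the library’s actual subsetting code, only an illustration of the documented behaviour for a longitude interval such as [170, -170]:

```python
def lon_in_bbox(lon: float, lon_min: float, lon_max: float) -> bool:
    """Check whether a longitude falls inside a bbox whose longitude
    interval may cross the -180/180 discontinuity (e.g. [170, -170])."""
    # Normalize all longitudes to [-180, 180).
    norm = lambda x: ((x + 180.0) % 360.0) - 180.0
    lon, lon_min, lon_max = norm(lon), norm(lon_min), norm(lon_max)
    if lon_min <= lon_max:
        return lon_min <= lon <= lon_max
    # Crossing case: keep [lon_min, 180) and [-180, lon_max].
    return lon >= lon_min or lon <= lon_max

print(lon_in_bbox(175.0, 170, -170))   # → True
print(lon_in_bbox(-175.0, 170, -170))  # → True
print(lon_in_bbox(0.0, 170, -170))     # → False
```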

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseOHC(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseOHC

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed by the filename. ex. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
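The time filtering described above can be sketched with the standard library. The file names and parsing below are hypothetical (the real convention is handled by the database class); only the inclusion test against a reference period mirrors the documented semantics:

```python
from datetime import datetime

# Hypothetical mapping of file names to the timestamp parsed from them.
files = {
    "ohc_20230101.nc": datetime(2023, 1, 1),
    "ohc_20230215.nc": datetime(2023, 2, 15),
    "ohc_20230401.nc": datetime(2023, 4, 1),
}

def select(files, start: str, end: str):
    """Keep files whose parsed timestamp is included in [start, end],
    with bounds given in %Y-%m-%dT%H:%M:%S format."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    lo, hi = datetime.strptime(start, fmt), datetime.strptime(end, fmt)
    return sorted(name for name, t in files.items() if lo <= t <= hi)

print(select(files, "2023-01-01T00:00:00", "2023-03-01T00:00:00"))
# → ['ohc_20230101.nc', 'ohc_20230215.nc']
```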

listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#

Map a function over datasets extracted from the files.

Parameters:
  • func – Callable that works on an xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval crosses the -180/180 discontinuity, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval crosses the -180/180 discontinuity, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseS1AOWI(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: FilesDatabase, PeriodMixin

Database mapping to select and read S1A Ocean surface wind product Netcdf files in a local file system.

layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#

Semantics describing how the files are organized.

Useful to extract information and enable efficient file system scanning. The pre-configured layouts may not match the current file organization, in which case the user can build their own or set enable_layouts to False.

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, acquisition_mode: AcquisitionMode, slice_post_processing: S1AOWISlicePostProcessing, time: Period, resolution: list[int] | slice | int, orbit: list[int] | slice | int, product_type: S1AOWIProductType)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed by the filename. ex. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • acquisition_mode – Acquisition mode. As an Enum field, it can be filtered using a reference <enum ‘AcquisitionMode’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘IW’, ‘EW’, ‘WV’, ‘SM’]

  • slice_post_processing – Slices post-processing. As an Enum field, it can be filtered using a reference <enum ‘S1AOWISlicePostProcessing’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CC’, ‘CM’, ‘OCN’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • resolution – SAR Ocean surface wind Level-2 product resolution. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • orbit – Orbit number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • product_type – Product type. As an Enum field, it can be filtered using a reference <enum ‘S1AOWIProductType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SW’, ‘GS’]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
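The Integer-field semantics used by `resolution` and `orbit` (a list, a slice, or a single integer as reference value) can be sketched as follows. This is an illustrative stand-in, not the library’s implementation; in particular, treating slice bounds as half-open is an assumption:

```python
def match_int(value: int, ref) -> bool:
    """Return True when `value` satisfies an Integer-field reference:
    membership for a list, range for a slice (assumed half-open here),
    equality for a plain int."""
    if isinstance(ref, list):
        return value in ref
    if isinstance(ref, slice):
        lo = ref.start if ref.start is not None else float("-inf")
        hi = ref.stop if ref.stop is not None else float("inf")
        return lo <= value < hi
    return value == ref

orbits = [10, 25, 42, 77]
print([o for o in orbits if match_int(o, [10, 77])])       # → [10, 77]
print([o for o in orbits if match_int(o, slice(20, 50))])  # → [25, 42]
print([o for o in orbits if match_int(o, 42)])             # → [42]
```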

listing_parameters = {'acquisition_mode': <Parameter "acquisition_mode: fcollections.implementations._s1aowi.AcquisitionMode">, 'orbit': <Parameter "orbit: list[int] | slice | int">, 'product_type': <Parameter "product_type: fcollections.implementations._s1aowi.S1AOWIProductType">, 'resolution': <Parameter "resolution: list[int] | slice | int">, 'slice_post_processing': <Parameter "slice_post_processing: fcollections.implementations._s1aowi.S1AOWISlicePostProcessing">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, acquisition_mode: AcquisitionMode, slice_post_processing: S1AOWISlicePostProcessing, time: Period, resolution: list[int] | slice | int, orbit: list[int] | slice | int, product_type: S1AOWIProductType)#

Map a function over datasets extracted from the files.

Parameters:
  • func – Callable that works on an xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • acquisition_mode – Acquisition mode. As an Enum field, it can be filtered using a reference <enum ‘AcquisitionMode’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘IW’, ‘EW’, ‘WV’, ‘SM’]

  • slice_post_processing – Slices post-processing. As an Enum field, it can be filtered using a reference <enum ‘S1AOWISlicePostProcessing’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CC’, ‘CM’, ‘OCN’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • resolution – SAR Ocean surface wind Level-2 product resolution. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • orbit – Orbit number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • product_type – Product type. As an Enum field, it can be filtered using a reference <enum ‘S1AOWIProductType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SW’, ‘GS’]

query(*, selected_variables: list[str] | None = None, acquisition_mode: AcquisitionMode, slice_post_processing: S1AOWISlicePostProcessing, time: Period, resolution: list[int] | slice | int, orbit: list[int] | slice | int, product_type: S1AOWIProductType)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • acquisition_mode – Acquisition mode. As an Enum field, it can be filtered using a reference <enum ‘AcquisitionMode’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘IW’, ‘EW’, ‘WV’, ‘SM’]

  • slice_post_processing – Slices post-processing. As an Enum field, it can be filtered using a reference <enum ‘S1AOWISlicePostProcessing’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CC’, ‘CM’, ‘OCN’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • resolution – SAR Ocean surface wind Level-2 product resolution. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • orbit – Orbit number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • product_type – Product type. As an Enum field, it can be filtered using a reference <enum ‘S1AOWIProductType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SW’, ‘GS’]

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table
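
The Period filtering semantics described for the time parameter can be sketched as follows. This is an illustrative snippet, not fcollections code; `matches_time` and its arguments are hypothetical names:

```python
from datetime import datetime

def matches_time(file_start, file_end, reference):
    """Return True when a file's period passes the `time` filter.

    A single reference datetime must fall inside the file period;
    a reference period (start, end) must intersect it.
    """
    if isinstance(reference, datetime):
        return file_start <= reference <= file_end
    ref_start, ref_end = reference
    # Two periods intersect when neither ends before the other starts.
    return file_start <= ref_end and ref_start <= file_end

# A file covering 2023-04-01 to 2023-04-02:
start = datetime(2023, 4, 1)
end = datetime(2023, 4, 2)
```

The same intersection test explains why a reference period only partially overlapping a file's span still selects that file.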

reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#

Files reader.

reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
sort_keys: list[str] | str | None = 'time'#

Keys that specify the fields used to sort the records extracted from the filenames.

Useful to order the files prior to reading them.

variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseSST(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseSST

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the file names. ex. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the -180/180 longitude discontinuity, data around the crossing and matching the bbox will be selected. (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:

query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the -180/180 longitude discontinuity, data around the crossing and matching the bbox will be selected. (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table
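
The bbox handling described above, where a longitude interval crossing the -180/180 discontinuity selects data on both sides of the crossing, can be sketched as a split into two sub-boxes. This is illustrative code only; `split_bbox` is a hypothetical helper, not part of fcollections:

```python
def split_bbox(bbox):
    """Split (lon_min, lat_min, lon_max, lat_max) at the antimeridian.

    Returns one box when the interval does not cross -180/180,
    otherwise two boxes covering each side of the crossing.
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    if lon_min <= lon_max:
        return [bbox]  # no crossing: keep a single box
    # e.g. longitude interval [170, -170] -> [170, 180] and [-180, -170]
    return [
        (lon_min, lat_min, 180.0, lat_max),
        (-180.0, lat_min, lon_max, lat_max),
    ]
```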

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseSWH(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseSWH

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, sensorf: Sensors, time: Period, production_date: datetime64)#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the file names. ex. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
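
The interplay of the deduplicate flag and the production_date field, where the same granule may exist with several production dates, can be sketched as keeping the most recent production per granule. Illustrative code; `deduplicate_latest` and the record shape are hypothetical:

```python
def deduplicate_latest(records):
    """Keep the latest production date per granule.

    records: iterable of (granule_key, production_date) tuples, where
    production dates compare chronologically (e.g. ISO-like strings).
    """
    latest = {}
    for key, production_date in records:
        if key not in latest or production_date > latest[key]:
            latest[key] = production_date  # newer regeneration wins
    return latest
```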

listing_parameters = {'production_date': <Parameter "production_date: numpy.datetime64">, 'sensorf': <Parameter "sensorf: fcollections.implementations._definitions._cmems.Sensors">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, sensorf: Sensors, time: Period, production_date: datetime64)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the -180/180 longitude discontinuity, data around the crossing and matching the bbox will be selected. (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Raises:

query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, sensorf: Sensors, time: Period, production_date: datetime64)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – Variables that need to be read. Set to None to read everything

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the -180/180 longitude discontinuity, data around the crossing and matching the bbox will be selected. (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
variables_info()#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseSwotLRL2(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseSwotLRL2

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, subset: ProductSubset, version: L2Version, bbox: tuple[float, float, float, float])#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised

  • predicates – Additional complex filters to run on the records parsed from the file names. ex. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed with the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (ex. PIC0). The product counter is a number that is increased when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some attributes left as None. In this case, the check will be performed on the non-None attributes only.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
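
The Integer-field filtering described for cycle_number and pass_number, where the reference is a list, a slice or a single integer, can be sketched as follows. Illustrative code only; `matches_int` is a hypothetical helper, and the slice is assumed to behave like a Python half-open range:

```python
def matches_int(value, reference):
    """Return True when `value` passes an Integer-field filter."""
    if isinstance(reference, slice):
        # Assumed semantics: [start, stop) like a Python range.
        start = reference.start if reference.start is not None else value
        stop = reference.stop if reference.stop is not None else value + 1
        return start <= value < stop
    if isinstance(reference, list):
        return value in reference
    # Otherwise the reference is a single integer.
    return value == reference
```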

listing_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float]'">, 'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'level': <Parameter "level: fcollections.implementations._definitions._constants.ProductLevel">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: fcollections.implementations._l2_lr_ssh.L2Version">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: L2Version)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude interval crosses the -180/180 discontinuity, it will be split in two sub-boxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting this to False together with right_swath=False disables swath reading for Expert and Basic datasets

  • right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting this to False together with left_swath=False disables swath reading for Expert and Basic datasets

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed with the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (ex. PIC0). The product counter is a number that is increased when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some attributes left as None. In this case, the check will be performed on the non-None attributes only.

Raises:

predicate_classes: list[type[IPredicate]] | None = [<class 'fcollections.implementations.optional._predicates.SwotGeometryPredicate'>]#

List of predicates that are built at each query.

The predicates intercept the input parameters to build a custom record predicate. Usually, it is a complex test involving auxiliary data, such as ground track footprints or half_orbit/periods tables.
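
A record predicate, as in the predicates parameter example above, is a plain callable applied to each record parsed from a file name. A minimal illustrative sketch with hypothetical records:

```python
# The predicate receives a parsed record and returns whether to keep it.
# Here record[1] plays the role of some integer field (e.g. a pass number).
predicate = lambda record: record[1] in [1, 4, 5]

# Hypothetical records as (file_name, field_value) tuples:
records = [("file_a.nc", 1), ("file_b.nc", 2), ("file_c.nc", 4)]
kept = [r for r in records if predicate(r)]
```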

query(*, subset: ProductSubset, selected_variables: list[str] | None = None, stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: L2Version)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude interval crosses the -180/180 discontinuity, it will be split in two sub-boxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting this to False together with right_swath=False disables swath reading for Expert and Basic datasets

  • right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting this to False together with left_swath=False disables swath reading for Expert and Basic datasets

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed with the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (ex. PIC0). The product counter is a number that is increased when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some attributes left as None. In this case, the check will be performed on the non-None attributes only.

Returns:

  A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoSwotReaderL2LRSSH object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'left_swath': <Parameter "left_swath: 'bool' = True">, 'right_swath': <Parameter "right_swath: 'bool' = False">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'stack': <Parameter "stack: 'StackLevel | str' = <StackLevel.NOSTACK: 1>">, 'subset': <Parameter "subset: 'ProductSubset'">}#
variables_info(*, level: ProductLevel, subset: ProductSubset)#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Parameters:
  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseSwotLRL3(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseSwotLRL3

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, subset: ProductSubset, version: str, bbox: tuple[float, float, float, float])#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. If the auto pick cannot isolate a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the file name, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
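The list/slice/integer filtering rule described above for cycle_number and pass_number can be illustrated with a small stand-alone predicate. `keep_integer` is a hypothetical helper mirroring the documented semantics (assuming the slice has explicit start and stop), not the library's internal implementation:

```python
# Illustrative sketch of the Integer-field filtering rule: a tested value
# is kept if it is inside the reference list/slice, or equal to the
# reference integer.
def keep_integer(value: int, reference) -> bool:
    if isinstance(reference, slice):
        # Assumes start and stop are given; a missing step defaults to 1.
        step = reference.step or 1
        return value in range(reference.start, reference.stop, step)
    if isinstance(reference, list):
        return value in reference
    return value == reference
```

For example, `keep_integer(5, slice(1, 10))` keeps the value, while `keep_integer(3, [1, 4, 5])` filters it out.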

listing_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float]'">, 'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'level': <Parameter "level: fcollections.implementations._definitions._constants.ProductLevel">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: str">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: str)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval crosses the anti-meridian, it will be split in two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

  • nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to False

  • swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to True

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

predicate_classes: list[type[IPredicate]] | None = [<class 'fcollections.implementations.optional._predicates.SwotGeometryPredicate'>]#

List of predicates that are built at each query.

The predicates intercept the input parameters to build a custom record predicate. Usually, this is a complex test involving auxiliary data, such as ground track footprints or half_orbit/periods tables.

query(*, subset: ProductSubset, selected_variables: list[str] | None = None, stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: str)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval crosses the anti-meridian, it will be split in two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

  • nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to False

  • swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to True

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Returns:

  • A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
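The bbox splitting behaviour described above (a longitude interval that crosses the anti-meridian is split in two sub-boxes) can be sketched as a stand-alone helper. `split_bbox` is hypothetical and only mirrors the documented semantics:

```python
# Sketch of the anti-meridian handling for ``bbox``: when lon_min > lon_max
# the box wraps around the dateline and is split into two sub-boxes,
# e.g. longitudes (170, -170) -> [170, 180] and [-180, -170].
def split_bbox(bbox):
    lon_min, lat_min, lon_max, lat_max = bbox
    if lon_min <= lon_max:
        # No wrap-around: the box can be used as-is.
        return [bbox]
    return [
        (lon_min, lat_min, 180.0, lat_max),
        (-180.0, lat_min, lon_max, lat_max),
    ]
```

Selecting data then amounts to querying each sub-box and concatenating the results.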

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoSwotReaderL3LRSSH object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'nadir': <Parameter "nadir: 'bool' = False">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'stack': <Parameter "stack: 'str | StackLevel' = <StackLevel.NOSTACK: 1>">, 'subset': <Parameter "subset: 'ProductSubset'">, 'swath': <Parameter "swath: 'bool' = True">}#
variables_info(*, subset: ProductSubset, version: str)#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Parameters:
  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

class fcollections.implementations.NetcdfFilesDatabaseSwotLRWW(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#

Bases: BasicNetcdfFilesDatabaseSwotLRWW

list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, subset: ProductSubset, version: str, bbox: tuple[float, float, float, float])#

List the files matching the given criteria.

Parameters:
  • sort – Sort the results using the sort_keys attribute of this class

  • deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur

  • unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. If the auto pick cannot isolate a unique subset, an error is raised

  • predicates – Additional complex filters to run on the record parsed from the file name, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention

  • stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table

  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
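The Period filtering rule quoted above (intersection with a reference Period, containment of a reference datetime) can be illustrated with plain (start, end) tuples. `keep_period` is a hypothetical sketch; the real Period type in fcollections.time is richer:

```python
from datetime import datetime

# Sketch of the Period-field filtering rule: a file's period is kept if it
# intersects a reference period, or contains a reference datetime. Periods
# are modelled here as simple (start, end) tuples of datetimes.
def keep_period(tested, reference):
    start, end = tested
    if isinstance(reference, datetime):
        # A single datetime must fall inside the tested period.
        return start <= reference <= end
    # Two closed intervals intersect if each one starts before the other ends.
    ref_start, ref_end = reference
    return start <= ref_end and ref_start <= end
```

The string and tuple-of-strings forms accepted by the library would be parsed with the [%Y-%m-%dT%H:%M:%S] format before applying this test.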

listing_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float]'">, 'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: str">}#
map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, tile: int | None = None, box: int | None = None, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, version: str)#

Map a function over dataset extracted from the files.

Parameters:
  • func – Callable that works on a xarray dataset.

  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval crosses the anti-meridian, it will be split in two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • tile – Tile size of the spectrum computation. Mandatory for the Extended subset

  • box – Box size of the spectrum computation. Mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

predicate_classes: list[type[IPredicate]] | None = [<class 'fcollections.implementations.optional._predicates.SwotGeometryPredicate'>]#

List of predicates that are built at each query.

The predicates intercept the input parameters to build a custom record predicate. Usually, this is a complex test involving auxiliary data, such as ground track footprints or half_orbit/periods tables.

query(*, subset: ProductSubset, selected_variables: list[str] | None = None, tile: int | None = None, box: int | None = None, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, version: str)#

Query a dataset by reading selected files in file system.

Parameters:
  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval crosses the anti-meridian, it will be split in two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • tile – Tile size of the spectrum computation. Mandatory for the Extended subset

  • box – Box size of the spectrum computation. Mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension

  • cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.

  • time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Returns:

  • A dataset containing the result of the query, or None if nothing matches the query

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table

reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoSwotReaderL3WW object>#

Files reader.

reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'box': <Parameter "box: 'int | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'subset': <Parameter "subset: 'ProductSubset'">, 'tile': <Parameter "tile: 'int | None' = None">}#
variables_info(*, subset: ProductSubset, version: str)#

Returns the variables metadata.

Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError

Parameters:
  • subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]

  • version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.

Raises:
  • LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected

  • ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
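The tri-number x.y.z versioning described above orders naturally when parsed into an integer tuple, which is how version strings can be compared without lexicographic pitfalls ("1.2.10" > "1.2.9"). `parse_version` is an illustrative helper, not part of the public API:

```python
# Sketch of the x.y.z versioning scheme, where "x" is a major change,
# "y" a minor change and "z" a fix. Tuples of ints compare element-wise,
# giving the expected ordering.
def parse_version(version: str) -> tuple:
    major, minor, fix = (int(part) for part in version.split("."))
    return (major, minor, fix)
```

A plain string comparison would rank "1.2.9" above "1.2.10"; the tuple form does not.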

class fcollections.implementations.Origin(*values)[source]#

Bases: Enum

Dataset origin.

C3S = 2#

C3S.

CCI = 3#
CMEMS = 1#

Copernicus Marine.

OSISAF = 4#

OSISAF.

class fcollections.implementations.ProductClass(*values)[source]#

Bases: Enum

Dataset product class.

INS = 8#

In-situ.

MOB = 7#

Multi observations.

OC = 3#

Ocean Colour Thematic Assembly Center.

SI = 4#

Sea Ice.

SL = 2#

Sea Level Thematic Assembly Center.

SST = 1#

Sea Surface Temperature Thematic Assembly Center.

WAVE = 6#

Wave.

WIND = 5#

Wind.

class fcollections.implementations.ProductLevel(*values)[source]#

Bases: Enum

Product level.

L2 = 1#

Level-2 products.

L3 = 2#

Level-3 products.

L4 = 3#

Level-4 products.

class fcollections.implementations.ProductSubset(*values)[source]#

Bases: Enum

Swot product subset enum.

Basic = 1#

Basic subset for L2_LR_SSH and L3_LR_SSH products.

Expert = 2#

Expert subset for L2_LR_SSH and L3_LR_SSH products.

Expert subset contains all of the Basic subset fields.

Extended = 7#

Extended subset for L3_LR_WIND_WAVE product.

Extended subset contains all of the Light subset data.

Light = 6#

Light subset for L3_LR_WIND_WAVE product.

Technical = 5#

Technical subset for L3_LR_SSH product.

Contains additional fields such as alternative corrections to be used by experts.

Unsmoothed = 4#

Unsmoothed subset for L2_LR_SSH and L3_LR_SSH products.

WindWave = 3#

WindWave subset for L2_LR_SSH product.

class fcollections.implementations.S1AOWIProductType(*values)[source]#

Bases: Enum

GS = 2#
SW = 1#
class fcollections.implementations.S1AOWISlicePostProcessing(*values)[source]#

Bases: Enum

CC = 1#
CM = 2#
OCN = 3#
class fcollections.implementations.Sensors(*values)[source]#

Bases: Enum

Aggregation of sensors for multiple CMEMS products.

  • SEALEVEL_GLO_PHY_L3_MY_008_062

  • SEALEVEL_GLO_PHY_L3_NRT_008_044

  • SEALEVEL_GLO_PHY_L4_NRT_008_046

  • SEALEVEL_GLO_PHY_L4_MY_008_047

  • WAVE_GLO_PHY_SWH_L3_NRT_014_001

  • SST_GLO_SST_L3S_NRT_OBSERVATIONS_010_010

  • OCEANCOLOUR_GLO_BGC_L3_MY_009_103

AL = 21#
ALG = 22#
ALLSAT = 32#
ALLSAT_SWOS = 34#
C2 = 1#
C2N = 2#
CFO = 35#
DEMO_ALLSAT_SWOTS = 33#
E1 = 5#
E1G = 6#
E2 = 7#
EN = 3#
ENN = 4#
G2 = 8#
GIR = 38#
H2A = 9#
H2AG = 10#
H2B = 11#
H2C = 36#
J1 = 12#
J1G = 13#
J1N = 14#
J2 = 15#
J2G = 17#
J2N = 16#
J3 = 18#
J3G = 20#
J3N = 19#
MULTI = 42#
OLCI = 41#
PIR = 39#
PMW = 40#
S3A = 23#
S3B = 24#
S6A = 25#
S6A_HR = 27#
S6A_LR = 26#
SWON = 28#
SWONC = 29#
SWOT = 37#
TP = 30#
TPN = 31#
class fcollections.implementations.StackLevel(*values)[source]#

Bases: Enum

Stack level for swath half orbits on reference grid.

Swath half orbits on a reference grid are by definition sampled at the same locations for each cycle. This means we can split the temporal dimension num_lines into one or two other dimensions: cycle_number and pass_number.

CYCLES = 2#
CYCLES_PASSES = 3#
NOSTACK = 1#

No stack: the dataset is returned as (num_lines, num_pixels).
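The num_lines split behind these stack levels can be sketched with numpy. The shapes below are illustrative only (69 pixels per line is assumed for the example); the readers handle real grids and missing half orbits:

```python
import numpy as np

# Hypothetical along-track variable on a reference grid:
# 2 cycles x 3 passes x 50 lines per pass, flattened along num_lines.
num_cycles, num_passes, lines_per_pass, num_pixels = 2, 3, 50, 69
flat = np.zeros((num_cycles * num_passes * lines_per_pass, num_pixels))

# NOSTACK: data stays as (num_lines, num_pixels).
assert flat.shape == (300, 69)

# CYCLES: num_lines becomes (cycle_number, num_lines_per_cycle).
cycles = flat.reshape(num_cycles, num_passes * lines_per_pass, num_pixels)

# CYCLES_PASSES: num_lines becomes (cycle_number, pass_number, num_lines).
cycles_passes = flat.reshape(num_cycles, num_passes, lines_per_pass, num_pixels)
```

The reshape is only valid because the reference grid guarantees identical sampling across cycles, which is exactly why the Unsmoothed subset cannot be stacked.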

class fcollections.implementations.SwotPhases(*values)[source]#

Bases: Enum

Swot mission phases definitions.

CALVAL = 1#

1-day repeat orbit, sparse geographical coverage.

SCIENCE = 2#

21-day repeat orbit, quasi full geographical coverage.

class fcollections.implementations.SwotReaderL2LRSSH(xarray_options: dict[str, str] | None = None)[source]#

Bases: OpenMfDataset

Reader for SWOT KaRIn L2_LR_SSH products.

expected_coords: set[str] = {'latitude', 'longitude', 'time'}#

Variables we want to set as coordinates in the output dataset

read(subset: ProductSubset, files: list[str], selected_variables: list[str] | None = None, fs: AbstractFileSystem = fs_loc.LocalFileSystem(), stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, preprocessor: Callable[[Dataset], Dataset] | None = None) Dataset[source]#

Read a dataset from L2_LR_SSH products.

Parameters:
  • files – list of files to open. At least one file should be given

  • fs – File systems hosting the files

  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Product dataset (Basic, Expert, WindWave or Unsmoothed)

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting this to False together with right_swath=False disables swath reading for Expert and Basic datasets

  • right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting this to False together with left_swath=False disables swath reading for Expert and Basic datasets

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

Raises:
  • ValueError – If the input list of files is empty

  • ValueError – If the input stack parameter does not match a valid StackLevel

Returns:

An xarray dataset containing the data from the input files
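The bbox circularity handling described in the parameters above can be sketched as follows. `split_bbox` is a hypothetical helper, not the reader's actual implementation:

```python
def split_bbox(bbox):
    """Split a (lon_min, lat_min, lon_max, lat_max) bounding box that crosses
    the longitude circularity into two sub-boxes; otherwise return it as-is."""
    lon_min, lat_min, lon_max, lat_max = bbox
    if lon_min <= lon_max:
        # No crossing: a single box is enough.
        return [bbox]
    # Crossing case, e.g. [170, -170] -> [170, 180[ and [-180, -170].
    return [
        (lon_min, lat_min, 180.0, lat_max),
        (-180.0, lat_min, lon_max, lat_max),
    ]
```

Selecting data then amounts to retrieving and concatenating the points falling in each sub-box.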

class fcollections.implementations.SwotReaderL3LRSSH(xarray_options: dict[str, str] | None = None)[source]#

Bases: OpenMfDataset

Reader for SWOT KaRIn L3_LR_SSH products.

clipped_ssha: set[str] = {'ssha_filtered', 'ssha_noiseless', 'ssha_unedited', 'ssha_unfiltered'}#

SSHA Variables with nadir data clipped in it. Should cover both present and older versions of the product

expected_coords: set[str] = {'latitude', 'longitude', 'time'}#

Variables we want to set as coordinates in the output dataset

read(subset: ProductSubset, files: list[str], selected_variables: list[str] | None = None, fs: AbstractFileSystem = fs_loc.LocalFileSystem(), stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, preprocessor: Callable[[Dataset], Dataset] | None = None) Dataset[source]#

Read a dataset from L3_LR_SSH products.

Parameters:
  • files – list of files to open. At least one file should be given

  • fs – File systems hosting the files

  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Product dataset (Basic, Expert, Technical or Unsmoothed)

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK

  • nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped into the swath. Defaults to False

  • swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped into the swath. Defaults to True

Raises:
  • ValueError – If stack=CYCLES_PASSES or stack=CYCLES, swath=False and nadir=True. In this case, we are trying to stack nadir data, which is not guaranteed to have the same number of points per half orbit. This case is not supported

  • ValueError – If swath=False and nadir=False. In this case, the user is asking for an empty return

  • ValueError – If the input list of files is empty

  • ValueError – If the input stack parameter does not match a valid StackLevel

Returns:

An xarray dataset containing the data from the input files
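The "cycles with missing passes will be left over" behaviour of stack=CYCLES can be sketched like this. `complete_cycles` is hypothetical bookkeeping for illustration, not the reader's code:

```python
def complete_cycles(half_orbits, expected_passes):
    """Keep only the cycles for which every expected pass is present.

    half_orbits: iterable of (cycle_number, pass_number) pairs, one per file.
    expected_passes: set of pass numbers a complete cycle must contain.
    """
    per_cycle = {}
    for cycle, pass_number in half_orbits:
        per_cycle.setdefault(cycle, set()).add(pass_number)
    # A cycle is kept only when its passes cover the expected set.
    return sorted(
        cycle for cycle, passes in per_cycle.items()
        if passes >= expected_passes
    )
```

With files for cycle 1 (passes 1 and 2) and cycle 2 (pass 1 only), stacking over `{1, 2}` keeps cycle 1 and leaves cycle 2 over.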

class fcollections.implementations.SwotReaderL3WW(xarray_options: dict[str, str] | None = None)[source]#

Bases: OpenMfDataset

Reader for the SWOT L3_LR_WIND_WAVE product.

This reader handles both the Light and Extended subsets. The Light subset is simpler and has spectral content along the n_box dimension with default tile and box sizes. The Extended subset has multiple tile and box sizes stored in matching netcdf groups. Thus, tile and box sizes should generally be given for the Extended subset (see the read method for more details).

The L3_LR_WIND_WAVE product is built from the L3_LR_SSH product, and references ‘num_lines’ indices.

See also

SwotReaderL3LRSSH

the L3_LR_SSH product reader

read(subset: ProductSubset, files: list[str], selected_variables: list[str] | None = None, fs: AbstractFileSystem = fs_loc.LocalFileSystem(), tile: int | None = None, box: int | None = None, preprocessor: Callable[[Dataset], Dataset] | None = None) Dataset[source]#

Read a SWOT dataset from L3_LR_WIND_WAVE products.

Parameters:
  • files – list of files to open. At least one file should be given. If multiple files are given, variables following the n_box dimension will be concatenated. The other variables are constant and will not be repeated

  • fs – File systems hosting the files

  • selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection

  • subset – Product dataset (Light, Extended)

  • bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)

  • tile – Tile size of the spectrum computation. Is mandatory for the Extended subset

  • box – Box size of the spectrum computation. Is mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension

Raises:
  • ValueError – If the tile or box argument is given when reading a Light subset

  • ValueError – If the list of files is empty

  • ValueError – If the tile or box argument is missing for the Extended subset

  • ValueError – If the input subset matches neither Light nor Extended

  • ValueError – If the input tile or box size is not found in the files

Returns:

An xarray dataset containing the data from the input files

class fcollections.implementations.Temporality(*values)[source]#

Bases: Enum

Temporality of the L3_LR_SSH product.

The L3_LR_SSH product is calibrated on nadir data. Nadir data has two temporalities in Copernicus Marine: reprocessed data (labelled as Multi-Year MY) and near-real time data (NRT).

This temporality in upstream data is reflected as reproc/forward to adopt the SWOT mission denomination. It is not the same definition as for the L2_LR_SSH product, where reprocessed data covers the PGC, PGD, … datasets and forward data covers the PIC, PID, … datasets.

See also

fcollections.implementations.DataType

Copernicus Marine data type definition (in our case for Nadir data)

fcollections.implementations.Timeliness

L2_LR_SSH product temporality definition

FORWARD = 2#

Forward data calibrated on the NRT nadir dataset.

REPROC = 1#

Reprocessed data calibrated on the MY nadir dataset.

class fcollections.implementations.Thematic(*values)[source]#

Bases: Enum

Dataset thematic.

BGC = 2#

Biogeochemical.

PHY = 1#

Physical.

PHYBGC = 4#

Phy BGC.

PHYBGCWAV = 5#

Wav Phy BGC.

WAV = 3#

Wav.

class fcollections.implementations.Timeliness(*values)[source]#

Bases: Enum

Timeliness of the SWOT L2_LR_SSH products.

G = 2#

Reprocessed data.

I = 1#

Forward data.

class fcollections.implementations.Typology(*values)[source]#

Bases: Enum

Dataset typology.

I = 1#

Instantaneous.

M = 2#

Mean.

class fcollections.implementations.Variable(*values)[source]#

Bases: Enum

Dataset variable group.

CAR = 4#

Carbon.

CHL = 3#

Chlorophyll.

CUR = 2#

Currents.

GEOPHY = 6#

Geophy.

HFLUX = 13#

Heat flux.

MFLUX = 11#

Momentum flux.

NUT = 5#

Nutrient.

OPTICS = 9#

Optics.

PLANKTON = 7#

Plankton.

PP = 10#

Primary production.

REFLECTANCE = 16#

Reflectance.

SSH = 15#

Sea surface height.

SWH = 14#

Significant wave height.

TEMP = 1#

Temperature.

TRANSP = 8#

Transparency.

WFLUX = 12#

Water flux.

fcollections.implementations.build_version_parser() FileNameConvention[source]#

Build file name convention to parse CRID versions.
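A file name convention for the tri-number x.y.z versions described earlier can be sketched with a regular expression. This is illustrative only; the actual FileNameConvention object returned by build_version_parser is library-specific, and the file name used below is hypothetical:

```python
import re

# Illustrative pattern for an x.y.z version token embedded in a file name.
VERSION_RE = re.compile(r"v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)")

def parse_version(name):
    """Extract a (major, minor, patch) tuple from a file name, or None."""
    match = VERSION_RE.search(name)
    if match is None:
        return None
    return tuple(int(match.group(g)) for g in ("major", "minor", "patch"))
```

Parsing the version up front is what allows the collections to filter files by a reference version string, as described for the version parameter of variables_info.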