fcollections.implementations#
Module Attributes
- Layout on Aviso FTP, Aviso TDS for the L2_LR_SSH product
- Layout on Aviso FTP, Aviso TDS for the L3_LR_SSH product (V2)
- Layout on Aviso FTP, Aviso TDS for the L3_LR_SSH product (V3)
- Layout on Aviso FTP, Aviso TDS for the L3_LR_WindWave product
- Layout on Aviso FTP, Aviso TDS for the L4 Sea Level Anomaly experimental product including KaRIn measurements
- Layout on CMEMS for the Level 3 SSHA nadir products
- Layout on CMEMS for the Level 4 SSHA gridded products
- Layout on CMEMS for the Level 3 and 4 ocean colour products
- Layout on CMEMS for the WAVE_GLO_PHY_SWH_L3_NRT_014_001 product
- Layout on CMEMS for the SST_GLO_SST_L3S_NRT_OBSERVATIONS_010_010 product
Functions
- Build file name convention to parse CRID versions.
Classes
- Database mapping to select and read Dynamic atmospheric correction Netcdf files in a local file system.
- Database mapping to select and read gridded SLA Netcdf files in a local file system.
- Database mapping to select and read Swot LR L2 Netcdf files in a local file system.
- Database mapping to select and read L2 nadir Netcdf files in a local file system.
- Database mapping to select and read Swot LR L3 Netcdf files in a local file system.
- Database mapping to explore and read the L3_LR_WIND_WAVE product.
- Database mapping to select and read L3 nadir Netcdf files in a local file system.
- Database mapping to select and read GHRSST Level 4 MUR Global Foundation Sea Surface Temperature Analysis product Netcdf files in a local file system.
- Database mapping to select and read ocean color Netcdf files in a local file system.
- Database mapping to select and read ocean heat content Netcdf files in a local file system.
- Database mapping to select and read sea surface temperature Netcdf files in a local file system.
- Database mapping to select and read significant wave height Netcdf files in a local file system.
- Database mapping to select and read S1A Ocean surface wind product Netcdf files in a local file system.
- Database mapping to select and read ERA5 reanalysis product Netcdf files in a local file system.
- Ocean Color datafiles parser.
- Gridded SLA datafiles parser.
- Sea Surface Temperature datafiles parser.
- Swot LR L2 datafiles parser.
- Swot LR L3 datafiles parser.
- Swot L3_LR_WIND_WAVE product file names convention.
- L2 Nadir datafiles parser.
- L3 Nadir datafiles parser.
- Reader for SWOT KaRIn L2_LR_SSH products.
- Reader for SWOT KaRIn L3_LR_SSH products.
- Reader for the SWOT L3_LR_WIND_WAVE product.
- Delay definition for L3 and L4 sea level products.
- Product level.
- Dataset origin.
- Dataset group.
- Dataset product class.
- Dataset type.
- Dataset thematic.
- Dataset area of interest.
- Dataset variable group.
- Dataset typology.
- Aggregation of sensors for multiple CMEMS products.
- Temporality of the L3_LR_SSH product.
- Swot product subset enum.
- Swot mission phases definitions.
- Stack level for swath half orbits on reference grid.
- Timeliness of the SWOT L2_LR_SSH products.
- Represents an L2 version of half orbits and enables version comparison.
- fcollections.implementations.AVISO_L2_LR_SSH_LAYOUT: Layout = <fcollections.core._listing.Layout object>#
Layout on Aviso FTP, Aviso TDS for the L2_LR_SSH product
- fcollections.implementations.AVISO_L3_LR_SSH_LAYOUT_V2: Layout = <fcollections.core._listing.Layout object>#
Layout on Aviso FTP, Aviso TDS for the L3_LR_SSH product
- fcollections.implementations.AVISO_L3_LR_SSH_LAYOUT_V3: Layout = <fcollections.core._listing.Layout object>#
Layout on Aviso FTP, Aviso TDS for the L3_LR_SSH product
- fcollections.implementations.AVISO_L3_LR_WINDWAVE_LAYOUT: Layout = <fcollections.core._listing.Layout object>#
Layout on Aviso FTP, Aviso TDS for the L3_LR_WindWave product
- fcollections.implementations.AVISO_L4_SWOT_LAYOUT: Layout = <fcollections.core._listing.Layout object>#
Layout on Aviso FTP, Aviso TDS for the L4 Sea Level Anomaly experimental product including KaRIn measurements
- class fcollections.implementations.AcquisitionMode(*values)[source]#
Bases: Enum
- EW = 2#
- IW = 1#
- SM = 4#
- WV = 3#
- class fcollections.implementations.Area(*values)[source]#
Bases: Enum
Dataset area of interest.
- ANT = 3#
Antarctic.
- ARC = 2#
Arctic.
- ATL = 1#
Atlantic.
- BAL = 4#
Baltic.
- BLK = 5#
Black sea.
- EUR = 6#
Europe.
- GLO = 7#
Global.
- IBI = 8#
Iberian sea.
- MED = 9#
Mediterranean.
- NWS = 10#
North west shelf.
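Enum fields such as Area can be filtered using either the enum member itself or its equivalent string. A minimal, self-contained sketch of that resolution logic (the Area subset and the resolve_enum helper below are illustrative, not part of fcollections):

```python
from enum import Enum


class Area(Enum):
    """Illustrative subset of the dataset areas of interest."""
    GLO = 7
    MED = 9


def resolve_enum(ref, enum_cls):
    """Accept an enum member or its name and return the member."""
    if isinstance(ref, enum_cls):
        return ref
    return enum_cls[ref]  # raises KeyError for unknown names


# Both spellings select the same area of interest:
assert resolve_enum("MED", Area) is Area.MED
assert resolve_enum(Area.GLO, Area) is Area.GLO
```

The real implementation may differ; the point is only that string and enum references are interchangeable in the listing parameters.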
- class fcollections.implementations.BasicNetcdfFilesDatabaseDAC(path: Path, fs: fsspec.AbstractFileSystem = fs_loc.LocalFileSystem())[source]#
Bases: FilesDatabase, DiscreteTimesMixin
Database mapping to select and read Dynamic atmospheric correction Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#
Semantic describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised. A deduplication operation is then done and, if there are still duplicates, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in, or not equal to, the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
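The predicates mechanism amounts to filtering the records parsed from file names with user-supplied callables. A self-contained sketch of that filtering step (the record layout and variable names are illustrative, not the actual fcollections internals):

```python
# Records as parsed from file names: (sensor, cycle_number, time_string).
records = [
    ("S3A", 1, "2023-01-01T00:00:00"),
    ("S3A", 4, "2023-01-02T00:00:00"),
    ("S3B", 7, "2023-01-03T00:00:00"),
]

# A predicate is knowledgeable about the record layout: here index 1
# is the cycle number, as in the docstring example above.
predicates = [lambda record: record[1] in [1, 4, 5]]

# A file is kept only if every predicate accepts its record.
selected = [r for r in records if all(p(r) for p in predicates)]
assert [r[1] for r in selected] == [1, 4]
```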
- listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in, or not equal to, the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- metadata_injection: dict[str, tuple[str, ...]] | None = {'time': ('time',)}#
Configures how metadata from the files listing can be injected in a dataset returned from the read.
The keys are the columns of the files metadata table; the values are tuples of dimensions for insertion.
- query(*, selected_variables: list[str] | None = None, time: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in, or not equal to, the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = ['time']#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
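For a DateTime field such as time, the reference can be a single datetime (equality test) or a (start, end) pair acting as a period (inclusion test). A hedged sketch of that test using only the standard library (the actual fcollections implementation is numpy-based and may differ; the matches helper is hypothetical):

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"


def matches(value: str, reference) -> bool:
    """True if the file's datetime equals the reference datetime,
    or falls inside a (start, end) reference period."""
    v = datetime.strptime(value, FMT)
    if isinstance(reference, tuple):
        start, end = (datetime.strptime(s, FMT) for s in reference)
        return start <= v <= end
    return v == datetime.strptime(reference, FMT)


# A file inside the reference period is kept:
assert matches("2023-06-15T00:00:00",
               ("2023-06-01T00:00:00", "2023-06-30T00:00:00"))
# A file at a different instant than the reference datetime is filtered out:
assert not matches("2023-07-01T00:00:00", "2023-06-30T00:00:00")
```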
- class fcollections.implementations.BasicNetcdfFilesDatabaseGriddedSLA(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: FilesDatabase, PeriodMixin
Database mapping to select and read gridded SLA Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#
Semantic describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, delay: Delay, time: Period, production_date: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised. A deduplication operation is then done and, if there are still duplicates, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in, or not equal to, the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
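A Period field is kept when it intersects the reference period, or contains the reference datetime. A minimal sketch of those two tests (the Period class below is an illustration, not fcollections.time's actual implementation):

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Period:
    """Closed time interval [start, end]."""
    start: datetime
    end: datetime

    def intersects(self, other: "Period") -> bool:
        # Two closed intervals overlap when each starts before the other ends.
        return self.start <= other.end and other.start <= self.end

    def contains(self, instant: datetime) -> bool:
        return self.start <= instant <= self.end


file_period = Period(datetime(2023, 1, 10), datetime(2023, 1, 20))

# Kept: the reference period overlaps the file's coverage.
assert file_period.intersects(Period(datetime(2023, 1, 18), datetime(2023, 1, 25)))
# Kept: the reference datetime falls inside the file's coverage.
assert file_period.contains(datetime(2023, 1, 15))
# Filtered out: no overlap at all.
assert not file_period.intersects(Period(datetime(2023, 2, 1), datetime(2023, 2, 2)))
```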
- listing_parameters = {'delay': <Parameter "delay: fcollections.implementations._definitions._constants.Delay">, 'production_date': <Parameter "production_date: numpy.datetime64">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, delay: Delay, time: Period, production_date: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in, or not equal to, the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, delay: Delay, time: Period, production_date: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in, or not equal to, the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.BasicNetcdfFilesDatabaseL2Nadir(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: FilesDatabase, PeriodMixin
Database mapping to select and read L2 nadir Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#
Semantic describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised. A deduplication operation is then done and, if there are still duplicates, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
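An Integer field such as cycle_number or pass_number accepts a list, a slice or a single int as reference. One way that dispatch could look (a sketch; the helper name is hypothetical, not the fcollections implementation):

```python
def int_field_matches(value: int, reference) -> bool:
    """Keep the file if its integer field matches the reference,
    which may be a list, a slice, or a plain int."""
    if isinstance(reference, slice):
        step = reference.step or 1
        return value in range(reference.start, reference.stop, step)
    if isinstance(reference, list):
        return value in reference
    return value == reference


# slice(1, 10) keeps cycles 1 through 9:
assert int_field_matches(3, slice(1, 10))
# A list enumerates the accepted passes explicitly:
assert int_field_matches(3, [1, 3, 5])
# A plain int is an exact match:
assert not int_field_matches(4, 3)
```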
- listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.BasicNetcdfFilesDatabaseL3Nadir(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: FilesDatabase, PeriodMixin
Database mapping to select and read L3 nadir Netcdf files in a local file system.
- deduplicator: Deduplicator | None = Deduplicator(unique=('time',), auto_pick_last=('production_date',))#
Deduplicate the file metadata table of a unique subset (after unmixing).
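The Deduplicator(unique=('time',), auto_pick_last=('production_date',)) behaviour, keeping for each time the file with the latest production date, can be sketched as follows (a stand-alone illustration, not the actual implementation):

```python
# One granule regenerated several times: same time, different production_date.
rows = [
    {"time": "2023-01-01", "production_date": "2023-01-05", "path": "a.nc"},
    {"time": "2023-01-01", "production_date": "2023-02-01", "path": "b.nc"},
    {"time": "2023-01-02", "production_date": "2023-01-06", "path": "c.nc"},
]

latest = {}
for row in rows:
    key = row["time"]  # the unique key
    # Auto-pick the last production_date among duplicates
    # (ISO date strings compare chronologically).
    if key not in latest or row["production_date"] > latest[key]["production_date"]:
        latest[key] = row

deduplicated = sorted(latest.values(), key=lambda r: r["time"])
assert [r["path"] for r in deduplicated] == ["b.nc", "c.nc"]
```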
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#
Semantic describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised. A deduplication operation is then done and, if there are still duplicates, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in, or not equal to, the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- listing_parameters = {'delay': <Parameter "delay: fcollections.implementations._definitions._constants.Delay">, 'product_level': <Parameter "product_level: fcollections.implementations._definitions._constants.ProductLevel">, 'production_date': <Parameter "production_date: numpy.datetime64">, 'resolution': <Parameter "resolution: list[int] | slice | int">, 'sensor': <Parameter "sensor: fcollections.implementations._definitions._cmems.Sensors">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
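Enum-valued filters such as delay, sensor and product_level accept either the enum member or its equivalent string, and keep only equal values. That test can be sketched as follows (an illustrative `Delay` enum mirroring the documented [NRT, DT] values; `matches_enum` is a hypothetical name):

```python
from enum import Enum

class Delay(Enum):
    """Illustrative stand-in for the documented Delay enum."""
    NRT = "NRT"
    DT = "DT"

def matches_enum(value: str, reference) -> bool:
    """Enum-field filtering as documented: the reference may be an Enum
    member or its equivalent string; non-equal values are filtered out."""
    if isinstance(reference, Enum):
        reference = reference.value
    return value == reference

assert matches_enum("NRT", Delay.NRT)
assert matches_enum("DT", "DT")
assert not matches_enum("NRT", Delay.DT)
```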
- query(*, selected_variables: list[str] | None = None, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
- Returns:
A dataset containing the result of the query, or None if there is nothing matching the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
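The effect of sort_keys = 'time' is simply to order the parsed records before the files are read. With hypothetical records (plain dicts standing in for the parsed file metadata), the idea is:

```python
# Hypothetical records parsed from file names; 'time' is the sort key.
records = [
    {"time": "2023-06-02T00:00:00", "file": "b.nc"},
    {"time": "2023-06-01T00:00:00", "file": "a.nc"},
]

# Sorting by the sort key orders files prior to reading them,
# so concatenated datasets come out in chronological order.
ordered = sorted(records, key=lambda record: record["time"])
assert [record["file"] for record in ordered] == ["a.nc", "b.nc"]
```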
- unmixer: SubsetsUnmixer | None = SubsetsUnmixer(partition_keys=['sensor', 'resolution'], auto_pick_last=())#
Specify how to interpret the file metadata table to unmix subsets.
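The unmixing contract above (partition the file metadata table by ['sensor', 'resolution'] and require one homogeneous subset) can be sketched in plain Python. This is a simplified illustration of the documented SubsetsUnmixer behaviour, not the library's implementation, and `unmix` here is a hypothetical helper:

```python
from collections import defaultdict

def unmix(records, partition_keys=("sensor", "resolution")):
    """Group file records by the subset partition keys and require a
    single homogeneous subset, raising ValueError otherwise, as the
    documented unmix semantics describe."""
    subsets = defaultdict(list)
    for record in records:
        key = tuple(record[k] for k in partition_keys)
        subsets[key].append(record)
    if len(subsets) != 1:
        raise ValueError(f"{len(subsets)} subsets found, expected exactly one")
    return next(iter(subsets.values()))

records = [
    {"sensor": "J3", "resolution": 1, "file": "a.nc"},
    {"sensor": "J3", "resolution": 1, "file": "b.nc"},
]
assert len(unmix(records)) == 2

# A second sensor mixes in another subset, which must be rejected.
mixed = records + [{"sensor": "AL", "resolution": 5, "file": "c.nc"}]
try:
    unmix(mixed)
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for mixed subsets")
```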
- variables_info(*, sensor: Sensors, resolution: list[int] | slice | int)#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Parameters:
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.BasicNetcdfFilesDatabaseMUR(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: FilesDatabase, PeriodMixin
Database mapping to select and read GHRSST Level 4 MUR Global Foundation Sea Surface Temperature Analysis product Netcdf file in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#
Semantic describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
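The predicates mechanism above can be shown in isolation: each predicate is a callable over the record parsed from a file name, and a file is kept only when every predicate accepts its record (hypothetical record values):

```python
# Hypothetical records: tuples of fields parsed from file names,
# here (file name, cycle number) purely for illustration.
records = [("file_a.nc", 1), ("file_b.nc", 3), ("file_c.nc", 4)]

# The documented example predicate keeps records whose second field
# is one of 1, 4 or 5.
predicates = [lambda record: record[1] in [1, 4, 5]]

kept = [r for r in records if all(p(r) for p in predicates)]
assert kept == [("file_a.nc", 1), ("file_c.nc", 4)]
```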
- listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, time: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if there is nothing matching the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
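The DateTime-field filtering used by the time parameter (keep a file time that falls inside a reference Period, or that exactly equals a reference datetime) can be sketched with the standard library. This is an illustration of the documented semantics, not the library's Period implementation; `matches_time` is a hypothetical name and a (start, stop) tuple of strings stands in for a Period:

```python
from datetime import datetime

FORMAT = "%Y-%m-%dT%H:%M:%S"  # the documented reference string format

def matches_time(file_time: datetime, reference) -> bool:
    """DateTime-field filtering as documented: a (start, stop) Period
    keeps file times it includes; a single datetime string requires
    exact equality."""
    if isinstance(reference, tuple):
        start = datetime.strptime(reference[0], FORMAT)
        stop = datetime.strptime(reference[1], FORMAT)
        return start <= file_time <= stop
    return file_time == datetime.strptime(reference, FORMAT)

t = datetime(2023, 6, 15, 12, 0, 0)
assert matches_time(t, ("2023-06-01T00:00:00", "2023-07-01T00:00:00"))
assert matches_time(t, "2023-06-15T12:00:00")
assert not matches_time(t, "2023-06-16T00:00:00")
```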
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.BasicNetcdfFilesDatabaseOC(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: FilesDatabase, PeriodMixin
Database mapping to select and read ocean color Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#
Semantic describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]
group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]
pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]
area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]
thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]
variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]
type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]
level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)
typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]
version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
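The temporal_resolution parameter accepts an ISO 8601 duration string such as PT1S. A minimal parser for such time-only durations can be sketched as follows (an illustration only; the library's ISODuration class is richer, and `parse_iso_duration_seconds` is a hypothetical name):

```python
import re

def parse_iso_duration_seconds(text: str) -> float:
    """Parse a minimal ISO 8601 time duration ('PT1S', 'PT10M',
    'PT1H30M', ...) into seconds. Date components (days, months)
    are deliberately out of scope for this sketch."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"not a supported ISO 8601 duration: {text!r}")
    hours, minutes, seconds = (float(g) if g else 0.0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds

assert parse_iso_duration_seconds("PT1S") == 1.0
assert parse_iso_duration_seconds("PT10M") == 600.0
assert parse_iso_duration_seconds("PT1H30M") == 5400.0
```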
- listing_parameters = {'area': <Parameter "area: fcollections.implementations._definitions._cmems.Area">, 'group': <Parameter "group: fcollections.implementations._definitions._cmems.Group">, 'level': <Parameter "level: str">, 'origin': <Parameter "origin: fcollections.implementations._definitions._cmems.Origin">, 'pc': <Parameter "pc: fcollections.implementations._definitions._cmems.ProductClass">, 'sensor': <Parameter "sensor: fcollections.implementations._definitions._cmems.Sensors">, 'spatial_resolution': <Parameter "spatial_resolution: str">, 'temporal_resolution': <Parameter "temporal_resolution: fcollections.time.ISODuration">, 'thematic': <Parameter "thematic: fcollections.implementations._definitions._cmems.Thematic">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'type': <Parameter "type: fcollections.implementations._definitions._cmems.DataType">, 'typology': <Parameter "typology: fcollections.implementations._definitions._cmems.Typology">, 'variable': <Parameter "variable: fcollections.implementations._definitions._cmems.Variable">, 'version': <Parameter "version: str">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]
group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]
pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]
area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]
thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]
variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]
type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]
level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)
typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]
version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]
group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]
pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]
area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]
thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]
variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]
type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]
level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)
typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]
version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Returns:
A dataset containing the result of the query, or None if there is nothing matching the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
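The deduplication described for list_files (records sharing a set of unique keys are duplicates; deduplication then keeps one per group) can be sketched in plain Python. This is a simplified illustration under assumed semantics, not the library's deduplicator; `deduplicate` is a hypothetical helper that keeps the last record seen per key, echoing the documented case of the same granule being regenerated with updated corrections:

```python
def deduplicate(records, unique_keys=("time",)):
    """Keep one record per combination of unique keys, preferring the
    last one seen (e.g. the most recently listed regeneration)."""
    seen = {}
    for record in records:
        seen[tuple(record[k] for k in unique_keys)] = record
    return list(seen.values())

# Two files covering the same period: duplicates under the 'time' key.
records = [
    {"time": "2023-06-01", "file": "old.nc"},
    {"time": "2023-06-01", "file": "reprocessed.nc"},
]
assert deduplicate(records) == [{"time": "2023-06-01", "file": "reprocessed.nc"}]
```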
- class fcollections.implementations.BasicNetcdfFilesDatabaseOHC(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
FilesDatabase, PeriodMixin
Database mapping to select and read ocean heat content Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#
Semantics describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format (%Y-%m-%dT%H:%M:%S)
- Raises:
ValueError – In case unmix is True and one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
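The predicates mechanism described above amounts to keeping only the records for which every callable returns True. A minimal stand-in, reusing the documentation's own example lambda (the tuples here are hypothetical records, not the library's actual record type):

```python
# Apply user predicates over parsed records, keeping a record only when
# every predicate accepts it (illustrative stand-in, assumed semantics).
records = [("file_a.nc", 1), ("file_b.nc", 2), ("file_c.nc", 5)]
predicates = [lambda record: record[1] in [1, 4, 5]]

kept = [r for r in records if all(p(r) for p in predicates)]
# kept contains file_a.nc and file_c.nc
```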
- listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format (%Y-%m-%dT%H:%M:%S)
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
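Conceptually, map applies the callable to each dataset opened from the matching files and collects the results. A stand-in using plain dicts in place of xarray datasets (assumed behaviour, not the library's implementation):

```python
# Stand-in for the map pattern: run a callable on each "dataset" extracted
# from the selected files and collect the results. Dicts replace xarray
# datasets here for the sake of a self-contained sketch.
def map_over(datasets, func):
    return [func(ds) for ds in datasets]

datasets = [{"ohc": [10.0, 20.0]}, {"ohc": [30.0]}]
means = map_over(datasets, lambda ds: sum(ds["ohc"]) / len(ds["ohc"]))
# means == [15.0, 30.0]
```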
- query(*, selected_variables: list[str] | None = None, time: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format (%Y-%m-%dT%H:%M:%S)
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
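The DateTime filtering described for time can be pictured as follows, using the standard library in place of numpy. The semantics are an assumption drawn from the text (equality against a single datetime, inclusion for a (start, end) period), and `matches_time` is a hypothetical helper:

```python
from datetime import datetime

def matches_time(file_time: datetime, ref) -> bool:
    """Keep the file when its time equals the reference datetime, or falls
    inside a reference (start, end) period. Illustrative only."""
    if isinstance(ref, tuple):
        start, end = (datetime.fromisoformat(s) for s in ref)
        return start <= file_time <= end
    return file_time == datetime.fromisoformat(ref)

t = datetime(2023, 6, 15)
matches_time(t, "2023-06-15T00:00:00")                           # equal: kept
matches_time(t, ("2023-06-01T00:00:00", "2023-06-30T00:00:00"))  # included: kept
matches_time(t, "2023-06-16T00:00:00")                           # filtered out
```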
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user is notified with a ValueError.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.BasicNetcdfFilesDatabaseSST(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
FilesDatabase, PeriodMixin
Database mapping to select and read sea surface temperature Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#
Semantics describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format (%Y-%m-%dT%H:%M:%S)
- Raises:
ValueError – In case unmix is True and one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format (%Y-%m-%dT%H:%M:%S)
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, time: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format (%Y-%m-%dT%H:%M:%S)
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user is notified with a ValueError.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.BasicNetcdfFilesDatabaseSWH(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
FilesDatabase, PeriodMixin
Database mapping to select and read significant wave height Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#
Semantics describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, sensorf: Sensors, time: Period, production_date: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format (%Y-%m-%dT%H:%M:%S)
- Raises:
ValueError – In case unmix is True and one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
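Enum fields such as sensorf accept either the enum member or its equivalent string. A hypothetical mirror of that behaviour, with a reduced stand-in for the Sensors enum (the real one lives in fcollections.implementations and carries all the values listed above):

```python
from enum import Enum

class Sensors(Enum):
    # Illustrative subset of the sensor codes listed above.
    J3 = "J3"
    S3A = "S3A"
    SWOT = "SWOT"

def passes(tested: Sensors, ref) -> bool:
    """Accept the reference as an enum member or its string value
    (assumed semantics, not the library's implementation)."""
    ref = Sensors(ref) if isinstance(ref, str) else ref
    return tested == ref

found = [s for s in (Sensors.J3, Sensors.S3A, Sensors.SWOT) if passes(s, "S3A")]
# found keeps only Sensors.S3A
```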
- listing_parameters = {'production_date': <Parameter "production_date: numpy.datetime64">, 'sensorf': <Parameter "sensorf: fcollections.implementations._definitions._cmems.Sensors">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, sensorf: Sensors, time: Period, production_date: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format (%Y-%m-%dT%H:%M:%S)
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, sensorf: Sensors, time: Period, production_date: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format (%Y-%m-%dT%H:%M:%S)
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user is notified with a ValueError.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.BasicNetcdfFilesDatabaseSwotLRL2(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
FilesDatabase, PeriodMixin
Database mapping to select and read Swot LR L2 Netcdf files in a local file system.
- deduplicator: Deduplicator | None = Deduplicator(unique=('cycle_number', 'pass_number'), auto_pick_last=('version',))#
Deduplicate the file metadata table of a unique subset (after unmixing).
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#
Semantics describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, subset: ProductSubset, version: L2Version)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed into the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (e.g. PIC0). The product counter is a number that is incremented when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some missing attributes set to None; in this case, the check will be performed on these attributes only.
- Raises:
ValueError – In case unmix is True and one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
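The list/slice/int filtering described for cycle_number and pass_number can be sketched as below. The semantics are assumed from the text, in particular treating the slice stop as exclusive, as in Python indexing; `passes` is a hypothetical helper, not the library's code:

```python
def passes(value: int, ref) -> bool:
    """Keep an integer that is in the reference list, inside the slice
    bounds (stop assumed exclusive), or equal to the reference integer."""
    if isinstance(ref, slice):
        return ((ref.start is None or value >= ref.start)
                and (ref.stop is None or value < ref.stop))
    if isinstance(ref, list):
        return value in ref
    return value == ref

cycles = [1, 2, 7, 12]
by_slice = [c for c in cycles if passes(c, slice(2, 10))]  # keeps 2 and 7
by_list = [c for c in cycles if passes(c, [1, 12])]        # keeps 1 and 12
by_int = [c for c in cycles if passes(c, 7)]               # keeps 7
```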
- listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'level': <Parameter "level: fcollections.implementations._definitions._constants.ProductLevel">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: fcollections.implementations._l2_lr_ssh.L2Version">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: L2Version)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox’s longitude interval wraps around the circularity, it will be split into two sub-boxes to ensure a proper selection (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting it to False together with right_swath disables swath reading for Expert and Basic datasets
right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting it to False together with left_swath disables swath reading for Expert and Basic datasets
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed into the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (e.g. PIC0). The product counter is a number that is incremented when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some missing attributes set to None; in this case, the check will be performed on these attributes only.
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, subset: ProductSubset, selected_variables: list[str] | None = None, stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: L2Version)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting this to False in conjunction with right_swath will disable swath reading for Expert and Basic datasets
right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting this to False in conjunction with left_swath will disable swath reading for Expert and Basic datasets
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left out. Defaults to NOSTACK
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed into the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (e.g. PIC0). The product counter is a number that is increased when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some missing attributes set to None. In this case, the check will be performed on these attributes only.
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
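As a hedged illustration of the query workflow above, the sketch below builds a filter set and runs it against a local L2 collection. The class name BasicNetcdfFilesDatabaseSwotLRL2 is assumed by analogy with the L3 class and may differ; the path, version and variable names are hypothetical.

```python
# Illustrative filters for an L2 query; every value below is hypothetical.
query_kwargs = dict(
    subset="Basic",             # enum member or its equivalent string
    cycle_number=slice(8, 10),  # Integer field: list, slice or int
    pass_number=[13, 26],
    time=("2023-08-01T00:00:00", "2023-08-31T23:59:59"),  # Period as strings
    level="L2",
    selected_variables=["ssha_karin"],  # hypothetical variable name
)

try:
    import fcollections.implementations as impl

    # Class name assumed by analogy with BasicNetcdfFilesDatabaseSwotLRL3
    db = impl.BasicNetcdfFilesDatabaseSwotLRL2(path="/data/swot/l2_lr_ssh")
    ds = db.query(**query_kwargs)  # returns None when nothing matches
except Exception:  # fcollections or the data are unavailable; sketch only
    ds = None
```

A partially set L2Version instance could additionally be passed as version to filter, for instance, on the baseline only.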
- reader: IFilesReader | None = <fcollections.implementations._readers.SwotReaderL2LRSSH object>#
Files reader.
- reading_parameters = {'left_swath': <Parameter "left_swath: 'bool' = True">, 'preprocessor': <Parameter "preprocessor: 'tp.Callable[[xr.Dataset], xr.Dataset] | None' = None">, 'right_swath': <Parameter "right_swath: 'bool' = False">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'stack': <Parameter "stack: 'StackLevel | str' = <StackLevel.NOSTACK: 1>">, 'subset': <Parameter "subset: 'ProductSubset'">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- unmixer: SubsetsUnmixer | None = SubsetsUnmixer(partition_keys=['level', 'subset'], auto_pick_last=())#
Specify how to interpret the file metadata table to unmix subsets.
- variables_info(*, level: ProductLevel, subset: ProductSubset)#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Parameters:
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.BasicNetcdfFilesDatabaseSwotLRL3(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
FilesDatabase, PeriodMixin
Database mapping to select and read Swot LR L3 Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#
Semantics describing how the files are organized.
Useful to extract information and enable efficient file system scanning. The pre-configured layouts may mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, subset: ProductSubset, version: str)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot yield a unique subset, an error is raised
predicates – Additional complex filters to run on the records parsed from the filenames, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
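The listing parameters above can be combined as sketched below. The path, version and predicate values are hypothetical; the predicate follows the positional-record form shown in the documentation.

```python
# Sketch of list_files on an L3 collection; path and values are hypothetical.
predicate = lambda record: record[1] in [1, 4, 5]  # form taken from the docs

try:
    from fcollections.implementations import BasicNetcdfFilesDatabaseSwotLRL3

    db = BasicNetcdfFilesDatabaseSwotLRL3(path="/data/swot/l3_lr_ssh")
    files = db.list_files(
        sort=True,                   # ordered via the class sort_keys ('time')
        predicates=(predicate,),
        stat_fields=("size",),       # valid for a local file system
        cycle_number=[1, 2],
        pass_number=slice(1, 28),
        time="2023-08-15T12:00:00",  # a single datetime also selects
        level="L3",
        subset="Basic",
        version="1.0.0",             # hypothetical x.y.z version
    )
except Exception:  # fcollections or the data are unavailable; sketch only
    files = None
```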
- listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'level': <Parameter "level: fcollections.implementations._definitions._constants.ProductLevel">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: str">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: str)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on an xarray dataset.
selected_variables – list of variables to select in the dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left out. Defaults to NOSTACK
nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets where the nadir data is clipped in the swath. Defaults to False
swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets where the nadir data is clipped in the swath. Defaults to True
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
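A minimal sketch of the map workflow above, assuming dask is available. The callable follows the documented (dataset, record info) signature; the variable name ssha, the path and the version are hypothetical.

```python
def mean_ssha(ds, info):
    # Reduce each per-pass dataset to a scalar; 'ssha' is a hypothetical name
    return float(ds["ssha"].mean())

try:
    from fcollections.implementations import BasicNetcdfFilesDatabaseSwotLRL3

    db = BasicNetcdfFilesDatabaseSwotLRL3(path="/data/swot/l3_lr_ssh")
    results = db.map(      # raises NotImplementedError without dask
        mean_ssha,
        subset="Expert",
        cycle_number=5,
        pass_number=slice(1, 584),
        time=("2023-09-01T00:00:00", "2023-09-30T23:59:59"),
        level="L3",
        version="1.0.0",   # hypothetical version
    )
except Exception:  # fcollections, dask or the data are unavailable
    results = None
```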
- query(*, subset: ProductSubset, selected_variables: list[str] | None = None, stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: str)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – list of variables to select in the dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left out. Defaults to NOSTACK
nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets where the nadir data is clipped in the swath. Defaults to False
swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets where the nadir data is clipped in the swath. Defaults to True
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.implementations._readers.SwotReaderL3LRSSH object>#
Files reader.
- reading_parameters = {'nadir': <Parameter "nadir: 'bool' = False">, 'preprocessor': <Parameter "preprocessor: 'tp.Callable[[xr.Dataset], xr.Dataset] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'stack': <Parameter "stack: 'str | StackLevel' = <StackLevel.NOSTACK: 1>">, 'subset': <Parameter "subset: 'ProductSubset'">, 'swath': <Parameter "swath: 'bool' = True">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- unmixer: SubsetsUnmixer | None = SubsetsUnmixer(partition_keys=['version', 'subset'], auto_pick_last=('version',))#
Specify how to interpret the file metadata table to unmix subsets.
- variables_info(*, subset: ProductSubset, version: str)#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Parameters:
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
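Because this class partitions subsets on version and subset (see its SubsetsUnmixer), variables_info needs both keys to return an unambiguous set of variables. A hedged sketch, with hypothetical path and version:

```python
# Partition keys required by variables_info for this class.
info_kwargs = dict(subset="Basic", version="1.0.0")  # hypothetical version

try:
    from fcollections.implementations import BasicNetcdfFilesDatabaseSwotLRL3

    db = BasicNetcdfFilesDatabaseSwotLRL3(path="/data/swot/l3_lr_ssh")
    variables = db.variables_info(**info_kwargs)  # ValueError if ambiguous
except Exception:  # fcollections or the data are unavailable; sketch only
    variables = None
```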
- class fcollections.implementations.BasicNetcdfFilesDatabaseSwotLRWW(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
FilesDatabase, PeriodMixin
Database mapping to explore and read the L3_LR_WIND_WAVE product.
See also
fcollections.implementations.AVISO_L3_LR_WINDWAVE_LAYOUT – Recommended layout for the database
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>, <fcollections.core._listing.Layout object>]#
Semantics describing how the files are organized.
Useful to extract information and enable efficient file system scanning. The pre-configured layouts may mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, subset: ProductSubset, version: str)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot yield a unique subset, an error is raised
predicates – Additional complex filters to run on the records parsed from the filenames, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: str">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, tile: int | None = None, box: int | None = None, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, version: str)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on an xarray dataset.
selected_variables – list of variables to select in the dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
tile – Tile size of the spectrum computation. Mandatory for the Extended subset
box – Box size of the spectrum computation. Mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
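For the Extended subset of the WindWave product, tile is mandatory and box is needed whenever a requested variable lies along the n_box dimension. The sketch below shows how those arguments combine with map; the sizes, path, version and variable name are hypothetical.

```python
# Reading arguments for the Extended subset; all values are hypothetical.
ww_kwargs = dict(
    subset="Extended",
    tile=40,                          # mandatory for the Extended subset
    box=20,                           # needed for variables along n_box
    selected_variables=["spectrum"],  # hypothetical variable name
    cycle_number=3,
    pass_number=[1, 2, 3],
    time=("2023-10-01T00:00:00", "2023-10-07T23:59:59"),
    version="1.0.0",                  # hypothetical version
)

def record_info(ds, info):
    # Illustrative callable: keep only the record parsed from the filename
    return info

try:
    from fcollections.implementations import BasicNetcdfFilesDatabaseSwotLRWW

    db = BasicNetcdfFilesDatabaseSwotLRWW(path="/data/swot/l3_lr_windwave")
    results = db.map(record_info, **ww_kwargs)  # needs dask
except Exception:  # fcollections, dask or the data are unavailable
    results = None
```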
- query(*, subset: ProductSubset, selected_variables: list[str] | None = None, tile: int | None = None, box: int | None = None, preprocessor: tp.Callable[[xr.Dataset], xr.Dataset] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, version: str)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – list of variables to select in the dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude interval crosses the ±180° boundary, it will be split into two sub-boxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
tile – Tile size of the spectrum computation. Mandatory for the Extended subset
box – Box size of the spectrum computation. Mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Returns:
A dataset containing the result of the query, or None if there is
nothing matching the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
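The bbox splitting behaviour described for query above can be pictured with a short, self-contained sketch (an illustration only, not the library's internal implementation; longitudes are assumed in [-180, 180[):

```python
def split_bbox(bbox):
    """Split a (lon_min, lat_min, lon_max, lat_max) bounding box into
    sub-boxes when its longitude interval crosses the +/-180 boundary.

    Illustrative sketch only, assuming longitudes in [-180, 180[.
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    if lon_min <= lon_max:
        return [bbox]  # no crossing: a single box is enough
    # Crossing case, e.g. [170, -170] -> [170, 180[ and [-180, -170]
    return [
        (lon_min, lat_min, 180.0, lat_max),
        (-180.0, lat_min, lon_max, lat_max),
    ]
```

For example, split_bbox((170, -10, -170, 10)) yields the two sub-boxes (170, -10, 180.0, 10) and (-180.0, -10, -170, 10).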
- reader: IFilesReader | None = <fcollections.implementations._readers.SwotReaderL3WW object>#
Files reader.
- reading_parameters = {'box': <Parameter "box: 'int | None' = None">, 'preprocessor': <Parameter "preprocessor: 'tp.Callable[[xr.Dataset], xr.Dataset] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'subset': <Parameter "subset: 'ProductSubset'">, 'tile': <Parameter "tile: 'int | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- unmixer: SubsetsUnmixer | None = SubsetsUnmixer(partition_keys=['version', 'subset'], auto_pick_last=('version',))#
Specify how to interpret the file metadata table to unmix subsets.
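The unmixing idea can be pictured with a small, self-contained sketch over hypothetical records (the record fields and the grouping logic here are illustrative, not the actual SubsetsUnmixer code): records are partitioned by the partition keys, and for each auto-pick key only the records holding the latest value are kept.

```python
def unmix(records, partition_keys=("version", "subset"), auto_pick_last=("version",)):
    """Partition records (dicts) by partition_keys, keeping for each
    auto-pick key only the records that hold its highest value.

    Illustrative sketch of the subsets-unmixing idea, not the library code.
    """
    kept = list(records)
    for key in auto_pick_last:
        # Auto-pick: keep only the latest value of this key.
        last = max(r[key] for r in kept)
        kept = [r for r in kept if r[key] == last]
    # Group the remaining records by the partition keys.
    groups = {}
    for r in kept:
        groups.setdefault(tuple(r[k] for k in partition_keys), []).append(r)
    return groups
```

With records for versions “1.0.0” and “1.0.1”, only the “1.0.1” records survive, grouped per (version, subset) pair.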
- variables_info(*, subset: ProductSubset, version: str)#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Parameters:
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subset are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- fcollections.implementations.CMEMS_L4_SSHA_LAYOUT: Layout = <fcollections.core._listing.Layout object>#
Layout on CMEMS for the Level 4 SSHA gridded products
- fcollections.implementations.CMEMS_OC_LAYOUT: Layout = <fcollections.core._listing.Layout object>#
Layout on CMEMS for the Level 3 and 4 ocean colour products
- fcollections.implementations.CMEMS_SSHA_L3_LAYOUT: Layout = <fcollections.core._listing.Layout object>#
Layout on CMEMS for the Level 3 SSHA nadir products
- fcollections.implementations.CMEMS_SST_LAYOUT: Layout = <fcollections.core._listing.Layout object>#
Layout on CMEMS for the SST_GLO_SST_L3S_NRT_OBSERVATIONS_010_010 product
- fcollections.implementations.CMEMS_SWH_LAYOUT: Layout = <fcollections.core._listing.Layout object>#
Layout on CMEMS for the WAVE_GLO_PHY_SWH_L3_NRT_014_001 product
- class fcollections.implementations.DataType(*values)[source]#
Bases:
Enum
Dataset type.
- ANFC = 4#
Analysis forecast.
- HCST = 5#
Hindcast.
- MY = 1#
Multi-year consistent time series.
- MYINT = 2#
Interim data (about 1 month after the acquisition date)
- MYNRT = 6#
Multi-year NRT.
- NRT = 3#
Near real time products.
- class fcollections.implementations.Delay(*values)[source]#
Bases:
Enum
Delay definition for L3 and L4 sea level products.
- DT = 2#
Delayed time.
- NRT = 1#
Near real time.
- class fcollections.implementations.FileNameConventionDAC[source]#
Bases:
FileNameConvention
- class fcollections.implementations.FileNameConventionERA5[source]#
Bases:
FileNameConvention
- class fcollections.implementations.FileNameConventionGriddedSLA[source]#
Bases:
FileNameConvention
Gridded SLA datafiles parser.
- class fcollections.implementations.FileNameConventionGriddedSLAInternal[source]#
Bases:
FileNameConvention
- class fcollections.implementations.FileNameConventionL2Nadir[source]#
Bases:
FileNameConvention
L2 Nadir datafiles parser.
- class fcollections.implementations.FileNameConventionL3Nadir[source]#
Bases:
FileNameConvention
L3 Nadir datafiles parser.
- class fcollections.implementations.FileNameConventionMUR[source]#
Bases:
FileNameConvention
- class fcollections.implementations.FileNameConventionOC[source]#
Bases:
FileNameConvention
Ocean Color datafiles parser.
- class fcollections.implementations.FileNameConventionOHC[source]#
Bases:
FileNameConvention
- class fcollections.implementations.FileNameConventionS1AOWI[source]#
Bases:
FileNameConvention
- class fcollections.implementations.FileNameConventionSST[source]#
Bases:
FileNameConvention
Sea Surface Temperature datafiles parser.
- class fcollections.implementations.FileNameConventionSWH[source]#
Bases:
FileNameConvention
- class fcollections.implementations.FileNameConventionSwotL2[source]#
Bases:
FileNameConvention
Swot LR L2 datafiles parser.
- class fcollections.implementations.FileNameConventionSwotL3[source]#
Bases:
FileNameConvention
Swot LR L3 datafiles parser.
- class fcollections.implementations.FileNameConventionSwotL3WW[source]#
Bases:
FileNameConvention
Swot L3_LR_WIND_WAVE product file name convention.
- class fcollections.implementations.Group(*values)[source]#
Bases:
Enum
Dataset group.
- MOD = 2#
Model.
- OBS = 1#
Observations.
- class fcollections.implementations.L2Version(temporality: Timeliness | None = None, baseline: str | None = None, minor_version: int | None = None, product_counter: int | None = None, ignore_product_counter_in_eq_check: bool = False)[source]#
Bases:
object
Represents an L2 Version of half orbits and enables version comparison.
An L2 Version is parsed from a string in the format <CRID_version>_<product counter>.
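The comparison mechanics can be pictured with a self-contained sketch using a hypothetical CRID layout (one timeliness letter, a two-letter baseline, one minor-version digit, then the product counter; the real CRID fields may differ, so treat the pattern and the sample string as assumptions):

```python
import re

# Hypothetical CRID pattern, for illustration only: the real layout of
# <CRID_version>_<product counter> strings may differ.
_CRID = re.compile(
    r"^(?P<temporality>[A-Z])(?P<baseline>[A-Z]{2})(?P<minor>\d)_(?P<counter>\d+)$"
)

def parse_version(text):
    """Parse a version string into a tuple usable for ordering comparisons.

    Returns None when the input cannot be parsed, mirroring the
    ``L2Version | None`` return type above.
    """
    match = _CRID.match(text)
    if match is None:
        return None
    return (match["temporality"], match["baseline"],
            int(match["minor"]), int(match["counter"]))
```

Tuples compare lexicographically, so a higher product counter or minor version sorts later, which is the essence of the version comparison this class provides.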
- static from_bytes(version: bytes, ignore_product_counter_in_eq_check: bool = False) L2Version | None[source]#
Build an L2Version from bytes.
- Parameters:
version – The CRID version from which we build the L2Version object.
ignore_product_counter_in_eq_check – Set L2Version.product_counter to None, as we do not want to check it in the comparison operations.
- Returns:
The L2Version object.
Note
Even when an AttributeError occurs or the input value is None, an L2Version is built so that comparisons between np.array of L2Version do not fail with errors like: TypeError: ‘>’ not supported between instances of ‘NoneType’ and ‘NoneType’.
- static from_bytes_array(versions: np_t.NDArray[bytes], ignore_product_counter_in_eq_check: bool = False) np_t.NDArray[object][source]#
Build a np.array of L2Version from an array of CRID versions as bytes.
- Parameters:
versions – The array of CRID versions from which we build the L2Version objects.
ignore_product_counter_in_eq_check – Set each L2Version.product_counter to None, as we do not want to check it in the comparison operations.
- Returns:
The array of L2Version objects.
- static from_string(version: str, ignore_product_counter_in_eq_check: bool = False) L2Version | None[source]#
Build an L2Version from str.
- Parameters:
version – The CRID version from which we build the L2Version object.
ignore_product_counter_in_eq_check – Set L2Version.product_counter to None, as we do not want to check it in the comparison operations.
- Returns:
The L2Version object.
Note
Even when an AttributeError occurs or the input value is None, an L2Version is built so that comparisons between np.array of L2Version do not fail with errors like: TypeError: ‘>’ not supported between instances of ‘NoneType’ and ‘NoneType’.
- static from_string_array(versions: np_t.NDArray[str], ignore_product_counter_in_eq_check: bool = False) np_t.NDArray[object][source]#
Build a np.array of L2Version from an array of CRID versions as str.
- Parameters:
versions – The array of CRID versions from which we build the L2Version objects.
ignore_product_counter_in_eq_check – Set each L2Version.product_counter to None, as we do not want to check it in the comparison operations.
- Returns:
The array of L2Version objects.
- property is_null#
True if all attributes except ‘ignore_product_counter_in_eq_check’ are None.
- temporality: Timeliness | None = None#
- class fcollections.implementations.L2VersionField(name: str, ignore_product_counter: bool = False)[source]#
Bases:
FileNameField
- decode(input_string: str) L2Version[source]#
Decode an input string and generate a Generic[T] object.
- Parameters:
input_string – The input string
- Returns:
The decoded Generic[T] object
- Raises:
DecodingError – If the input string decoding fails
- encode(data: L2Version) str[source]#
Encode a Generic[T] object into a string.
- Parameters:
data – The input Generic[T] object
- Returns:
The encoded string
- sanitize(reference: str | L2Version) L2Version[source]#
Cast to one of the types handled by this tester.
- Parameters:
reference – The reference object to cast
- Returns:
The input cast to the proper type
- class fcollections.implementations.NetcdfFilesDatabaseDAC(path: Path, fs: fsspec.AbstractFileSystem = fs_loc.LocalFileSystem())[source]#
Bases:
BasicNetcdfFilesDatabaseDAC
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g.
lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
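The predicates argument can be pictured with plain callables over parsed records. A self-contained sketch with hypothetical record tuples (real predicates receive the record parsed from the file name by the convention):

```python
# Hypothetical parsed records: (index, cycle_number) tuples, following the
# shape suggested by the "lambda record: record[1] in [1, 4, 5]" example.
records = [(0, 1), (1, 2), (2, 4), (3, 5), (4, 7)]

predicates = [lambda record: record[1] in [1, 4, 5]]

# A record is kept only if every predicate accepts it.
kept = [r for r in records if all(p(r) for p in predicates)]
```

Here kept contains the records whose second field is 1, 4 or 5; adding more lambdas to predicates narrows the listing further.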
- listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the -180/180 boundary, data around the crossing and matching the bbox will be selected (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the -180/180 boundary, data around the crossing and matching the bbox will be selected (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
time – As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if there is
nothing matching the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.NetcdfFilesDatabaseERA5(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
FilesDatabase, PeriodMixin
Database mapping to select and read ERA5 reanalysis product Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#
Semantic describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g.
lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, time: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, time: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if there is
nothing matching the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
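The sorting behaviour can be pictured with plain Python over hypothetical records (an illustration of the sort_keys idea, not the library code):

```python
# Hypothetical records parsed from file names, each with a 'time' field.
records = [
    {"path": "era5_2023-01-02.nc", "time": "2023-01-02"},
    {"path": "era5_2023-01-01.nc", "time": "2023-01-01"},
]
sort_keys = ["time"]

# Order the records by the sort keys prior to reading the files.
ordered = sorted(records, key=lambda r: tuple(r[k] for k in sort_keys))
```

After sorting, the files are read in chronological order, which keeps the concatenated dataset monotonic along time.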
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.NetcdfFilesDatabaseGriddedSLA(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
BasicNetcdfFilesDatabaseGriddedSLA
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, delay: Delay, time: Period, production_date: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g.
lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the fsspec underlying implementation. For example, ‘size’ or ‘created’ are valid for a local file system
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
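Since the same granule may be regenerated several times with different production dates, a natural deduplication keeps, for each period, the file with the latest production date. A self-contained sketch of that idea (not the library's deduplicator; stdlib datetime stands in for numpy.datetime64):

```python
from datetime import datetime

def keep_latest(records):
    """For each period key, keep the most recent production date.

    records: iterable of (time_key, production_date) pairs.
    Illustrative sketch only.
    """
    latest = {}
    for time_key, production_date in records:
        current = latest.get(time_key)
        if current is None or production_date > current:
            latest[time_key] = production_date
    return latest
```

Given two files covering 2023-01-01 produced in February and June, only the June production survives for that period.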
- listing_parameters = {'delay': <Parameter "delay: fcollections.implementations._definitions._constants.Delay">, 'production_date': <Parameter "production_date: numpy.datetime64">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, delay: Delay, time: Period, production_date: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the -180/180 boundary, data around the crossing and matching the bbox will be selected (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, delay: Delay, time: Period, production_date: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the -180/180 boundary, data around the crossing and matching the bbox will be selected (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections. Hence there can be multiple files for the same period, but with a different production date. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if there is
nothing matching the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.NetcdfFilesDatabaseL2Nadir(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
BasicNetcdfFilesDatabaseL2Nadir
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
- Raises:
ValueError – If unmix is True and one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
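As a sketch of how these listing criteria combine (the mirror path is hypothetical, and the record index used by the predicate follows the docstring's own `lambda record: record[1] in [1, 4, 5]` example; the real index depends on the file name convention):

```python
def cycle_in(allowed):
    """Build a predicate over the record parsed from the file name.

    Index 1 is assumed to hold the cycle number, as in the docstring's
    own example; the actual index depends on the file name convention.
    """
    return lambda record: record[1] in allowed


def demo():
    # Requires fcollections; the path is a hypothetical local mirror.
    from fcollections.implementations import NetcdfFilesDatabaseL2Nadir

    db = NetcdfFilesDatabaseL2Nadir("/data/SWOT/L2_LR_SSH")
    return db.list_files(
        sort=True,
        cycle_number=slice(1, 10),   # Integer field: list, slice or int
        pass_number=[1, 28, 56],     # explicit pass list
        time=("2023-04-01T00:00:00", "2023-05-01T00:00:00"),
        predicates=(cycle_in({1, 4, 5}),),
    )
```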
- listing_parameters = {'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#
Map a function over datasets extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
- Raises:
NotImplementedError – If dask is not available
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
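A minimal sketch of a callable suitable for map: per the signature above, it receives each file's xarray dataset plus a dict of the fields parsed from its name (the exact dict keys shown in the comment are assumptions):

```python
def summarize(ds, meta):
    """Per-file reduction passed to map().

    ds is the xarray dataset read from one file; meta is the dict of
    fields parsed from its name (e.g. cycle_number, pass_number, time —
    hypothetical keys, depending on the file name convention).
    """
    return {"meta": dict(meta), "n_vars": len(ds.data_vars)}


def demo():
    # Requires fcollections and dask (map raises NotImplementedError
    # without dask); the path is a hypothetical local mirror.
    from fcollections.implementations import NetcdfFilesDatabaseL2Nadir

    db = NetcdfFilesDatabaseL2Nadir("/data/SWOT/L2_LR_SSH")
    return db.map(
        summarize,
        selected_variables=None,  # read everything
        cycle_number=5,
        pass_number=slice(1, 100),
        time=("2023-04-01T00:00:00", "2023-05-01T00:00:00"),
    )
```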
- query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table
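The bbox convention described above can be made concrete: when lon_min > lon_max the box crosses the -180/180 meridian and two longitude intervals are selected. Below, a small helper mirroring that documented rule, followed by a hypothetical query call (path and variable name are assumptions):

```python
def lon_intervals(bbox):
    """Longitude interval(s) a bbox selects, per the documented rule:
    lon_min > lon_max means the box crosses the -180/180 meridian."""
    lon_min, _, lon_max, _ = bbox
    if lon_min > lon_max:
        return [(lon_min, 180.0), (-180.0, lon_max)]
    return [(lon_min, lon_max)]


def demo():
    # Requires fcollections; path and variable name are hypothetical.
    from fcollections.implementations import NetcdfFilesDatabaseL2Nadir

    db = NetcdfFilesDatabaseL2Nadir("/data/SWOT/L2_LR_SSH")
    return db.query(
        selected_variables=["ssha"],
        bbox=(170.0, -20.0, -170.0, 20.0),  # crosses the antimeridian
        cycle_number=5,
        pass_number=slice(1, 100),
        time=("2023-04-01T00:00:00", "2023-05-01T00:00:00"),
    )
```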
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user is notified with a ValueError.
- Raises:
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.NetcdfFilesDatabaseL3Nadir(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: BasicNetcdfFilesDatabaseL3Nadir
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with different production dates. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
- Raises:
ValueError – If unmix is True and one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
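Enum fields accept either the enum member or its equivalent string, so a listing can be written entirely with the documented string values. A sketch (the mirror path is hypothetical):

```python
# Documented values of the Delay enum field.
DELAYS = ("NRT", "DT")


def demo():
    # Requires fcollections; the path is a hypothetical local mirror.
    from fcollections.implementations import NetcdfFilesDatabaseL3Nadir

    db = NetcdfFilesDatabaseL3Nadir("/data/cmems/L3_nadir")
    return db.list_files(
        sort=True,
        delay="NRT",          # enum fields accept the equivalent string
        sensor="S3A",
        product_level="L3",
        resolution=1,         # 1Hz sampling
        time=("2024-01-01T00:00:00", "2024-02-01T00:00:00"),
        production_date=("2024-01-01T00:00:00", "2024-06-01T00:00:00"),
    )
```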
- listing_parameters = {'delay': <Parameter "delay: fcollections.implementations._definitions._constants.Delay">, 'product_level': <Parameter "product_level: fcollections.implementations._definitions._constants.ProductLevel">, 'production_date': <Parameter "production_date: numpy.datetime64">, 'resolution': <Parameter "resolution: list[int] | slice | int">, 'sensor': <Parameter "sensor: fcollections.implementations._definitions._cmems.Sensors">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#
Map a function over datasets extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with different production dates. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
- Raises:
NotImplementedError – If dask is not available
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, delay: Delay, time: Period, production_date: datetime64, sensor: Sensors, product_level: ProductLevel, resolution: list[int] | slice | int)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
delay – Delay. As an Enum field, it can be filtered using a reference <enum ‘Delay’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘NRT’, ‘DT’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with different production dates. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
product_level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- variables_info(*, sensor: Sensors, resolution: list[int] | slice | int)#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user is notified with a ValueError.
- Parameters:
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
resolution – Data resolution. Nadir products may be sampled at 1Hz, 5Hz or 20Hz depending on the level and dataset considered. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
- Raises:
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table
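Since variables_info raises ValueError when its arguments do not isolate a single homogeneous subset, a caller can treat the partitioning keys (here sensor and resolution) as required hints. A sketch of that pattern, where db is any instance of this class:

```python
def variables_for(db, **partition_keys):
    """Fetch variables metadata, returning None when the given keys do
    not isolate a single homogeneous subset (i.e. variables_info
    raises ValueError)."""
    try:
        return db.variables_info(**partition_keys)
    except ValueError:
        return None
```

For example, `variables_for(db, sensor="AL", resolution=1)` either returns the metadata of that subset or None when more keys are needed.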
- class fcollections.implementations.NetcdfFilesDatabaseMUR(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: BasicNetcdfFilesDatabaseMUR
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – If unmix is True and one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
- listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#
Map a function over datasets extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – If dask is not available
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in the [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table
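A sketch of querying the daily MUR analysis over one week (the path and variable name are hypothetical). The small helper mirrors how a DateTime field is matched against a reference Period; ISO timestamps in one fixed format compare correctly as plain strings:

```python
def within(period, value):
    """True if an ISO timestamp falls inside [start, end]. Strings in
    the same %Y-%m-%dT%H:%M:%S format compare lexicographically."""
    start, end = period
    return start <= value <= end


def demo():
    # Requires fcollections; path and variable name are hypothetical.
    from fcollections.implementations import NetcdfFilesDatabaseMUR

    db = NetcdfFilesDatabaseMUR("/data/GHRSST/MUR")
    return db.query(
        selected_variables=["analysed_sst"],
        bbox=(-10.0, 30.0, 5.0, 45.0),
        time=("2023-07-01T00:00:00", "2023-07-08T00:00:00"),
    )
```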
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user is notified with a ValueError.
- Raises:
LayoutMismatchError – If enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – If one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.NetcdfFilesDatabaseOC(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: BasicNetcdfFilesDatabaseOC
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]
group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]
pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]
area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]
thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]
variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]
type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]
level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)
typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]
version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- listing_parameters = {'area': <Parameter "area: fcollections.implementations._definitions._cmems.Area">, 'group': <Parameter "group: fcollections.implementations._definitions._cmems.Group">, 'level': <Parameter "level: str">, 'origin': <Parameter "origin: fcollections.implementations._definitions._cmems.Origin">, 'pc': <Parameter "pc: fcollections.implementations._definitions._cmems.ProductClass">, 'sensor': <Parameter "sensor: fcollections.implementations._definitions._cmems.Sensors">, 'spatial_resolution': <Parameter "spatial_resolution: str">, 'temporal_resolution': <Parameter "temporal_resolution: fcollections.time.ISODuration">, 'thematic': <Parameter "thematic: fcollections.implementations._definitions._cmems.Thematic">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'type': <Parameter "type: fcollections.implementations._definitions._cmems.DataType">, 'typology': <Parameter "typology: fcollections.implementations._definitions._cmems.Typology">, 'variable': <Parameter "variable: fcollections.implementations._definitions._cmems.Variable">, 'version': <Parameter "version: str">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#
Map a function over the datasets extracted from the files.
- Parameters:
func – Callable that works on an xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything.
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 discontinuity, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved).
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]
group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]
pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]
area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]
thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]
variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]
type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]
level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)
typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]
version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: Period, origin: Origin, group: Group, pc: ProductClass, area: Area, thematic: Thematic, variable: Variable, type: DataType, level: str, sensor: Sensors, spatial_resolution: str, temporal_resolution: ISODuration, typology: Typology, version: str)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything.
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 discontinuity, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved).
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
origin – As an Enum field, it can be filtered using a reference <enum ‘Origin’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘CMEMS’, ‘C3S’, ‘CCI’, ‘OSISAF’]
group – As an Enum field, it can be filtered using a reference <enum ‘Group’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘OBS’, ‘MOD’]
pc – As an Enum field, it can be filtered using a reference <enum ‘ProductClass’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SST’, ‘SL’, ‘OC’, ‘SI’, ‘WIND’, ‘WAVE’, ‘MOB’, ‘INS’]
area – As an Enum field, it can be filtered using a reference <enum ‘Area’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘ATL’, ‘ARC’, ‘ANT’, ‘BAL’, ‘BLK’, ‘EUR’, ‘GLO’, ‘IBI’, ‘MED’, ‘NWS’]
thematic – As an Enum field, it can be filtered using a reference <enum ‘Thematic’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘PHY’, ‘BGC’, ‘WAV’, ‘PHYBGC’, ‘PHYBGCWAV’]
variable – As an Enum field, it can be filtered using a reference <enum ‘Variable’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘TEMP’, ‘CUR’, ‘CHL’, ‘CAR’, ‘NUT’, ‘GEOPHY’, ‘PLANKTON’, ‘TRANSP’, ‘OPTICS’, ‘PP’, ‘MFLUX’, ‘WFLUX’, ‘HFLUX’, ‘SWH’, ‘SSH’, ‘REFLECTANCE’]
type – As an Enum field, it can be filtered using a reference <enum ‘DataType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘MY’, ‘MYINT’, ‘NRT’, ‘ANFC’, ‘HCST’, ‘MYNRT’]
level – Product level of the data. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
sensor – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
spatial_resolution – Spatial resolution, such as 4km, 1km, 300M. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
temporal_resolution – ISO8601 duration field can be tested against an ISODuration object or its string representation (PT1S, …)
typology – As an Enum field, it can be filtered using a reference <enum ‘Typology’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘I’, ‘M’]
version – As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Returns:
A dataset containing the result of the query, or None if nothing
matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
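The antimeridian handling described for bbox can be sketched in plain Python. This is an illustrative re-implementation of the documented selection rule, not the library's actual code:

```python
def lon_in_bbox(lon: float, lon_min: float, lon_max: float) -> bool:
    """Check whether a longitude falls inside a bbox longitude interval.

    Longitudes are normalized to [-180, 180[. When lon_min > lon_max, the
    interval crosses the -180/180 discontinuity and is treated as the
    union [lon_min, 180[ + [-180, lon_max], matching the documented
    behaviour for bbox intervals such as [170, -170].
    """
    norm = ((lon + 180.0) % 360.0) - 180.0
    lo = ((lon_min + 180.0) % 360.0) - 180.0
    hi = ((lon_max + 180.0) % 360.0) - 180.0
    if lo <= hi:
        # Regular interval, no wrap-around.
        return lo <= norm <= hi
    # Interval wraps around the antimeridian.
    return norm >= lo or norm <= hi
```

With a bbox longitude interval of [170, -170], points at 175 or -175 are kept while 0 is filtered out, as in the example given above.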
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.NetcdfFilesDatabaseOHC(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
BasicNetcdfFilesDatabaseOHC
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, 'size' or 'created' are valid for a local file system
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
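The predicates argument of list_files can be pictured as a chain of callables applied to each record parsed from a file name. A minimal sketch of that filtering step (the record layout used here is hypothetical):

```python
from typing import Callable, Iterable

def apply_predicates(
    records: Iterable[tuple],
    predicates: Iterable[Callable[[tuple], bool]],
) -> list:
    """Keep only the records accepted by every predicate.

    Each predicate knows the record layout parsed from the file name,
    e.g. lambda record: record[1] in [1, 4, 5].
    """
    predicates = list(predicates)
    return [record for record in records if all(p(record) for p in predicates)]
```

For example, a predicate `lambda record: record[1] in [1, 4, 5]` keeps only the records whose second field is 1, 4 or 5; with no predicates, all records pass through.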
- listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#
Map a function over the datasets extracted from the files.
- Parameters:
func – Callable that works on an xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything.
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 discontinuity, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved).
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything.
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox's longitude interval crosses the -180/180 discontinuity, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved).
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in the reference Period or not equal to the reference datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date formatting [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if nothing
matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
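The time filter on a DateTime field accepts either an exact reference or a (start, end) pair, as described above. A plain-Python sketch of the documented inclusion/equality test, using the stdlib datetime type in place of numpy.datetime64:

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"  # the formatting quoted in the parameter docs

def keep_time(tested: str, reference) -> bool:
    """Filter a DateTime field parsed from a file name.

    `reference` is either a single timestamp string (equality test) or a
    (start, end) tuple of strings (inclusion test). Inclusive bounds are
    an assumption of this sketch.
    """
    value = datetime.strptime(tested, FMT)
    if isinstance(reference, tuple):
        start, end = (datetime.strptime(s, FMT) for s in reference)
        return start <= value <= end
    return value == datetime.strptime(reference, FMT)
```

A file stamped 2023-01-02T00:00:00 is kept by the period ("2023-01-01T00:00:00", "2023-01-03T00:00:00") and by the exact reference "2023-01-02T00:00:00", but filtered out otherwise.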
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.NetcdfFilesDatabaseS1AOWI(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
FilesDatabase, PeriodMixin
Database mapping to select and read S1A Ocean surface wind product Netcdf files in a local file system.
- layouts: list[Layout] | None = [<fcollections.core._listing.Layout object>]#
Semantic describing how the files are organized.
Useful to extract information and have an efficient file system scanning. The pre-configured layouts can mismatch the current files organization, in which case the user can build their own or set enable_layouts to False.
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, acquisition_mode: AcquisitionMode, slice_post_processing: S1AOWISlicePostProcessing, time: Period, resolution: list[int] | slice | int, orbit: list[int] | slice | int, product_type: S1AOWIProductType)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, 'size' or 'created' are valid for a local file system
acquisition_mode – Acquisition mode. As an Enum field, it can be filtered using a reference <enum 'AcquisitionMode'> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: ['IW', 'EW', 'WV', 'SM']
slice_post_processing – Slices post-processing. As an Enum field, it can be filtered using a reference <enum 'S1AOWISlicePostProcessing'> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: ['CC', 'CM', 'OCN']
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
resolution – SAR Ocean surface wind Level-2 product resolution. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
orbit – Orbit number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
product_type – Product type. As an Enum field, it can be filtered using a reference <enum ‘S1AOWIProductType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SW’, ‘GS’]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
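Enum fields such as acquisition_mode accept either the enum member or its equivalent string. A sketch of that equality test, with a stand-in enum mirroring the documented values:

```python
from enum import Enum

class AcquisitionMode(Enum):
    """Stand-in for the documented enum; values taken from the docs above."""
    IW = "IW"
    EW = "EW"
    WV = "WV"
    SM = "SM"

def matches_enum(tested: str, reference) -> bool:
    """True when the value parsed from the file name equals the reference.

    The reference may be an enum member or its equivalent string; any
    other value simply fails the equality test.
    """
    if isinstance(reference, Enum):
        reference = reference.value
    return tested == reference
```

Passing AcquisitionMode.IW or the string "IW" filters the same way, so callers can use whichever form is more convenient.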
- listing_parameters = {'acquisition_mode': <Parameter "acquisition_mode: fcollections.implementations._s1aowi.AcquisitionMode">, 'orbit': <Parameter "orbit: list[int] | slice | int">, 'product_type': <Parameter "product_type: fcollections.implementations._s1aowi.S1AOWIProductType">, 'resolution': <Parameter "resolution: list[int] | slice | int">, 'slice_post_processing': <Parameter "slice_post_processing: fcollections.implementations._s1aowi.S1AOWISlicePostProcessing">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, acquisition_mode: AcquisitionMode, slice_post_processing: S1AOWISlicePostProcessing, time: Period, resolution: list[int] | slice | int, orbit: list[int] | slice | int, product_type: S1AOWIProductType)#
Map a function over the datasets extracted from the files.
- Parameters:
func – Callable that works on an xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything.
acquisition_mode – Acquisition mode. As an Enum field, it can be filtered using a reference <enum 'AcquisitionMode'> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: ['IW', 'EW', 'WV', 'SM']
slice_post_processing – Slices post-processing. As an Enum field, it can be filtered using a reference <enum 'S1AOWISlicePostProcessing'> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: ['CC', 'CM', 'OCN']
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
resolution – SAR Ocean surface wind Level-2 product resolution. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
orbit – Orbit number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
product_type – Product type. As an Enum field, it can be filtered using a reference <enum ‘S1AOWIProductType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SW’, ‘GS’]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, acquisition_mode: AcquisitionMode, slice_post_processing: S1AOWISlicePostProcessing, time: Period, resolution: list[int] | slice | int, orbit: list[int] | slice | int, product_type: S1AOWIProductType)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything.
acquisition_mode – Acquisition mode. As an Enum field, it can be filtered using a reference <enum 'AcquisitionMode'> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: ['IW', 'EW', 'WV', 'SM']
slice_post_processing – Slices post-processing. As an Enum field, it can be filtered using a reference <enum 'S1AOWISlicePostProcessing'> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: ['CC', 'CM', 'OCN']
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] formatting
resolution – SAR Ocean surface wind Level-2 product resolution. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
orbit – Orbit number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
product_type – Product type. As an Enum field, it can be filtered using a reference <enum ‘S1AOWIProductType’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘SW’, ‘GS’]
- Returns:
A dataset containing the result of the query, or None if nothing
matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
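Integer fields such as resolution and orbit accept a list, a slice or a single int. A sketch of the documented membership test; whether slice bounds are inclusive is an assumption here (Python's half-open slice semantics are used):

```python
import math

def matches_int(value: int, reference) -> bool:
    """Filter an Integer field (e.g. orbit, resolution) parsed from a file name.

    The reference may be a list (membership test), a slice (range test,
    half-open by assumption), or a single int (equality test).
    """
    if isinstance(reference, list):
        return value in reference
    if isinstance(reference, slice):
        # Open bounds (None) match everything on that side.
        start = -math.inf if reference.start is None else reference.start
        stop = math.inf if reference.stop is None else reference.stop
        return start <= value < stop
    return value == reference
```

So orbit=slice(100, 200) keeps orbits 100 through 199, orbit=[1, 4, 5] keeps exactly those three, and orbit=42 keeps only orbit 42.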
- reader: IFilesReader | None = <fcollections.core._readers.OpenMfDataset object>#
Files reader.
- reading_parameters = {'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- sort_keys: list[str] | str | None = 'time'#
Keys that specify the fields used to sort the records extracted from the filenames.
Useful to order the files prior to reading them.
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
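The single-subset requirement behind variables_info and unmix can be pictured as a grouping step over the files metadata. The partitioning keys and record shape below are hypothetical, chosen only to illustrate the ValueError behaviour described above:

```python
from collections import defaultdict

def pick_unique_subset(records: list, key_fields: list) -> list:
    """Group metadata records by partitioning keys and require one group.

    If the records split into more than one subset (i.e. the metadata
    table is mixed and the given keys do not unmix it), a ValueError is
    raised, mirroring the documented behaviour.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in key_fields)
        groups[key].append(record)
    if len(groups) != 1:
        raise ValueError(f"{len(groups)} subsets found, expected exactly one")
    return next(iter(groups.values()))
```

Giving more partitioning keys narrows the grouping: records that agree on every key form one homogeneous subset, while any disagreement triggers the error.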
- class fcollections.implementations.NetcdfFilesDatabaseSST(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
BasicNetcdfFilesDatabaseSST
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, time: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute if this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised deduplication operation is done and if there are still duplicates, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g.
lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
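The DateTime filtering rule above can be sketched in plain Python. This is an illustrative stand-in, not the library's implementation: `matches_time` and the sample file names are hypothetical, assuming a reference period is given as a (start, end) pair and a single datetime must match exactly.

```python
from datetime import datetime

# Hypothetical sketch of DateTime-field matching: a (start, end) period
# must include the file's time; a single datetime must equal it.
def matches_time(file_time, reference):
    if isinstance(reference, tuple):   # a period: (start, end)
        start, end = reference
        return start <= file_time <= end
    return file_time == reference      # a single datetime

files = {
    "sst_20230101.nc": datetime(2023, 1, 1, 12),
    "sst_20230105.nc": datetime(2023, 1, 5, 12),
}
kept = [name for name, t in sorted(files.items())
        if matches_time(t, (datetime(2023, 1, 1), datetime(2023, 1, 2)))]
# kept -> ["sst_20230101.nc"]
```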
- listing_parameters = {'time': <Parameter "time: numpy.datetime64">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, time: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
time – Period covered by the file. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
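The antimeridian behaviour of bbox can be sketched as follows. This is a hypothetical illustration of the rule described above (`split_bbox` is not part of the library), assuming the [-180, 180[ convention, where a crossing shows up as lon_min > lon_max.

```python
# Hypothetical sketch: a bbox whose longitude interval crosses the
# -180/180 meridian is split into two sub-boxes covering [lon_min, 180[
# and [-180, lon_max], matching the selection behaviour described above.
def split_bbox(bbox):
    lon_min, lat_min, lon_max, lat_max = bbox
    if lon_min > lon_max:  # crosses the antimeridian
        return [(lon_min, lat_min, 180.0, lat_max),
                (-180.0, lat_min, lon_max, lat_max)]
    return [bbox]

boxes = split_bbox((170.0, -10.0, -170.0, 10.0))
# boxes -> [(170.0, -10.0, 180.0, 10.0), (-180.0, -10.0, -170.0, 10.0)]
```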
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, a ValueError is raised.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.NetcdfFilesDatabaseSWH(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: BasicNetcdfFilesDatabaseSWH
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, sensorf: Sensors, time: Period, production_date: datetime64)#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick is also performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot isolate a unique subset, a deduplication operation is done and, if duplicates remain, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g.
lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with different production dates. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
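The Enum-field rule for sensorf (accepting either an enum member or its equivalent string) can be sketched with a tiny stand-in enum. `Sensors` here is a two-member hypothetical subset of the real enum, and `matches_sensor` is illustrative only.

```python
from enum import Enum

# Minimal stand-in for the real Sensors enum, to illustrate how an Enum
# field such as `sensorf` accepts an enum member or its string form.
class Sensors(Enum):
    J3 = "J3"
    S3A = "S3A"

def matches_sensor(file_value, reference):
    if isinstance(reference, str):
        reference = Sensors(reference)  # resolve the string form
    return file_value is reference

# matches_sensor(Sensors.J3, "J3") -> True
```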
- listing_parameters = {'production_date': <Parameter "production_date: numpy.datetime64">, 'sensorf': <Parameter "sensorf: fcollections.implementations._definitions._cmems.Sensors">, 'time': <Parameter "time: fcollections.time._periods.Period">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, sensorf: Sensors, time: Period, production_date: datetime64)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with different production dates. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- query(*, selected_variables: list[str] | None = None, bbox: tuple[float, float, float, float] | None = None, sensorf: Sensors, time: Period, production_date: datetime64)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – Variables that need to be read. Set to None to read everything
bbox – The bounding box (lon_min, lat_min, lon_max, lat_max) used to subset data. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If the bbox's longitude crosses the -180/180 meridian, data around the crossing and matching the bbox will be selected (e.g. longitude interval [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
sensorf – As an Enum field, it can be filtered using a reference <enum ‘Sensors’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘C2’, ‘C2N’, ‘EN’, ‘ENN’, ‘E1’, ‘E1G’, ‘E2’, ‘G2’, ‘H2A’, ‘H2AG’, ‘H2B’, ‘J1’, ‘J1G’, ‘J1N’, ‘J2’, ‘J2N’, ‘J2G’, ‘J3’, ‘J3N’, ‘J3G’, ‘AL’, ‘ALG’, ‘S3A’, ‘S3B’, ‘S6A’, ‘S6A_LR’, ‘S6A_HR’, ‘SWON’, ‘SWONC’, ‘TP’, ‘TPN’, ‘ALLSAT’, ‘DEMO_ALLSAT_SWOTS’, ‘ALLSAT_SWOS’, ‘CFO’, ‘H2C’, ‘SWOT’, ‘GIR’, ‘PIR’, ‘PMW’, ‘OLCI’, ‘MULTI’]
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
production_date – Production date of a given file. The same granule is regenerated multiple times with updated corrections, hence there can be multiple files for the same period but with different production dates. As a DateTime field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it is not included in or not equal to the reference Period or datetime, respectively. The reference value can be given as a string or tuple of strings following the numpy date format [%Y-%m-%dT%H:%M:%S]
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoOpenMfDataset object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">}#
- variables_info()#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, a ValueError is raised.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.NetcdfFilesDatabaseSwotLRL2(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: BasicNetcdfFilesDatabaseSwotLRL2
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, subset: ProductSubset, version: L2Version, bbox: tuple[float, float, float, float])#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick is also performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot isolate a unique subset, a deduplication operation is done and, if duplicates remain, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g.
lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed into the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (ex. PIC0). The product counter is a number that is incremented when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some missing attributes set to None. In this case, the check will be performed on these attributes only.
- Raises:
ValueError – In case unmix is True, an error is raised if one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
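The Integer-field rule for cycle_number and pass_number (a list, a slice or a single integer as reference) can be sketched as follows. `matches_int` is an illustrative stand-in, assuming slices carry explicit start and stop values.

```python
# Illustrative sketch of Integer-field filtering: the reference may be a
# list, a slice (with explicit start/stop) or a single integer.
def matches_int(value, reference):
    if isinstance(reference, slice):
        step = reference.step or 1
        return value in range(reference.start, reference.stop, step)
    if isinstance(reference, list):
        return value in reference
    return value == reference

# matches_int(3, slice(1, 5)) -> True
# matches_int(3, [1, 4, 5])   -> False
```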
- listing_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float]'">, 'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'level': <Parameter "level: fcollections.implementations._definitions._constants.ProductLevel">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: fcollections.implementations._l2_lr_ssh.L2Version">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: L2Version)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on a xarray dataset.
selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting this to False together with right_swath=False disables swath reading for Expert and Basic datasets
right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting this to False together with left_swath=False disables swath reading for Expert and Basic datasets
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed into the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (ex. PIC0). The product counter is a number that is incremented when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some missing attributes set to None. In this case, the check will be performed on these attributes only.
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
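The CYCLES stacking rule (cycles with missing passes are left over) can be sketched with plain tuples. The file keys and the expected pass set here are hypothetical; the real library derives them from the reference grid.

```python
# Sketch of CYCLES stacking, assuming files are keyed by
# (cycle_number, pass_number): only cycles covering every expected pass
# can be stacked, the others are left over.
files = [(1, 1), (1, 2), (2, 1)]   # (cycle_number, pass_number)
expected_passes = {1, 2}

by_cycle: dict[int, set[int]] = {}
for cycle, pass_ in files:
    by_cycle.setdefault(cycle, set()).add(pass_)

stackable_cycles = [c for c, passes in by_cycle.items()
                    if passes == expected_passes]
# stackable_cycles -> [1]  (cycle 2 misses pass 2)
```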
- predicate_classes: list[type[IPredicate]] | None = [<class 'fcollections.implementations.optional._predicates.SwotGeometryPredicate'>]#
List of predicates that are built at each query.
The predicates intercept the input parameters to build a custom record predicate. Usually, it is a complex test involving auxiliary data, such as ground track footprints or half_orbit/periods tables.
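A record predicate in the shape shown earlier (e.g. `lambda record: record[1] in [1, 4, 5]`) can be applied like this. The sample records are hypothetical, assuming index 1 of each parsed record holds the pass number.

```python
# Illustrative use of a record predicate on records parsed from file
# names; index 1 is assumed (for this sketch) to hold the pass number.
records = [("file_a.nc", 1), ("file_b.nc", 2), ("file_c.nc", 4)]
predicate = lambda record: record[1] in [1, 4, 5]
kept = [name for name, *_ in filter(predicate, records)]
# kept -> ["file_a.nc", "file_c.nc"]
```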
- query(*, subset: ProductSubset, selected_variables: list[str] | None = None, stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: L2Version)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting this to False together with right_swath=False disables swath reading for Expert and Basic datasets
right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting this to False together with left_swath=False disables swath reading for Expert and Basic datasets
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered by using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the [%Y-%m-%dT%H:%M:%S] format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
version – Version of the L2_LR_SSH product, composed of a CRID and a product counter. The CRID can be further decomposed into the timeliness (I/G), the baseline (A/B/C…) and the minor version (a number) (ex. PIC0). The product counter is a number that is incremented when a half orbit has been regenerated for the same CRID. This can happen if an anomaly is detected or if there is a change in the upstream data. As a L2Version field, this field can be tested by providing another L2Version instance. This instance can be partially set, with some missing attributes set to None. In this case, the check will be performed on these attributes only.
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoSwotReaderL2LRSSH object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'left_swath': <Parameter "left_swath: 'bool' = True">, 'right_swath': <Parameter "right_swath: 'bool' = False">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'stack': <Parameter "stack: 'StackLevel | str' = <StackLevel.NOSTACK: 1>">, 'subset': <Parameter "subset: 'ProductSubset'">}#
- variables_info(*, level: ProductLevel, subset: ProductSubset)#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, a ValueError is raised.
- Parameters:
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
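The "one unique and homogeneous subset" rule behind the ValueError above can be sketched as follows. The records, the keys and `unique_subset` are all hypothetical illustrations of the behaviour, not library code.

```python
# Sketch: partitioning keys (here level/subset) narrow the metadata
# records; if more than one distinct subset remains, a ValueError is
# raised, mirroring the variables_info behaviour described above.
records = [{"level": "L2", "subset": "Basic"},
           {"level": "L2", "subset": "Expert"}]

def unique_subset(records, **keys):
    remaining = [r for r in records
                 if all(r.get(k) == v for k, v in keys.items())]
    if len({tuple(sorted(r.items())) for r in remaining}) != 1:
        raise ValueError("could not extract one unique subset")
    return remaining[0]

# unique_subset(records, subset="Basic") narrows to the Basic subset;
# unique_subset(records) is ambiguous and raises ValueError.
```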
- class fcollections.implementations.NetcdfFilesDatabaseSwotLRL3(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases: BasicNetcdfFilesDatabaseSwotLRL3
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, subset: ProductSubset, version: str, bbox: tuple[float, float, float, float])#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
ValueError – If unmix is True and one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
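The predicates argument above is just a callable over the record parsed from a file name. A minimal sketch, assuming a record layout with an integer (e.g. a pass number) at index 1 purely for illustration:

```python
# Hypothetical record predicate, as described for the ``predicates``
# argument of list_files: a callable receiving the record parsed from
# a file name. The record layout (an integer at index 1) is assumed.
def keep_selected_passes(record):
    return record[1] in [1, 4, 5]

keep_selected_passes(("SWOT_L3", 4))   # → True
keep_selected_passes(("SWOT_L3", 2))   # → False
```

Because predicates see the whole parsed record, they can encode tests that the simple keyword filters cannot express.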
- listing_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float]'">, 'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'level': <Parameter "level: fcollections.implementations._definitions._constants.ProductLevel">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: str">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: str)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on an xarray dataset.
selected_variables – List of variables to select in the dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK
nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to False
swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to True
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
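The circularity handling described for the bbox parameter can be sketched as a small helper (hypothetical, not part of fcollections): a box whose longitude interval crosses the antimeridian is split into two sub-boxes before selection.

```python
def split_bbox(bbox):
    """Split a (lon_min, lat_min, lon_max, lat_max) box that crosses the
    longitude circularity into two sub-boxes, mirroring the behaviour
    described for the ``bbox`` parameter (illustrative helper only)."""
    lon_min, lat_min, lon_max, lat_max = bbox
    if lon_min <= lon_max:          # no crossing: keep the box as-is
        return [bbox]
    return [(lon_min, lat_min, 180.0, lat_max),
            (-180.0, lat_min, lon_max, lat_max)]

# e.g. longitude interval [170, -170] -> [170, 180[ and [-180, -170]
split_bbox((170.0, -10.0, -170.0, 10.0))
```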
- predicate_classes: list[type[IPredicate]] | None = [<class 'fcollections.implementations.optional._predicates.SwotGeometryPredicate'>]#
List of predicates that are built at each query.
The predicates intercept the input parameters to build a custom record predicate. Usually, it is a complex test involving auxiliary data, such as ground track footprints or half_orbit/periods tables.
- query(*, subset: ProductSubset, selected_variables: list[str] | None = None, stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, level: ProductLevel, version: str)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – List of variables to select in the dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left over. Defaults to NOSTACK
nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to False
swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped in the swath. Defaults to True
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
level – Product level of the data. As an Enum field, it can be filtered using a reference <enum ‘ProductLevel’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘L2’, ‘L3’, ‘L4’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoSwotReaderL3LRSSH object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'nadir': <Parameter "nadir: 'bool' = False">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'stack': <Parameter "stack: 'str | StackLevel' = <StackLevel.NOSTACK: 1>">, 'subset': <Parameter "subset: 'ProductSubset'">, 'swath': <Parameter "swath: 'bool' = True">}#
- variables_info(*, subset: ProductSubset, version: str)#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Parameters:
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
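The tri-number versioning shared by the L3_LR_SSH and L3_LR_WIND_WAVE products can be decomposed with a small helper (hypothetical, not part of fcollections):

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split an x.y.z product version into (major, minor, fix),
    following the documented convention: "x" is a major change,
    "y" a minor change and "z" a fix."""
    major, minor, fix = (int(part) for part in version.split("."))
    return major, minor, fix

parse_version("1.0.2")   # → (1, 0, 2)
```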
- class fcollections.implementations.NetcdfFilesDatabaseSwotLRWW(path: str, fs: AbstractFileSystem = LocalFileSystem(), enable_layouts: bool = True, follow_symlinks: bool = False)[source]#
Bases:
BasicNetcdfFilesDatabaseSwotLRWW
- list_files(sort: bool = False, deduplicate: bool = False, unmix: bool = False, predicates: tp.Iterable[IPredicate] = (), stat_fields: tuple[str] = (), *, cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, subset: ProductSubset, version: str, bbox: tuple[float, float, float, float])#
List the files matching the given criteria.
- Parameters:
sort – Sort the results using the sort_keys attribute of this class
deduplicate – In case the class deduplicator is defined, the results are analyzed to search for duplicates according to a set of unique keys. In case duplicates are found, deduplication is run along a set of defined columns where duplicates are expected to occur
unmix – Multiple subsets may be mixed in the files metadata table. Use this argument to separate the subsets. An auto pick will also be performed according to the SubsetsUnmixer instance of this class. In case the auto pick cannot get a unique subset, an error is raised
predicates – Additional complex filters to run on the record parsed from the filename, e.g. lambda record: record[1] in [1, 4, 5]. Predicates are knowledgeable about the record contents and the file name convention
stat_fields – File system information that can be retrieved from the underlying fsspec implementation. For example, ‘size’ or ‘created’ are valid for a local file system
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
ValueError – If unmix is True and one unique and homogeneous subset cannot be extracted from the files metadata table
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
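Reference values for the time filter follow the %Y-%m-%dT%H:%M:%S formatting mentioned above. A sketch with the standard library (the pair of datetimes stands in for the library's own Period type, which is an assumption here):

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"

# Parse reference strings exactly as formatted in the docstrings,
# then pair them up as a start/stop period.
start = datetime.strptime("2023-04-01T00:00:00", FMT)
stop = datetime.strptime("2023-04-08T00:00:00", FMT)
period = (start, stop)
```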
- listing_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float]'">, 'cycle_number': <Parameter "cycle_number: list[int] | slice | int">, 'pass_number': <Parameter "pass_number: list[int] | slice | int">, 'subset': <Parameter "subset: fcollections.implementations._definitions._swot.ProductSubset">, 'time': <Parameter "time: fcollections.time._periods.Period">, 'version': <Parameter "version: str">}#
- map(func: tp.Callable[[xr_t.Dataset, dict[str, tp.Any]], tp.Any], *, subset: ProductSubset, selected_variables: list[str] | None = None, tile: int | None = None, box: int | None = None, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, version: str)#
Map a function over dataset extracted from the files.
- Parameters:
func – Callable that works on an xarray dataset.
selected_variables – List of variables to select in the dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
tile – Tile size of the spectrum computation. Mandatory for the Extended subset
box – Box size of the spectrum computation. Mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
NotImplementedError – In case dask is not available
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
- predicate_classes: list[type[IPredicate]] | None = [<class 'fcollections.implementations.optional._predicates.SwotGeometryPredicate'>]#
List of predicates that are built at each query.
The predicates intercept the input parameters to build a custom record predicate. Usually, it is a complex test involving auxiliary data, such as ground track footprints or half_orbit/periods tables.
- query(*, subset: ProductSubset, selected_variables: list[str] | None = None, tile: int | None = None, box: int | None = None, bbox: tuple[float, float, float, float], cycle_number: list[int] | slice | int, pass_number: list[int] | slice | int, time: Period, version: str)#
Query a dataset by reading selected files in file system.
- Parameters:
selected_variables – List of variables to select in the dataset. Set to None (default) to disable the selection
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
tile – Tile size of the spectrum computation. Mandatory for the Extended subset
box – Box size of the spectrum computation. Mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension
cycle_number – Cycle number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
pass_number – Pass number of the half orbit. A half orbit is identified using a cycle number and a pass number. As an Integer field, it can be filtered using a reference value. The reference value can either be a list, a slice or an integer. The tested value from the file name will be filtered out if it is outside the given list/slice or not equal to the integer value.
time – Period covered by the file. As a Period field, it can be filtered by giving a reference Period or datetime. The tested value from the file name will be filtered out if it does not intersect the reference Period or does not contain the reference datetime. The reference value can be given as a string or tuple of strings following the %Y-%m-%dT%H:%M:%S format
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Returns:
A dataset containing the result of the query, or None if nothing matches the query
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- reader: IFilesReader | None = <fcollections.implementations.optional._reader.GeoSwotReaderL3WW object>#
Files reader.
- reading_parameters = {'bbox': <Parameter "bbox: 'tuple[float, float, float, float] | None' = None">, 'box': <Parameter "box: 'int | None' = None">, 'selected_variables': <Parameter "selected_variables: 'list[str] | None' = None">, 'subset': <Parameter "subset: 'ProductSubset'">, 'tile': <Parameter "tile: 'int | None' = None">}#
- variables_info(*, subset: ProductSubset, version: str)#
Returns the variables metadata.
Because the files collection may mix multiple subsets, we want to ensure that we return the variables of one subset only. The parameters of this method are the subset partitioning keys and can be given by the user to ensure a consistent set of variables. If the input parameters are not sufficient to unmix the subsets, the user will be notified with a ValueError
- Parameters:
subset – Subset of the LR Karin products. The Basic, Expert and Technical subsets are defined on a reference grid, opening the possibility of stacking the files, whereas the Unsmoothed subset is defined on a different grid for each cycle. The Light and Extended subsets are specific to the L3_LR_WIND_WAVE product. As an Enum field, it can be filtered using a reference <enum ‘ProductSubset’> or its equivalent string. The tested value found in the file name will be filtered out if it is not equal to the given enum field. Possible values are: [‘Basic’, ‘Expert’, ‘WindWave’, ‘Unsmoothed’, ‘Technical’, ‘Light’, ‘Extended’]
version – Version of the L3_LR_WIND_WAVE and L3_LR_SSH Swot products (they share their versioning). This is a tri-number version x.y.z, where “x” denotes a major change in the product, “y” a minor change and “z” a fix. As a String field, it can be filtered by giving a reference string. The tested value from the file name will be filtered out if it is not equal to the reference value.
- Raises:
LayoutMismatchError – In case enable_layouts is True and a mismatch between the layouts and the actual files is detected
ValueError – In case one unique and homogeneous subset could not be extracted from the files metadata table
- class fcollections.implementations.Origin(*values)[source]#
Bases:
Enum
Dataset origin.
- C3S = 2#
C3S.
- CCI = 3#
- CMEMS = 1#
Copernicus Marine.
- OSISAF = 4#
OSISAF.
- class fcollections.implementations.ProductClass(*values)[source]#
Bases:
Enum
Dataset product class.
- INS = 8#
In-situ.
- MOB = 7#
Multi observations.
- OC = 3#
Ocean Colour Thematic Assembly Center.
- SI = 4#
Sea Ice.
- SL = 2#
Sea Level Thematic Assembly Center.
- SST = 1#
Sea Surface Temperature Thematic Assembly Center.
- WAVE = 6#
Wave.
- WIND = 5#
Wind.
- class fcollections.implementations.ProductLevel(*values)[source]#
Bases:
Enum
Product level.
- L2 = 1#
Level-2 products.
- L3 = 2#
Level-3 products.
- L4 = 3#
Level-4 products.
- class fcollections.implementations.ProductSubset(*values)[source]#
Bases:
Enum
Swot product subset enum.
- Basic = 1#
Basic subset for L2_LR_SSH and L3_LR_SSH products.
- Expert = 2#
Expert subset for L2_LR_SSH and L3_LR_SSH products.
Expert subset contains all of the Basic subset fields.
- Extended = 7#
Extended subset for L3_LR_WIND_WAVE product.
Extended subset contains all of the Light subset data.
- Light = 6#
Light subset for L3_LR_WIND_WAVE product.
- Technical = 5#
Technical subset for L3_LR_SSH product.
Contains additional fields such as alternative corrections to be used by experts.
- Unsmoothed = 4#
Unsmoothed subset for L2_LR_SSH and L3_LR_SSH products.
- WindWave = 3#
WindWave subset for the L2_LR_SSH product.
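Enum fields such as subset accept either the member itself or its equivalent string. A minimal stand-in mirroring the documented ProductSubset values illustrates the coercion (sketch only; the real class lives in fcollections.implementations):

```python
from enum import Enum

# Stand-in enum copying the values documented above.
class ProductSubset(Enum):
    Basic = 1
    Expert = 2
    WindWave = 3
    Unsmoothed = 4
    Technical = 5
    Light = 6
    Extended = 7

def coerce_subset(value):
    """Accept a ProductSubset member or its equivalent string name."""
    return value if isinstance(value, ProductSubset) else ProductSubset[value]

coerce_subset("Expert")   # → ProductSubset.Expert
```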
- class fcollections.implementations.S1AOWISlicePostProcessing(*values)[source]#
Bases:
Enum
- CC = 1#
- CM = 2#
- OCN = 3#
- class fcollections.implementations.Sensors(*values)[source]#
Bases:
Enum
Aggregation of sensors for multiple CMEMS products.
SEALEVEL_GLO_PHY_L3_MY_008_062
SEALEVEL_GLO_PHY_L3_NRT_008_044
SEALEVEL_GLO_PHY_L4_NRT_008_046
SEALEVEL_GLO_PHY_L4_MY_008_047
WAVE_GLO_PHY_SWH_L3_NRT_014_001
SST_GLO_SST_L3S_NRT_OBSERVATIONS_010_010
OCEANCOLOUR_GLO_BGC_L3_MY_009_103
- AL = 21#
- ALG = 22#
- ALLSAT = 32#
- ALLSAT_SWOS = 34#
- C2 = 1#
- C2N = 2#
- CFO = 35#
- DEMO_ALLSAT_SWOTS = 33#
- E1 = 5#
- E1G = 6#
- E2 = 7#
- EN = 3#
- ENN = 4#
- G2 = 8#
- GIR = 38#
- H2A = 9#
- H2AG = 10#
- H2B = 11#
- H2C = 36#
- J1 = 12#
- J1G = 13#
- J1N = 14#
- J2 = 15#
- J2G = 17#
- J2N = 16#
- J3 = 18#
- J3G = 20#
- J3N = 19#
- MULTI = 42#
- OLCI = 41#
- PIR = 39#
- PMW = 40#
- S3A = 23#
- S3B = 24#
- S6A = 25#
- S6A_HR = 27#
- S6A_LR = 26#
- SWON = 28#
- SWONC = 29#
- SWOT = 37#
- TP = 30#
- TPN = 31#
- class fcollections.implementations.StackLevel(*values)[source]#
Bases: Enum
Stack level for swath half orbits on a reference grid.
Swath half orbits on a reference grid are by definition sampled at the same locations for each cycle. This means we can split the temporal dimension num_lines into one or two other dimensions, cycle_number and pass_number.
- CYCLES = 2#
Stack the cycles only.
- CYCLES_PASSES = 3#
Stack both the cycles and the passes.
- NOSTACK = 1#
No stack, dataset will be returned as (num_lines, num_pixels)
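The dimension split described above can be illustrated with a minimal numpy sketch. The sizes below are hypothetical, and the actual reader performs this reshaping on xarray datasets rather than raw arrays:

```python
import numpy as np

# Hypothetical sizes: 2 cycles of 3 passes, 100 lines per half orbit, 70 pixels.
n_cycles, n_passes, lines_per_pass, num_pixels = 2, 3, 100, 70

# NOSTACK layout: all half orbits concatenated along num_lines.
flat = np.arange(n_cycles * n_passes * lines_per_pass * num_pixels, dtype="f8")
flat = flat.reshape(n_cycles * n_passes * lines_per_pass, num_pixels)

# CYCLES_PASSES: num_lines split into (cycle_number, pass_number, num_lines).
cycles_passes = flat.reshape(n_cycles, n_passes, lines_per_pass, num_pixels)

# CYCLES: num_lines split into (cycle_number, num_lines) only.
cycles = flat.reshape(n_cycles, n_passes * lines_per_pass, num_pixels)
```

This split is only well defined because every half orbit on the reference grid has the same number of lines; that is why nadir data, whose point count varies per half orbit, cannot be stacked.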
- class fcollections.implementations.SwotPhases(*values)[source]#
Bases: Enum
Swot mission phase definitions.
- CALVAL = 1#
1-day repeat orbit, sparse geographical coverage.
- SCIENCE = 2#
21-day repeat orbit, quasi full geographical coverage.
- class fcollections.implementations.SwotReaderL2LRSSH(xarray_options: dict[str, str] | None = None)[source]#
Bases: OpenMfDataset
Reader for SWOT KaRIn L2_LR_SSH products.
- expected_coords: set[str] = {'latitude', 'longitude', 'time'}#
Variables we want set as coordinates in the output dataset
- read(subset: ProductSubset, files: list[str], selected_variables: list[str] | None = None, fs: AbstractFileSystem = fs_loc.LocalFileSystem(), stack: StackLevel | str = StackLevel.NOSTACK, left_swath: bool = True, right_swath: bool = False, preprocessor: Callable[[Dataset], Dataset] | None = None) Dataset[source]#
Read a dataset from L2_LR_SSH products.
- Parameters:
files – list of files to open. At least one file should be given
fs – File systems hosting the files
selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection
subset – Product subset (Basic, Expert, WindWave or Unsmoothed)
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
left_swath – Whether to load the left side of the swath for Unsmoothed datasets. Setting it to False in conjunction with right_swath=False disables swath reading for Expert and Basic datasets
right_swath – Whether to load the right side of the swath for Unsmoothed datasets. Setting it to False in conjunction with left_swath=False disables swath reading for Expert and Basic datasets
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and WindWave datasets, which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left out. Defaults to NOSTACK
- Raises:
ValueError – If the input list of files is empty
ValueError – If the input stack parameter does not match a valid StackLevel
- Returns:
An xarray dataset containing the data from the input files
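The bbox behaviour documented above (splitting a box whose longitude interval crosses the antimeridian into two sub-boxes) can be sketched in plain Python. split_bbox is a hypothetical helper written for illustration, not the library's actual implementation:

```python
def split_bbox(lon_min, lat_min, lon_max, lat_max):
    """Hypothetical helper mirroring the documented bbox behaviour;
    not the library's actual code."""

    def norm(lon):
        # Normalise any longitude to the [-180, 180[ convention.
        return ((lon + 180.0) % 360.0) - 180.0

    lon_min, lon_max = norm(lon_min), norm(lon_max)
    if lon_min <= lon_max:
        return [(lon_min, lat_min, lon_max, lat_max)]
    # The box crosses the circularity: e.g. [170, -170] becomes
    # [170, 180[ and [-180, -170].
    return [
        (lon_min, lat_min, 180.0, lat_max),
        (-180.0, lat_min, lon_max, lat_max),
    ]
```

For example, split_bbox(170, -10, -170, 10) yields the two sub-boxes (170, -10, 180, 10) and (-180, -10, -170, 10), matching the example given in the parameter description.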
- class fcollections.implementations.SwotReaderL3LRSSH(xarray_options: dict[str, str] | None = None)[source]#
Bases: OpenMfDataset
Reader for SWOT KaRIn L3_LR_SSH products.
- clipped_ssha: set[str] = {'ssha_filtered', 'ssha_noiseless', 'ssha_unedited', 'ssha_unfiltered'}#
SSHA variables with nadir data clipped into them. Should cover both present and older versions of the product
- expected_coords: set[str] = {'latitude', 'longitude', 'time'}#
Variables we want set as coordinates in the output dataset
- read(subset: ProductSubset, files: list[str], selected_variables: list[str] | None = None, fs: AbstractFileSystem = fs_loc.LocalFileSystem(), stack: str | StackLevel = StackLevel.NOSTACK, swath: bool = True, nadir: bool = False, preprocessor: Callable[[Dataset], Dataset] | None = None) Dataset[source]#
Read a dataset from L3_LR_SSH products.
- Parameters:
files – list of files to open. At least one file should be given
fs – File systems hosting the files
selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection
subset – Product subset (Basic, Expert, Technical or Unsmoothed)
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
stack – Whether to stack the cycles and passes of the dataset. This option is only available for Basic, Expert and Technical datasets, which are defined on a reference grid (fixed grid between cycles). Set to CYCLES_PASSES to stack both cycles and passes. Set to CYCLES to stack only the cycles, in which case cycles with missing passes will be left out. Defaults to NOSTACK
nadir – Whether to read the nadir data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped into the swath. Defaults to False
swath – Whether to read the swath data from the product. Only relevant for the Basic and Expert subsets, where the nadir data is clipped into the swath. Defaults to True
- Raises:
ValueError – If stack=CYCLES_PASSES or stack=CYCLES, swath=False and nadir=True. In this case, we would be trying to stack nadir data, which is not guaranteed to have the same number of points per half orbit. This case is not supported
ValueError – If swath=False and nadir=False. In this case, the user is asking for an empty return
ValueError – If the input list of files is empty
ValueError – If the input stack parameter does not match a valid StackLevel
- Returns:
An xarray dataset containing the dataset from the input files
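The combination rules listed in the Raises section above can be sketched as a small validation helper. validate_read_args and the inlined StackLevel mirror are hypothetical illustrations of the documented rules, not the library's actual code:

```python
from enum import Enum


class StackLevel(Enum):
    """Mirror of the StackLevel enum documented above (for illustration)."""
    NOSTACK = 1
    CYCLES = 2
    CYCLES_PASSES = 3


def validate_read_args(files, stack, swath, nadir):
    """Hypothetical sketch of the ValueError rules documented for
    SwotReaderL3LRSSH.read; not the library's actual code."""
    if not files:
        raise ValueError("at least one file should be given")
    if isinstance(stack, str):
        try:
            stack = StackLevel[stack.upper()]
        except KeyError:
            raise ValueError(f"{stack!r} does not match a valid StackLevel")
    if not swath and not nadir:
        # The user would be asking for an empty return.
        raise ValueError("swath=False and nadir=False would return nothing")
    if stack is not StackLevel.NOSTACK and not swath and nadir:
        # Nadir data has a varying number of points per half orbit.
        raise ValueError("stacking nadir-only data is not supported")
    return stack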
- class fcollections.implementations.SwotReaderL3WW(xarray_options: dict[str, str] | None = None)[source]#
Bases: OpenMfDataset
Reader for the SWOT L3_LR_WIND_WAVE product.
This reader handles both the Light and Extended subsets. The Light subset is simpler and has spectral content along the n_box dimension with default tile and box sizes. The Extended subset has multiple tile and box sizes stored in matching netcdf groups. Thus, tile and box sizes should generally be given for the Extended subset (see the read method for more details).
The L3_LR_WIND_WAVE product is built from the L3_LR_SSH product, and references ‘num_lines’ indices.
See also
SwotReaderL3LRSSH: the L3_LR_SSH product reader
- read(subset: ProductSubset, files: list[str], selected_variables: list[str] | None = None, fs: AbstractFileSystem = fs_loc.LocalFileSystem(), tile: int | None = None, box: int | None = None, preprocessor: Callable[[Dataset], Dataset] | None = None) Dataset[source]#
Read a dataset from L3_LR_WIND_WAVE products.
- Parameters:
files – list of files to open. At least one file should be given. If multiple files are given, variables along the n_box dimension will be concatenated. The other variables are constant and will not be repeated
fs – File systems hosting the files
selected_variables – list of variables to select in dataset. Set to None (default) to disable the selection
subset – Product subset (Light, Extended)
bbox – the bounding box (lon_min, lat_min, lon_max, lat_max) used to select the data in a given area. Longitude coordinates can be provided in [-180, 180[ or [0, 360[ convention. If bbox’s longitude crosses the circularity, it will be split in two subboxes to ensure a proper selection (e.g. longitude interval: [170, -170] -> data in [170, 180[ and [-180, -170] will be retrieved)
tile – Tile size of the spectrum computation. Mandatory for the Extended subset
box – Box size of the spectrum computation. Mandatory for the Extended subset if one of the requested variables is defined along the n_box dimension
- Raises:
ValueError – If the tile or box argument is given when reading a Light subset
ValueError – If the list of files is empty
ValueError – If the tile or box argument is missing for the Extended subset
ValueError – If the input subset matches neither Light nor Extended
ValueError – If the input tile or box size is not found in the files
- Returns:
An xarray dataset containing the data from the input files
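The tile/box rules above can be sketched in the same spirit. check_tile_box is a hypothetical helper and the tile/box sizes in the usage are made up; for simplicity the sketch treats box as always mandatory for Extended, whereas the reader only requires it when a requested variable is defined along n_box:

```python
def check_tile_box(subset, tile, box, available=frozenset()):
    """Hypothetical sketch of the documented tile/box rules for
    SwotReaderL3WW.read; not the library's actual code."""
    if subset == "Light":
        # Light has default tile and box sizes; none may be given.
        if tile is not None or box is not None:
            raise ValueError("tile/box must not be given for the Light subset")
    elif subset == "Extended":
        # Extended stores multiple tile/box sizes in netcdf groups.
        if tile is None or box is None:
            raise ValueError("tile and box are mandatory for the Extended subset")
        if available and (tile, box) not in available:
            raise ValueError(f"tile/box sizes ({tile}, {box}) not found in the files")
    else:
        raise ValueError("subset matches neither Light nor Extended")
```

For example, check_tile_box("Extended", 40, 10, {(40, 10)}) passes, while omitting tile or box for an Extended subset raises a ValueError.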
- class fcollections.implementations.Temporality(*values)[source]#
Bases: Enum
Temporality of the L3_LR_SSH product.
The L3_LR_SSH product is calibrated on nadir data. Nadir data has two temporalities in Copernicus Marine: reprocessed data (labelled as Multi-Year, MY) and near-real-time data (NRT).
This temporality in upstream data is reflected as reproc/forward to adopt the SWOT mission denomination. It is not the same definition as for the L2_LR_SSH product, where reprocessed data covers the PGC, PGD, … datasets and forward data covers the PIC, PID, … datasets.
See also
fcollections.implementations.DataType: Copernicus Marine data type definition (in our case for nadir data)
fcollections.implementations.Timeliness: L2_LR_SSH product temporality definition
- FORWARD = 2#
Forward data calibrated on the NRT nadir dataset.
- REPROC = 1#
Reprocessed data calibrated on the MY nadir dataset.
- class fcollections.implementations.Thematic(*values)[source]#
Bases: Enum
Dataset thematic.
- BGC = 2#
Biogeochemical.
- PHY = 1#
Physical.
- PHYBGC = 4#
Physical and biogeochemical.
- PHYBGCWAV = 5#
Physical, biogeochemical and wave.
- WAV = 3#
Wave.
- class fcollections.implementations.Timeliness(*values)[source]#
Bases: Enum
Timeliness of the SWOT L2_LR_SSH products.
- G = 2#
Reprocessed data.
- I = 1#
Forward data.
- class fcollections.implementations.Typology(*values)[source]#
Bases: Enum
Dataset typology.
- I = 1#
Instantaneous.
- M = 2#
Mean.
- class fcollections.implementations.Variable(*values)[source]#
Bases: Enum
Dataset variable group.
- CAR = 4#
Carbon.
- CHL = 3#
Chlorophyll.
- CUR = 2#
Currents.
- GEOPHY = 6#
Geophysical.
- HFLUX = 13#
Heat flux.
- MFLUX = 11#
Momentum flux.
- NUT = 5#
Nutrient.
- OPTICS = 9#
Optics.
- PLANKTON = 7#
Plankton.
- PP = 10#
Primary production.
- REFLECTANCE = 16#
Reflectance.
- SSH = 15#
Sea surface height.
- SWH = 14#
Significant wave height.
- TEMP = 1#
Temperature.
- TRANSP = 8#
Transparency.
- WFLUX = 12#
Water flux.
- fcollections.implementations.build_version_parser() FileNameConvention[source]#
Build file name convention to parse CRID versions.