API

Documentation of the core API of pyaro.

Pyaro

pyaro.list_timeseries_engines() → dict[str, Engine][source]

Return a dictionary of available timeseries_readers and their objects.

Return type:: dictionary

Notes

This function is available from the top-level pyaro namespace (engs = pyaro.list_timeseries_engines()). More information about each reader is available via the url and description properties of the corresponding Engine object.

# New selection mechanism introduced with Python 3.10. See GH6514.

pyaro.open_timeseries(name, *args, **kwargs) → Reader[source]

Open a timeseries reader directly, passing args and kwargs on to the open() function of the corresponding Engine.

Parameters:: name – the name of the entrypoint as key in list_timeseries_engines
Returns:: an implementation-object of a TimeseriesReader opened to a location

pyaro.timeseries_data_to_pd(data: Data)[source]

Convert pyaro.Data to a pandas dataframe

Parameters:: data – a pyaro Data object
Returns:: a pandas dataframe

pyaro.timeseries - User API

class pyaro.timeseries.Reader(filename_or_obj_or_url, *, filters=None)[source]

Baseclass for timeseries readers. A reader can be used as a context manager.

abstract close() → None[source]

Cleanup code for the reader.

This method will automatically be called when going out of context. Implement as dummy (pass) if no cleanup needed.
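The context-manager behaviour can be sketched with a stand-in class. `DemoReader` below is hypothetical and not part of pyaro; it only assumes, as described above, that leaving the `with` block triggers `close()`.

```python
class DemoReader:
    """Hypothetical stand-in for a Reader subclass (not pyaro code)."""

    def __init__(self, filename_or_obj_or_url, *, filters=None):
        self.source = filename_or_obj_or_url
        self.closed = False

    def close(self) -> None:
        # Cleanup code; a reader without resources could simply `pass`.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Leaving the with-block calls close(), even after an exception.
        self.close()
        return False


with DemoReader("obs.csv") as reader:
    inside = reader.closed  # still open while the context is active
```

After the block, `reader.closed` is set, so callers do not need to call `close()` themselves.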

abstract data(varname: str) → Data[source]

Return all data for a variable

Parameters:: varname – variable name as returned from variables
Returns:: a data object

metadata() → dict[str, str][source]

Metadata set by the datasource.

The reader-implementation might add metadata depending on the data-source to this method.

Returns:: dictionary with different metadata

abstract stations() → dict[str, Station][source]

Dictionary of all stations available for this reader.

Returns:: dictionary with station-id as returned from data to Station metadata.

abstract variables() → list[str][source]

List all variables available in this reader.

The variable-names returned here should already be changed if a VariableNameChanger is used.

Returns:: List of variable names.

class pyaro.timeseries.Data[source]

Baseclass for data returned from a pyaro.timeseries.Reader.

This is the minimum set of columns required for a reader to return. A reader is welcome to return a self-implemented subclass of Data.

abstract property altitudes: ndarray

A 1-dimensional array of altitudes (float)

Returns:: 1dim array of floats

abstract property end_times: ndarray

A 1-dimensional array of int64 datetimes indicating the end of the measurement

Returns:: 1dim array of datetime64

abstract property flags: ndarray

A 1-dimensional array of flags as defined in pyaro

Returns:: 1dim array of ints

abstract keys()[source]: all available data-fields, excluding variable and units which are considered metadata

abstract property latitudes: ndarray

A 1-dimensional array of latitudes (float)

Returns:: 1dim array of floats

abstract property longitudes: ndarray

A 1-dimensional array of longitudes (float)

Returns:: 1dim array of floats

abstract slice(index)[source]

Get a copy of this dataset as a slice.

Parameters:: index – a boolean index of the size of the data, or an integer array
Returns:: a new Data object

abstract property standard_deviations: ndarray

A 1-dimensional array of standard deviations. NaN marks a measurement without an available standard deviation.

Returns:: 1dim array of floats

abstract property start_times: ndarray

A 1-dimensional array of int64 datetimes indicating the start of the measurement

Returns:: 1dim array of datetime64

abstract property stations: ndarray

A 1-dimensional array of station identifiers (strings, usually name)

Returns:: 1dim array of strings, max-length 64-chars

abstract property units: str

Units in CF-notation, the same unit applies to all values

Returns:: Units in CF-notation

abstract property values: ndarray

A 1-dimensional float array of values.

Returns:: 1dim array of floats

abstract property variable: str

Variable name for all the data

Returns:: variable name

class pyaro.timeseries.Station(fields: dict | None = None, metadata: dict | None = None)[source]

Baseclass for a station returned from a pyaro.timeseries.Reader.

This is the minimum set of columns required for a reader to return. A reader is welcome to return a self-implemented subclass of Station.

All Station fields are accessible as a dict or as a property, e.g.

    td = Station()
    print(td.station)
    print(td["station"])

property altitude: float

Station altitude in meters

Returns:: altitude

property country: str

Station country as a two-letter ISO 3166-1 alpha-2 code

Returns:: country

init_kwargs() → dict[str, dict][source]

Return a dict representation of this class to make it easier to serialize as JSON. Station(**another_station.init_kwargs()) should create a copy of the station.

Returns:: a dict representation.

keys()[source]: all available data-fields, excluding variable and units which are considered metadata

property latitude: float

Latitude in range [-90, 90]

Returns:: latitude

property long_name: str

Long station name, does not need to be unique.

Returns:: long name

property longitude: float

Longitude in range [-180, 180]

Returns:: longitude

property metadata: dict

set_fields(fields: dict)[source]

Initialization code for this station. Only known data-fields will be read from data, i.e. it is not possible to extend TimeseriesData without subclassing.

Parameters:: fields – dict with the required fields: station, latitude, longitude, altitude, long_name, country, url
Raises:: KeyError – on missing field

property station: str

Station name, unique for the reader.

Returns:: station name

property url: str

URL with more information about the station

Returns:: url

class pyaro.timeseries.Flag(value)[source]

Flag of measurement data.

Parameters:: IntEnum – all flags are simple integers

BELOW_THRESHOLD = 2

INVALID = 1

VALID = 0
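Since the flags are plain integers, they can be compared directly with stored values. The following sketch re-declares the three documented values in a local `IntEnum` for illustration only; it is not the pyaro class itself, and the `flags` list is made-up sample data.

```python
from enum import IntEnum


class Flag(IntEnum):
    """Local mirror of the documented values, for illustration only."""
    VALID = 0
    INVALID = 1
    BELOW_THRESHOLD = 2


flags = [0, 2, 1, 0]  # flags as plain integers, as a reader might store them

# IntEnum members compare equal to their integer value.
valid_idx = [i for i, f in enumerate(flags) if f == Flag.VALID]

# Integers can be turned back into named flags.
names = [Flag(f).name for f in flags]
```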

pyaro.timeseries.filters - Filters

class pyaro.timeseries.Filter.FilterCollection(filterlist=[])[source]

Bases: object

A collection of DataIndexFilters which can be applied together.

Parameters:: filterlist – initial list of filters, defaults to []

add(difilter: DataIndexFilter)[source]

filter(ts_reader, variable: str) → Data[source]

Filter the data for a variable of a reader with all filters in this collection.

Parameters:

ts_reader – a timeseries-reader instance
variable – a valid variable-name

Returns:

filtered data

filter_data(data: Data, stations: dict[str, Station], variables: str) → Data[source]

Filter data with all filters in this collection.

Parameters:

data – Data from a timeseries-reader, i.e. retrieved by ts.data(varname)
stations – stations-dict of a reader, i.e. retrieved by ts.stations()
variables – variables of a reader, i.e. retrieved by ts.variables()

Returns:

filtered data

class pyaro.timeseries.Filter.FilterFactory[source]

Bases: object

get(name, **kwargs)[source]

Get a filter by name. If kwargs are given, they will be sent to the filter's new method.

Parameters:: name – a filter-name
Returns:: a filter, optionally initialized

instance = <pyaro.timeseries.Filter.FilterFactory object>

list() → dict[str, Filter][source]: List all available filter-names and their initializations

register(filter: Filter)[source]

Register a new filter to the factory with a filter object (might be empty)

Parameters:: filter – a filter implementation

class pyaro.timeseries.Filter.AltitudeFilter(min_altitude: float | None = None, max_altitude: float | None = None)[source]

Bases: StationReductionFilter

Filter which filters stations based on their altitude. Can be used to filter for a minimum and/or maximum altitude.

Parameters:

min_altitude – minimum altitude in meters required to keep the station (inclusive)
max_altitude – maximum altitude in meters required to keep the station (inclusive)

If station elevation is nan, it is always excluded.
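The selection rule described above (inclusive bounds, NaN always excluded) can be sketched in plain Python. The helper `keep_altitude` and the sample stations are hypothetical; this is an illustration of the rule, not the filter's actual implementation.

```python
import math


def keep_altitude(altitude, min_altitude=None, max_altitude=None):
    """Return True if a station at `altitude` (meters) passes the bounds."""
    if math.isnan(altitude):
        return False  # NaN altitude is always excluded
    if min_altitude is not None and altitude < min_altitude:
        return False
    if max_altitude is not None and altitude > max_altitude:
        return False
    return True


# Made-up sample altitudes in meters.
altitudes = {"lowland": 12.0, "summit": 2500.0, "unknown": float("nan")}
kept = [name for name, alt in altitudes.items() if keep_altitude(alt, 0, 2500)]
```

The station at exactly 2500 m is kept because both bounds are inclusive.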

filter_stations(stations: dict[str, Station]) → dict[str, Station][source]

Filtering of stations list

Parameters:: stations – List of stations, e.g. from a Reader.stations() call
Returns:: dict of filtered stations

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

class pyaro.timeseries.Filter.BoundingBoxFilter(include: list[tuple[float, float, float, float]] = [], exclude: list[tuple[float, float, float, float]] = [])[source]

Bases: StationReductionFilter

Filter using geographical bounding-boxes. Coordinates should be given in the range [-180,180] (degrees_east) for longitude and [-90,90] (degrees_north) for latitude. Order of coordinates is clockwise starting with north, i.e.: (north, east, south, west) = NESW

Parameters:

include – bounding boxes to include. Each bounding box is a tuple of four float for (NESW), defaults to [] meaning no restrictions
exclude – bounding boxes to exclude. Defaults to []

Raises:

BoundingBoxException – on any errors of the bounding boxes
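A minimal sketch of the NESW convention follows. It assumes that box edges are inclusive, that boxes do not cross the antimeridian, and that an excluded box takes precedence over an included one; none of these details are stated above. The helper names and the sample boxes are illustrative only.

```python
def in_bbox(latitude, longitude, bbox):
    """bbox is (north, east, south, west) in degrees."""
    north, east, south, west = bbox
    return south <= latitude <= north and west <= longitude <= east


def has_location(latitude, longitude, include=(), exclude=()):
    # An empty include list means no restriction.
    included = not include or any(in_bbox(latitude, longitude, b) for b in include)
    excluded = any(in_bbox(latitude, longitude, b) for b in exclude)
    return included and not excluded


# Rough illustrative boxes, given as (N, E, S, W).
box_a = (72.0, 40.0, 35.0, -25.0)
box_b = (67.0, -13.0, 63.0, -25.0)

point_1 = has_location(59.9, 10.7, include=[box_a], exclude=[box_b])
point_2 = has_location(64.1, -21.9, include=[box_a], exclude=[box_b])
```

Here `point_1` lies inside `box_a` only, while `point_2` also lies inside the excluded `box_b`.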

filter_stations(stations: dict[str, Station]) → dict[str, Station][source]

Filtering of stations list

Parameters:: stations – List of stations, e.g. from a Reader.stations() call
Returns:: dict of filtered stations

has_location(latitude, longitude)[source]

Test if the location's coordinates are part of this filter.

Parameters:

latitude – latitude coordinate in degree_north [-90, 90]
longitude – longitude coordinate in degree_east [-180, 180]

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

class pyaro.timeseries.Filter.CountryFilter(include: list[str] = [], exclude: list[str] = [])[source]

Bases: StationReductionFilter

Filter countries by ISO2 names (capitals!)

Parameters:

include – countries to include, defaults to [], meaning all countries
exclude – countries to exclude, defaults to [], meaning none

filter_stations(stations: dict[str, Station]) → dict[str, Station][source]

Filtering of stations list

Parameters:: stations – List of stations, e.g. from a Reader.stations() call
Returns:: dict of filtered stations

has_country(country) → bool[source]

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

class pyaro.timeseries.Filter.DuplicateFilter(duplicate_keys: list[str] | None = None)[source]

Bases: DataIndexFilter

Remove duplicates from the data. By default, data with common station, start_time and end_time are considered duplicates. Only one of the duplicates is kept.

Parameters:: duplicate_keys – list of data-fields/columns, defaults to None, being the same as [“stations”, “start_times”, “end_times”]

default_keys = ['stations', 'start_times', 'end_times']
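The idea of a duplicate index can be sketched with plain dictionaries. The text above does not say which duplicate is kept; this sketch assumes the first occurrence wins. The rows and the helper `unique_index` are hypothetical.

```python
rows = [
    {"stations": "A", "start_times": "2020-01-01", "end_times": "2020-01-02", "values": 1.0},
    {"stations": "A", "start_times": "2020-01-01", "end_times": "2020-01-02", "values": 1.5},
    {"stations": "B", "start_times": "2020-01-01", "end_times": "2020-01-02", "values": 2.0},
]
default_keys = ["stations", "start_times", "end_times"]


def unique_index(rows, keys=default_keys):
    """Boolean index that is True for the first row of each key combination."""
    seen = set()
    index = []
    for row in rows:
        key = tuple(row[k] for k in keys)
        index.append(key not in seen)
        seen.add(key)
    return index


idx = unique_index(rows)
```

Passing a different `keys` list, e.g. only `["stations"]`, changes what counts as a duplicate.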

filter_data_idx(data: Data, stations: dict[str, Station], variables: list[str])[source]

Filter data to an index which can be applied to Data.slice(idx) later

Returns:: an index for Data.slice(idx)

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

class pyaro.timeseries.Filter.FlagFilter(include: list[Flag] = [], exclude: list[Flag] = [])[source]

Bases: DataIndexFilter

Filter data by Flags

Parameters:

include – flags to include, defaults to [], meaning all flags
exclude – flags to exclude, defaults to [], meaning none

filter_data_idx(data: Data, stations: dict[str, Station], variables: list[str])[source]

Filter data to an index which can be applied to Data.slice(idx) later

Returns:: an index for Data.slice(idx)

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

usable_flags()[source]

class pyaro.timeseries.Filter.RelativeAltitudeFilter(topo_file: str | None = None, topo_var: str = 'topography', rdiff: float = 0)[source]

Bases: StationFilter

Filter class which filters stations based on the relative difference between the station altitude, and the gridded topography altitude.

Parameters:

topo_file – A .nc file from which to read gridded topography data.
topo_var – Name of variable that stores altitude.
rdiff – Relative difference (in meters).

Note:

Stations will be kept if abs(altobs-altmod) <= rdiff.
Stations will not be kept if station altitude is NaN.

Note:

This filter requires additional dependencies (xarray, netcdf4, cf-units) to function. These can be installed with `pip install .[optional]`.

filter_stations(stations: dict[str, Station]) → dict[str, Station][source]

Filtering of stations list

Parameters:: stations – List of stations, e.g. from a Reader.stations() call
Returns:: dict of filtered stations

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

class pyaro.timeseries.Filter.StationFilter(include: list[str] = [], exclude: list[str] = [])[source]

Bases: StationReductionFilter

filter_stations(stations: dict[str, Station]) → dict[str, Station][source]

Filtering of stations list

Parameters:: stations – List of stations, e.g. from a Reader.stations() call
Returns:: dict of filtered stations

has_station(station) → bool[source]

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

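class pyaro.timeseries.Filter.TimeBoundsFilter(start_include: list[tuple[str, str]] = [], start_exclude: list[tuple[str, str]] = [], startend_include: list[tuple[str, str]] = [], startend_exclude: list[tuple[str, str]] = [], end_include: list[tuple[str, str]] = [], end_exclude: list[tuple[str, str]] = [])[source]

Bases: DataIndexFilter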

Filter data by start and/or end-times of the measurements. Each timebound consists of a bound-start and bound-end (both included). Timestamps are given as YYYY-MM-DD HH:MM:SS in UTC

Parameters:

start_include – list of tuples of start-times, defaults to [], meaning all
start_exclude – list of tuples of start-times, defaults to []
startend_include – list of tuples of start and end-times, defaults to [], meaning all
startend_exclude – list of tuples of start and end-times, defaults to []
end_include – list of tuples of end-times, defaults to [], meaning all
end_exclude – list of tuples of end-times, defaults to []

Raises:

TimeBoundsException – on any errors with the time-bounds

Examples:

end_include: [("2023-01-01 10:00:00", "2024-01-01 07:00:00")] will only include observations whose end time lies within the specified interval (i.e. end >= "2023-01-01 10:00:00" and end <= "2024-01-01 07:00:00").

Including multiple bounds will act as an OR, allowing multiple selections. If we want every observation in January for 2021, 2022, 2023, and 2024 this could be made as the following filter:

startend_include: [
    ("2021-01-01 00:00:00", "2021-02-01 00:00:00"),
    ("2022-01-01 00:00:00", "2022-02-01 00:00:00"),
    ("2023-01-01 00:00:00", "2023-02-01 00:00:00"),
    ("2024-01-01 00:00:00", "2024-02-01 00:00:00"),
]
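The OR behaviour of multiple bounds can be sketched with the standard library. The sketch assumes that a startend bound matches when both the start and the end of an observation fall inside the same bound (both ends inclusive); the helper names are hypothetical and this is not the filter's actual code.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # same format as the documented timestamps


def parse_bounds(bounds):
    """Convert (start, end) string tuples into datetime tuples."""
    return [
        (datetime.strptime(s, TIME_FORMAT), datetime.strptime(e, TIME_FORMAT))
        for s, e in bounds
    ]


def startend_included(start, end, bounds):
    """True if the observation lies within any one of the bounds."""
    return any(b0 <= start and end <= b1 for b0, b1 in bounds)


januaries = parse_bounds([
    ("2021-01-01 00:00:00", "2021-02-01 00:00:00"),
    ("2022-01-01 00:00:00", "2022-02-01 00:00:00"),
])
jan_obs = startend_included(datetime(2022, 1, 5), datetime(2022, 1, 6), januaries)
jul_obs = startend_included(datetime(2021, 7, 1), datetime(2021, 7, 2), januaries)
```

An observation in January 2022 matches the second bound, while one in July 2021 matches none.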

contains(dt_start: ndarray[tuple[int, ...], dtype[datetime64]], dt_end: ndarray[tuple[int, ...], dtype[datetime64]]) → ndarray[tuple[int, ...], dtype[bool]][source]

Test if datetimes in dt_start, dt_end belong to this filter

Parameters:

dt_start – start of each observation as a numpy array of datetimes
dt_end – end of each observation as a numpy array of datetimes

Returns:

numpy boolean array with True/False values

envelope() → tuple[datetime, datetime][source]

Get the earliest and latest time possible for this filter.

Returns:: earliest start and end-time (approximately)
Raises:: TimeBoundsException – if has_envelope() is False, or internal errors

filter_data_idx(data: Data, stations: dict[str, Station], variables: list[str]) → ndarray[tuple[int, ...], dtype[bool]][source]

Filter data to an index which can be applied to Data.slice(idx) later

Returns:: an index for Data.slice(idx)

has_envelope() → bool[source]: Check if this filter has an envelope, i.e. an earliest and latest time

init_kwargs() → dict[str, list[tuple[str, str]]][source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

class pyaro.timeseries.Filter.TimeResolutionFilter(resolutions: list[str] = [])[source]

Bases: DataIndexFilter

The time-resolution filter restricts the observation data to certain time-resolutions. Time-resolutions are not exact, and might be interpreted slightly differently by different observation networks.

Default named time-resolutions are

minute: 59 to 61 s (+-1sec)
hour: 59*60 s to 61*60 s (+-1min)
day: 22:59:00 to 25:01:00 to allow for leap-days and an extra minute
week: 6 to 8 days (+-1 day)
month: 27-33 days (30 +- 3 days)
year: 360-370 days (+- 5days)

Parameters:: resolutions – a list of wanted time resolutions. A resolution consists of an integer number and a time-resolution name, e.g. 3 hour (no plural).
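How a resolution string such as `3 hour` might translate into a duration range can be sketched as follows. The dictionary and the regular expression are copied from the class attributes `named_resolutions` and `pattern` documented for this class; multiplying both bounds by the integer is an assumption, and the helper names are hypothetical.

```python
import re

# Values copied from the documented class attributes (in seconds).
named_resolutions = {
    "minute": (59, 61), "hour": (3540, 3660), "day": (82740, 90060),
    "week": (518400, 691200), "month": (2332800, 2851200),
    "year": (31104000, 31968000),
}
pattern = re.compile(r"\s*(\d+)\s*(\w+)\s*")


def resolution_bounds(resolution):
    """Return (min_seconds, max_seconds) for a string such as '3 hour'."""
    match = pattern.fullmatch(resolution)
    if match is None or match.group(2) not in named_resolutions:
        raise ValueError(f"unknown resolution: {resolution!r}")
    count = int(match.group(1))
    low, high = named_resolutions[match.group(2)]
    # Assumption: the tolerance scales with the number of units.
    return count * low, count * high


def matches(start_s, end_s, resolutions):
    """True if the duration end_s - start_s fits any wanted resolution."""
    duration = end_s - start_s
    return any(lo <= duration <= hi for lo, hi in map(resolution_bounds, resolutions))
```

For instance, `resolution_bounds("3 hour")` gives a range of 10620 to 10980 seconds under this assumption.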

filter_data_idx(data: Data, stations: dict[str, Station], variables: list[str])[source]

Filter data to an index which can be applied to Data.slice(idx) later

Returns:: an index for Data.slice(idx)

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

named_resolutions = {'day': (82740, 90060), 'hour': (3540, 3660), 'minute': (59, 61), 'month': (2332800, 2851200), 'week': (518400, 691200), 'year': (31104000, 31968000)}

pattern = re.compile('\\s*(\\d+)\\s*(\\w+)\\s*')

class pyaro.timeseries.Filter.TimeVariableStationFilter(exclude=[], exclude_from_csvfile='')[source]

Bases: DataIndexFilter

Exclude combinations of variable, station and time from the data.

This filter is really a cleanup of the database, but sometimes it is not possible to modify the original database and the cleanup needs to be done on a filter basis.

Parameters:

exclude – list of tuples of 4 elements: (start-time, end-time, variable, station)
exclude_from_csvfile – helper option to read a large list of excludes from a tab-separated file with the columns

start end variable station

where start and end are timestamps of format YYYY-MM-DD HH:MM:SS in UTC, e.g. the year 2020 is:

2020-01-01 00:00:00 2020-12-31 23:59:59 …

filter_data_idx(data: Data, stations: dict[str, Station], variables: list[str])[source]

Filter data to an index which can be applied to Data.slice(idx) later

Returns:: an index for Data.slice(idx)

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

class pyaro.timeseries.Filter.ValleyFloorRelativeAltitudeFilter(topo: str | None = None, *, radius: float = 5000, topo_var: str = 'Band1', lower: float | None = None, upper: float | None = None, keep_nan: bool = True)[source]

Bases: StationFilter

Filter stations based on the difference between the station altitude and the valley floor altitude (defined as the lowest altitude within a radius around the station). This ensures that plateau sites are treated like “surface” sites, while sites in hilly or mountainous areas (e.g. Schauinsland) are considered mountain sites. This approach has been used by several papers (e.g. Fowler et al., Lloibl et al. 1994).

Parameters:

topo – Topography file path (either a file or a directory). Must be a dataset openable by xarray, with latitude and longitude stored as “lat” and “lon” respectively. The variable that contains elevation data is assumed to be in meters. If topo is a directory, a metadata.json file containing the geographic bounds of each file must be present (see below for example).
radius – Radius (in meters)
topo_var – Variable name to use in topography dataset
lower – Optional lower bound needed for relative altitude for station to be kept (in meters)
upper – Optional upper bound needed for relative altitude for station to be kept (in meters)
keep_nan – Whether to keep values where relative altitude is calculated as nan. Defaults to True. Note: Since the topography does not contain values for oceans this may happen for small islands and coastal stations.
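The relative-altitude rule can be sketched without any topography file. The sketch assumes that the topography values within the radius have already been extracted into a list and that `lower` and `upper` are inclusive bounds; both helpers and all numbers are hypothetical and do not reproduce the filter's actual implementation.

```python
import math


def relative_altitude(station_altitude, nearby_altitudes):
    """Station altitude minus the lowest valid topography value in the radius."""
    valid = [a for a in nearby_altitudes if not math.isnan(a)]
    if not valid:
        return float("nan")  # e.g. only ocean cells around a small island
    return station_altitude - min(valid)


def keep_station(rel_alt, lower=None, upper=None, keep_nan=True):
    """Apply the optional bounds; NaN is handled by keep_nan."""
    if math.isnan(rel_alt):
        return keep_nan
    if lower is not None and rel_alt < lower:
        return False
    if upper is not None and rel_alt > upper:
        return False
    return True


# A plateau site sits close to its valley floor, a mountain site far above it.
plateau = relative_altitude(1200.0, [1180.0, 1195.0, 1210.0])
mountain = relative_altitude(1200.0, [250.0, 600.0, 1100.0])
```

With `upper=500`, the plateau site would be kept and the mountain site dropped, although both stations have the same absolute altitude.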

Raises:

ModuleNotFoundError – If the required additional dependencies (cf_units, xarray) are not available.

Note

This implementation has only been tested with the GTOPO30 dataset so far.

Available versions of gtopo30 can be found here: /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/GTOPO30/

Note

metadata.json should contain a mapping from each nc file to its geographic latitude/longitude bounds.

For example:

```
{
    "N.nc": {"w": -180, "e": 180, "n": 90, "s": -10},
    "S.nc": {"w": -180, "e": 180, "n": -10, "s": -90}
}
```

filter_stations(stations: dict[str, Station]) → dict[str, Station][source]

Filtering of stations list

Parameters:: stations – List of stations, e.g. from a Reader.stations() call
Returns:: dict of filtered stations

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

class pyaro.timeseries.Filter.VariableNameFilter(reader_to_new: dict[str, str] = {}, include: list[str] = [], exclude: list[str] = [])[source]

Bases: Filter

Filter to change variable-names and/or include/exclude variables

Parameters:

reader_to_new – dictionary from readers-variable names to new variable-names, e.g. used in your project, defaults to {}
include – list of variables to include only (new names if changed), defaults to [] meaning keep all variables unless excluded.
exclude – list of variables to exclude (new names if changed), defaults to []
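The interplay of renaming and include/exclude can be sketched in plain Python. As stated above, include and exclude refer to the new names; the function `filter_variables` and the variable names are illustrative only and not the filter's actual code.

```python
def filter_variables(variables, reader_to_new=None, include=(), exclude=()):
    """Rename reader variables, then apply include/exclude on the new names."""
    reader_to_new = reader_to_new or {}
    result = []
    for var in variables:
        new_name = reader_to_new.get(var, var)  # unchanged if not mapped
        if include and new_name not in include:
            continue
        if new_name in exclude:
            continue
        result.append(new_name)
    return result


reader_vars = ["SOx", "NOx", "PM10"]
renamed = filter_variables(
    reader_vars, {"SOx": "oxidised_sulphur"}, exclude=["PM10"]
)
```

Variables without an entry in the mapping keep their reader name.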

filter_data(data, stations, variables) → Data[source]: Translate data’s variable

filter_variables(variables: list[str]) → list[str][source]

change variable name and reduce variables applying include and exclude parameters

Parameters:: variables – variable names as in the reader
Returns:: valid variable names in translated nomenclature

has_reader_variable(variable) → bool[source]

Check if variable-name is in the list of variables applying include and exclude

Parameters:: variable – variable as returned from the reader
Returns:: True or False

has_variable(variable) → bool[source]

check if a variable-name is in the list of variables applying include and exclude

Parameters:: variable – variable name in translated, i.e. new scheme
Returns:: True or False

init_kwargs()[source]: return the init kwargs

name()[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

new_varname(reader_variable: str) → str[source]

convert a reader-variable to a new variable name

Parameters:: reader_variable – variable as used in the reader
Returns:: variable name after translation

reader_varname(new_variable: str) → str[source]

convert a new variable name to a reader-variable name

Parameters:: new_variable – variable name after translation
Returns:: variable name in the original reader

pyaro.timeseries - Dev API

class pyaro.timeseries.Engine[source]

The engine is the ‘singleton’ generator object for databases of the engine's type.

abstract property args: list[str]: return a tuple of parameters to be passed to open_timeseries, including the mandatory filename_or_obj_or_url parameter.

abstract property description: Get a descriptive string about this pyaro implementation.

abstract open(filename_or_obj_or_url, *, filters=None)[source]

open-function of the timeseries, initializing the reader-object, i.e. equivalent to Reader(filename_or_object_or_url, *, filters)

Returns:: pyaro.timeseries.Reader
Raises:: UnknownFilterException

abstract property supported_filters: list[str]

The class-names of the supported filters by this reader.

If the reader is called with a filter which is not an instance of one of these classes, it is supposed to raise an UnknownFilterException. Using a subclass of a filter is not allowed unless explicitly listed here.

Returns:: list of classnames

abstract property url

Get a URL with more information or documentation about the datasource engine.

This should be the github-url or similar of the implementation.

class pyaro.timeseries.NpStructuredData(variable: str = '', units: str = '')[source]

An implementation of Data using numpy Structured Arrays.

This is the minimum set of columns required for a reader to return. A reader is welcome to return a self-implemented subclass of Data.

Data can be added by rows with the append method, or a completed numpy.StructuredArray can be submitted using set_data.

property altitudes: ndarray

A 1-dimensional array of altitudes (float)

Returns:: 1dim array of floats

append(value, station, latitude, longitude, altitude, start_time, end_time, flag=Flag.VALID, standard_deviation=nan)[source]

Append a new data-row, or several rows given as numpy arrays.

Parameters:

value
station
latitude
longitude
altitude
start_time
end_time
flag – defaults to Flag.VALID
standard_deviation – defaults to np.nan

property end_times: ndarray

A 1-dimensional array of int64 datetimes indicating the end of the measurement

Returns:: 1dim array of datetime64

property flags: ndarray

A 1-dimensional array of flags as defined in pyaro

Returns:: 1dim array of ints

keys()[source]: all available data-fields, excluding variable and units which are considered metadata

property latitudes: ndarray

A 1-dimensional array of latitudes (float)

Returns:: 1dim array of floats

property longitudes: ndarray

A 1-dimensional array of longitudes (float)

Returns:: 1dim array of floats

set_data(variable: str, units: str, data: array)[source]

Initialization code for the data. Only known data-fields will be read from data, i.e. it is not possible to extend TimeseriesData without subclassing.

Parameters:

variable – variable name
units – variable units
data – a numpy structured array with all fields (see append)

Raises:

KeyError – on missing field
Exception – if not all data-ndarrays have same size
Exception – if not all data-fields are ndarrays

slice(index)[source]

Get a copy of this dataset as a slice.

Parameters:: index – a boolean index of the size of the data, or an integer array
Returns:: a new Data object

property standard_deviations: ndarray

A 1-dimensional array of standard deviations. NaN marks a measurement without an available standard deviation.

Returns:: 1dim array of floats

property start_times: ndarray

A 1-dimensional array of int64 datetimes indicating the start of the measurement

Returns:: 1dim array of datetime64

property stations: ndarray

A 1-dimensional array of station identifiers (strings, usually name)

Returns:: 1dim array of strings, max-length 64-chars

property units: str

Units in CF-notation, the same unit applies to all values

Returns:: Units in CF-notation

property values: ndarray

A 1-dimensional float array of values.

Returns:: 1dim array of floats

property variable: str

Variable name for all the data

Returns:: variable name

class pyaro.timeseries.AutoFilterReaderEngine.AutoFilterEngine[source]

The AutoFilterEngine class implements the supported_filters and args methods using introspection of the corresponding reader class. Subclasses therefore need to implement the reader_class method.

args()[source]: return a tuple of parameters to be passed to open_timeseries, including the mandatory filename_or_obj_or_url parameter.

open(filename, *args, **kwargs) → Reader[source]

open-function of the timeseries, initializing the reader-object, i.e. equivalent to Reader(filename_or_object_or_url, *, filters)

Returns:: pyaro.timeseries.Reader
Raises:: UnknownFilterException

abstract reader_class() → AutoFilterReader[source]

return the class of the corresponding reader

Returns:: the class returned from open

supported_filters() → list[Filter][source]

The supported filters by this Engine. Maps to the Readers supported_filters.

Returns:: a list of filters

class pyaro.timeseries.AutoFilterReaderEngine.AutoFilterReader(filename_or_obj_or_url, *, filters=None)[source]

This helper class automatically applies all filters to the Reader methods Reader.data, Reader.stations and Reader.variables. For this to work, the reader needs to implement _unfiltered_data, _unfiltered_stations and _unfiltered_variables.

It also adds an overridable classmethod supported_filters() listing all possible filters. This is used both by the AutoFilterEngine and by the check_filters method, which should be used during initialization when filters are given.

The implementation must also use _set_filters() to add the filters from __init__.
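The division of work between the helper class and a concrete reader can be sketched with stand-in classes. Everything below is hypothetical: `AutoFilterSketch` is not pyaro's class, filters are modelled as plain functions, and the station ids are made up. It only illustrates that the subclass provides the unfiltered data while the base class applies the filters.

```python
class AutoFilterSketch:
    """Hypothetical base class: the public method applies the filters."""

    def _set_filters(self, filters):
        self._filters = list(filters or [])

    def stations(self):
        stations = self._unfiltered_stations()
        for fil in self._filters:
            stations = fil(stations)  # each filter reduces the dict
        return stations


class DemoReader(AutoFilterSketch):
    def __init__(self, filename_or_obj_or_url, *, filters=None):
        self._set_filters(filters)  # register the filters from __init__

    def _unfiltered_stations(self):
        # Made-up station ids and names.
        return {"NO-1": "first", "DE-1": "second"}


def only_no(stations):
    # Stand-in for a station filter, modelled as a plain function here.
    return {k: v for k, v in stations.items() if k.startswith("NO")}


reader = DemoReader("demo.csv", filters=[only_no])
result = reader.stations()
```

The reader author never calls the filters directly; calling `stations()` is enough.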

_get_filters() → list[Filter][source]

Get a list of filters actually set during initialization of this object.

Returns:: list of filters

_set_filters(filters)[source]

abstract _unfiltered_data(varname) → Data[source]

abstract _unfiltered_stations() → dict[str, Station][source]

abstract _unfiltered_variables() → list[str][source]

data(varname) → Data[source]

Return all data for a variable

Parameters:: varname – variable name as returned from variables
Returns:: a data object

stations() → dict[str, Station][source]

Dictionary of all stations available for this reader.

Returns:: dictionary with station-id as returned from data to Station metadata.

classmethod supported_filters() → list[Filter][source]

Get the default list of implemented filters.

Returns:: list of filters

variables() → list[str][source]

List all variables available in this reader.

The variable-names returned here should already be changed if a VariableNameChanger is used.

Returns:: List of variable names.

class pyaro.timeseries.Filter.DataIndexFilter(**kwargs)[source]

Bases: Filter

An abstract baseclass implementing filter_data by means of an abstract method filter_data_idx.

filter_data(data: Data, stations: dict[str, Station], variables: list[str]) → Data[source]

Filtering of data

Parameters:

data – Data of e.g. a Reader.data(varname) call
stations – List of stations, e.g. from a Reader.stations() call
variables – variables, i.e. from a Reader.variables() call

Returns:

an updated Data-object with this filter applied

abstract filter_data_idx(data: Data, stations: dict[str, Station], variables: list[str])[source]

Filter data to an index which can be applied to Data.slice(idx) later

Returns:: an index for Data.slice(idx)
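The pattern of deriving filter_data from filter_data_idx can be sketched with stand-in classes. `ListData`, `IndexFilter` and `PositiveFilter` are hypothetical and only mimic the roles of Data, DataIndexFilter and a concrete filter; a plain list of booleans stands in for the index.

```python
from abc import ABC, abstractmethod


class ListData:
    """Hypothetical minimal data container with a slice method."""

    def __init__(self, values):
        self.values = list(values)

    def slice(self, index):
        # Keep the entries where the boolean index is True.
        return ListData(v for v, keep in zip(self.values, index) if keep)


class IndexFilter(ABC):
    def filter_data(self, data, stations, variables):
        # filter_data is implemented once, in terms of filter_data_idx.
        return data.slice(self.filter_data_idx(data, stations, variables))

    @abstractmethod
    def filter_data_idx(self, data, stations, variables):
        """Return a boolean index usable with data.slice()."""


class PositiveFilter(IndexFilter):
    def filter_data_idx(self, data, stations, variables):
        return [v > 0 for v in data.values]


filtered = PositiveFilter().filter_data(ListData([1.0, -2.0, 3.0]), {}, [])
```

A concrete filter only has to compute the index; slicing stays in one place.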

class pyaro.timeseries.Filter.Filter(**kwargs)[source]

Bases: ABC

Base-class for all filters used from pyaro-Readers

args() → dict[str, Any][source]

Retrieve the kwargs needed to create a new object of this filter with the same filter restrictions.

Returns:: a dictionary usable as kwargs for the new method

filter_data(data: Data, stations: dict[str, Station], variables: list[str]) → Data[source]

Filtering of data

Parameters:

data – Data of e.g. a Reader.data(varname) call
stations – List of stations, e.g. from a Reader.stations() call
variables – variables, i.e. from a Reader.variables() call

Returns:

an updated Data-object with this filter applied

filter_stations(stations: dict[str, Station]) → dict[str, Station][source]

Filtering of stations list

Parameters:: stations – List of stations, e.g. from a Reader.stations() call
Returns:: dict of filtered stations

filter_variables(variables: list[str]) → list[str][source]

Filtering of variables

Parameters:: variables – List of variables, e.g. from a Reader.variables() call
Returns:: List of filtered variables.

abstract init_kwargs() → dict[source]: return the init kwargs

abstract name() → str[source]

Return a unique name for this filter

Returns:: a string to be used by FilterFactory

time_format = '%Y-%m-%d %H:%M:%S'

csvreader for timeseries

A simple implementation of a timeseries reader based on csv-files, usually accessed as pyaro.open_timeseries('csv_timeseries', ...)

class pyaro.csvreader.CSVTimeseriesReader(filename, columns={'altitude': '0', 'country': 'NO', 'end_time': 7, 'flag': '0', 'latitude': 3, 'longitude': 2, 'standard_deviation': 'NaN', 'start_time': 6, 'station': 1, 'units': 5, 'value': 4, 'variable': 0}, variable_units: dict[str, str] = {}, country_lookup=False, csvreader_kwargs={'delimiter': ','}, skip_header_rows: int = 0, filters=[])[source]
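One way to read the default columns mapping is sketched below with the standard csv module. It assumes that an integer value is a column index in the file and that a string value is a constant applied to every row; this interpretation is not stated above. The sample line and the helper `parse_row` are hypothetical.

```python
import csv
import io

# Default column mapping copied from the signature above.
columns = {
    "variable": 0, "station": 1, "longitude": 2, "latitude": 3,
    "value": 4, "units": 5, "start_time": 6, "end_time": 7,
    "altitude": "0", "country": "NO", "flag": "0",
    "standard_deviation": "NaN",
}

# Made-up single-line file content matching the default layout.
text = "SOx,station1,10.0,60.0,1.5,Gg,2020-01-01 00:00:00,2020-01-02 00:00:00\n"


def parse_row(row, columns):
    # Assumption: int -> column index in the file, str -> constant for all rows.
    return {
        key: row[col] if isinstance(col, int) else col
        for key, col in columns.items()
    }


records = [
    parse_row(row, columns)
    for row in csv.reader(io.StringIO(text), delimiter=",")
]
```

Under this reading, a file without a country column still yields a country for every row, taken from the constant in the mapping.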

close()[source]

Cleanup code for the reader.

This method will automatically be called when going out of context. Implement as dummy (pass) if no cleanup needed.

classmethod col_keys()[source]

Column keys possible to initialize with this reader.

Returns:: list of columns possible to initialize with columns argument of this reader

metadata() → dict[source]

Metadata set by the datasource.

The reader-implementation might add metadata depending on the data-source to this method.

Returns:: dictionary with different metadata