Pyaro basic example
Install pyaro and check that the installed version is recent enough:
[18]:
import pyaro
pyaro.__version__
[18]:
'0.0.5'
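The version check can also be done programmatically. The following is a minimal sketch using only the standard library; the helper names `version_tuple` and `is_new_enough` are hypothetical, and the minimum of `0.0.5` is an assumption taken from the output above, not a documented requirement.

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn '0.0.5' into (0, 0, 5); parsing stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_new_enough(installed, minimum):
    """Compare two dotted version strings numerically."""
    return version_tuple(installed) >= version_tuple(minimum)


# Assumed minimum version, chosen only for illustration.
MINIMUM = "0.0.5"

try:
    installed = version("pyaro")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("pyaro is not installed")
else:
    print(installed, "ok" if is_new_enough(installed, MINIMUM) else "too old")
```

Comparing tuples of integers avoids the pitfall of comparing version strings directly, where '0.0.10' would sort before '0.0.5'.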
Check the list of installed engines. The most basic installation installs only the
csv_timeseries engine. Install e.g. https://github.com/metno/pyaro-readers for many more engines.
[19]:
pyaro.list_timeseries_engines()
[19]:
{'csv_timeseries': <pyaro.csvreader.CSVTimeseriesReader.CSVTimeseriesEngine at 0x7ff77705f250>}
Learn a bit about the engine.
[20]:
pr_csv = pyaro.list_timeseries_engines()['csv_timeseries']
help(pr_csv)
Help on CSVTimeseriesEngine in module pyaro.csvreader.CSVTimeseriesReader object:
class CSVTimeseriesEngine(pyaro.timeseries.AutoFilterReaderEngine.AutoFilterEngine)
| Method resolution order:
| CSVTimeseriesEngine
| pyaro.timeseries.AutoFilterReaderEngine.AutoFilterEngine
| pyaro.timeseries.Engine.Engine
| abc.ABC
| builtins.object
|
| Methods defined here:
|
| description(self)
| Get a descriptive string about this pyaro implementation.
|
| open(self, filename, *args, **kwargs) -> pyaro.csvreader.CSVTimeseriesReader.CSVTimeseriesReader
| open-function of the timeseries, initializing the reader-object, i.e.
| equivalent to Reader(filename_or_object_or_url, *, filters)
|
| :return pyaro.timeseries.Reader
| :raises UnknownFilterException
|
| reader_class(self)
| return the class of the corresponding reader
|
| :return: the class returned from open
|
| url(self)
| Get a url about more information, docs of the datasource-engine.
|
| This should be the github-url or similar of the implementation.
|
| ----------------------------------------------------------------------
| Data and other attributes defined here:
|
| __abstractmethods__ = frozenset()
|
| ----------------------------------------------------------------------
| Methods inherited from pyaro.timeseries.AutoFilterReaderEngine.AutoFilterEngine:
|
| args(self)
| return a tuple of parameters to be passed to open_timeseries, including
| the mandatory filename_or_obj_or_url parameter.
|
| supported_filters(self) -> [<class 'pyaro.timeseries.Filter.Filter'>]
| The supported filters by this Engine. Maps to the Readers supported_filters.
|
| :return: a list of filters
|
| ----------------------------------------------------------------------
| Data descriptors inherited from pyaro.timeseries.Engine.Engine:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
Check the engine's description and the arguments it accepts when opening a data source:
[21]:
print(pr_csv.description())
print(pr_csv.args())
Simple reader of csv-files using python csv-reader
('filename', 'columns', 'variable_units', 'country_lookup', 'csvreader_kwargs', 'filters')
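The tuple returned by args() can be used to catch misspelled keyword arguments before opening a data source. This is a small sketch in plain Python; the helper `unknown_arguments` is hypothetical and not part of pyaro, and the argument names are copied from the output above.

```python
# Argument names copied from the args() output shown above.
ENGINE_ARGS = (
    "filename",
    "columns",
    "variable_units",
    "country_lookup",
    "csvreader_kwargs",
    "filters",
)


def unknown_arguments(kwargs, allowed=ENGINE_ARGS):
    """Return the keyword names that the engine does not list, sorted."""
    return sorted(set(kwargs) - set(allowed))


# 'colums' is a deliberate typo to show what the check reports.
kwargs = {"filename": "data.csv", "colums": {}, "filters": []}
print(unknown_arguments(kwargs))  # ['colums']
```

In real code, `allowed` could be taken directly from `pr_csv.args()` instead of the copied tuple.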
Opening a datasource with an engine
Now open the timeseries ts from a csv table. In larger code you would typically do this in a with clause, but for simplicity we don't do that here. columns maps the file's columns to the data fields, counting from 0 for the first column, which contains the variable name in our example file.
The test file is read with the Python csv module. csvreader_kwargs is passed on to that module; here it sets the delimiter to a comma.
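To make the role of columns and csvreader_kwargs more concrete, here is a minimal sketch that uses only the standard csv module. It is not pyaro's implementation. It assumes that an integer entry in columns is a column index and a string entry is a constant applied to every row, which is how the example mapping appears to be used. The helper `map_rows` and the sample rows (including their units) are made up for illustration.

```python
import csv
import io

# Made-up sample in the same column order as the example file.
SAMPLE = (
    "SOx,station1,172.5,10.5,44.4,Gg,1997-01-01 00:00:00,1997-01-02 00:00:00\n"
    "NOx,station2,-103.2,45.5,12.3,Mg,1997-01-01 00:00:00,1997-01-02 00:00:00\n"
)


def map_rows(text, columns, csvreader_kwargs):
    """Yield one dict per csv row, following the column mapping.

    Assumption: an int is a column index, a str is a constant for all rows.
    """
    reader = csv.reader(io.StringIO(text), **csvreader_kwargs)
    for row in reader:
        yield {
            name: row[source] if isinstance(source, int) else source
            for name, source in columns.items()
        }


columns = {"variable": 0, "station": 1, "value": 4, "country": "NO"}
rows = list(map_rows(SAMPLE, columns, {"delimiter": ","}))
print(rows[0])
```

The keyword arguments are forwarded unchanged to csv.reader, so any option that module accepts, such as delimiter or quotechar, can be set the same way.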
[22]:
file = "../../tests/testdata/csvReader_testdata.csv"
columns = {
    "variable": 0,
    "station": 1,
    "longitude": 2,
    "latitude": 3,
    "value": 4,
    "units": 5,
    "start_time": 6,
    "end_time": 7,
    "altitude": "0",
    "country": "NO",
    "standard_deviation": "NaN",
    "flag": "0",
}
csvreader_kwargs = {"delimiter": ","}
ts = pyaro.open_timeseries('csv_timeseries',
                           filename=file,
                           columns=columns,
                           csvreader_kwargs=csvreader_kwargs,
                           filters=[])
ts is now the handle to the data-source.
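The with clause mentioned above could look roughly like the following sketch. The function `count_values` is a hypothetical helper, and the guarded import is only there so the snippet can be loaded on a machine without pyaro; the calls themselves are the ones used in this notebook.

```python
try:
    import pyaro
except ImportError:  # lets the sketch load without pyaro installed
    pyaro = None


def count_values(filename, columns, variable):
    """Open the csv file, count the values of one variable, then close it."""
    if pyaro is None:
        raise RuntimeError("pyaro is required for this sketch")
    # The with clause closes the reader when the block is left,
    # also if an exception is raised inside it.
    with pyaro.open_timeseries(
        "csv_timeseries",
        filename=filename,
        columns=columns,
        csvreader_kwargs={"delimiter": ","},
        filters=[],
    ) as ts:
        return len(ts.data(variable).values)
```

With the variables defined in the cell above, a call would be `count_values(file, columns, "SOx")`.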
Accessing metadata in the datasource, like available variables and stations
[23]:
print(ts.variables())
print(ts.stations())
dict_keys(['SOx', 'NOx'])
{'station1': <pyaro.timeseries.Station.Station object at 0x7ff776cc9d20>, 'station2': <pyaro.timeseries.Station.Station object at 0x7ff776cca6e0>}
The timeseries must be accessed per variable and is returned for all stations at once. The names of the data columns are listed by keys(), and each column is available as an attribute:
[24]:
var = 'SOx'
ts_data = ts.data(var)
print(ts_data.keys())
ts_data.stations
ts_data.start_times
ts_data.end_times
ts_data.latitudes
ts_data.longitudes
ts_data.altitudes
ts_data.flags
ts_data.values
('values', 'stations', 'latitudes', 'longitudes', 'altitudes', 'start_times', 'end_times', 'flags', 'standard_deviations')
[24]:
array([44.377964 , 73.23672 , 66.83997 , 75.973015 , 54.252964 ,
95.51215 , 43.424374 , 14.8503275, 39.78734 , 84.14651 ,
2.3796806, 56.030033 , 90.70785 , 53.49256 , 33.27008 ,
19.200666 , 16.61291 , 95.239876 , 58.38857 , 25.010443 ,
49.31731 , 95.74444 , 35.146294 , 31.468204 , 70.109985 ,
46.82392 , 44.06993 , 15.679094 , 54.04226 , 42.6484 ,
21.370073 , 37.34375 , 14.086469 , 31.23552 , 12.328813 ,
85.39133 , 96.85262 , 68.06294 , 67.1648 , 27.18295 ,
28.523333 , 1.4397316, 74.56935 , 50.91362 , 34.764988 ,
4.5323606, 29.767143 , 16.157143 , 61.595753 , 57.319874 ,
63.740353 , 4.939785 , 5.5386314, 73.256615 , 18.165173 ,
96.29508 , 20.86049 , 60.049885 , 36.644806 , 70.943375 ,
9.295645 , 1.7138128, 56.983192 , 89.55616 , 13.375153 ,
49.939552 , 31.528936 , 78.00686 , 28.33076 , 16.8259 ,
73.02892 , 96.075714 , 19.514969 , 68.14331 , 21.966438 ,
62.26828 , 82.37647 , 26.558168 , 58.01865 , 56.723133 ,
10.252709 , 7.623141 , 33.05347 , 26.62592 , 41.58915 ,
27.843248 , 85.996025 , 74.1133 , 42.667347 , 43.756298 ,
10.930091 , 15.341663 , 44.52167 , 3.720179 , 88.960014 ,
61.212017 , 93.44711 , 19.978394 , 61.643723 , 85.183685 ,
93.348305 , 97.57919 , 19.217777 , 11.676097 ], dtype=float32)
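Because the data for one variable covers all stations, a common next step is to split it per station. The sketch below does this in plain Python; `mean_per_station` is a hypothetical helper, and the short lists are stand-in data so the snippet runs without pyaro.

```python
from collections import defaultdict


def mean_per_station(stations, values):
    """Return the mean value per station from two parallel sequences."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for station, value in zip(stations, values):
        sums[station] += float(value)
        counts[station] += 1
    return {station: sums[station] / counts[station] for station in sums}


# Stand-in data; with pyaro use ts_data.stations and ts_data.values instead.
stations = ["station1", "station1", "station2"]
values = [1.0, 3.0, 5.0]
print(mean_per_station(stations, values))  # {'station1': 2.0, 'station2': 5.0}
```

The same function should accept the arrays from the cell above, i.e. `mean_per_station(ts_data.stations, ts_data.values)`, since it only iterates over both sequences in parallel.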
Conversion to pandas
For pandas users, the timeseries data can be converted to a dataframe:
[25]:
pyaro.timeseries_data_to_pd(ts_data)
[25]:
| | values | stations | latitudes | longitudes | altitudes | start_times | end_times | flags | standard_deviations |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 44.377964 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-01 | 1997-01-02 | 0 | NaN |
| 1 | 73.236717 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-02 | 1997-01-03 | 0 | NaN |
| 2 | 66.839973 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-03 | 1997-01-04 | 0 | NaN |
| 3 | 75.973015 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-04 | 1997-01-05 | 0 | NaN |
| 4 | 54.252964 | station1 | 10.5 | 172.500000 | 0.0 | 1997-01-05 | 1997-01-06 | 0 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 99 | 85.183685 | station2 | 45.5 | -103.199997 | 0.0 | 1997-02-17 | 1997-02-18 | 0 | NaN |
| 100 | 93.348305 | station2 | 45.5 | -103.199997 | 0.0 | 1997-02-18 | 1997-02-19 | 0 | NaN |
| 101 | 97.579193 | station2 | 45.5 | -103.199997 | 0.0 | 1997-02-19 | 1997-02-20 | 0 | NaN |
| 102 | 19.217777 | station2 | 45.5 | -103.199997 | 0.0 | 1997-02-20 | 1997-02-21 | 0 | NaN |
| 103 | 11.676097 | station2 | 45.5 | -103.199997 | 0.0 | 1997-02-21 | 1997-02-22 | 0 | NaN |
104 rows × 9 columns