{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Pyaro basic example\n", "\n", "* Install pyaro and check if installation is new enough:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'0.0.5'" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pyaro\n", "pyaro.__version__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Check a list of installed engines. The most basic installation will install only the `csv_timeseries` engine.\n", "Install e.g. `https://github.com/metno/pyaro-readers` for many more engines." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'csv_timeseries': }" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pyaro.list_timeseries_engines()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Learn a bit about the engine." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on CSVTimeseriesEngine in module pyaro.csvreader.CSVTimeseriesReader object:\n", "\n", "class CSVTimeseriesEngine(pyaro.timeseries.AutoFilterReaderEngine.AutoFilterEngine)\n", " | Method resolution order:\n", " | CSVTimeseriesEngine\n", " | pyaro.timeseries.AutoFilterReaderEngine.AutoFilterEngine\n", " | pyaro.timeseries.Engine.Engine\n", " | abc.ABC\n", " | builtins.object\n", " | \n", " | Methods defined here:\n", " | \n", " | description(self)\n", " | Get a descriptive string about this pyaro implementation.\n", " | \n", " | open(self, filename, *args, **kwargs) -> pyaro.csvreader.CSVTimeseriesReader.CSVTimeseriesReader\n", " | open-function of the timeseries, initializing the reader-object, i.e.\n", " | equivalent to Reader(filename_or_object_or_url, *, filters)\n", " | \n", " | :return pyaro.timeseries.Reader\n", " | :raises UnknownFilterException\n", " | \n", " | reader_class(self)\n", " | return the class of the corresponding reader\n", " | \n", " | :return: the class returned from open\n", " | \n", " | url(self)\n", " | Get a url about more information, docs of the datasource-engine.\n", " | \n", " | This should be the github-url or similar of the implementation.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data and other attributes defined here:\n", " | \n", " | __abstractmethods__ = frozenset()\n", " | \n", " | ----------------------------------------------------------------------\n", " | Methods inherited from pyaro.timeseries.AutoFilterReaderEngine.AutoFilterEngine:\n", " | \n", " | args(self)\n", " | return a tuple of parameters to be passed to open_timeseries, including\n", " | the mandatory filename_or_obj_or_url parameter.\n", " | \n", " | supported_filters(self) -> []\n", " | The supported filters by this Engine. Maps to the Readers supported_filters.\n", " | \n", " | :return: a list of filters\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data descriptors inherited from pyaro.timeseries.Engine.Engine:\n", " | \n", " | __dict__\n", " | dictionary for instance variables (if defined)\n", " | \n", " | __weakref__\n", " | list of weak references to the object (if defined)\n", "\n" ] } ], "source": [ "pr_csv = pyaro.list_timeseries_engines()['csv_timeseries']\n", "help(pr_csv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Check the description and the open-arguments to open a database with this engine:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Simple reader of csv-files using python csv-reader\n", "('filename', 'columns', 'variable_units', 'country_lookup', 'csvreader_kwargs', 'filters')\n" ] } ], "source": [ "print(pr_csv.description())\n", "print(pr_csv.args())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Opening a datasource with an engine\n", "\n", "Open now the timeseries `ts` with a table. You could do that with a `with` clause in larger code, \n", "but for simplicity, we don't do that here. `columns` map the files columns to the data, starting\n", "with first column as 0, which contains the variable-name in our example file.\n", "\n", "The test-file is read using the python `csv` module. `csvreader_kwargs` sets up that module, i.e.\n", "comma-separated setting the delimiter." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "file = \"../../tests/testdata/csvReader_testdata.csv\"\n", "columns = {\n", " \"variable\": 0,\n", " \"station\": 1,\n", " \"longitude\": 2,\n", " \"latitude\": 3,\n", " \"value\": 4,\n", " \"units\": 5,\n", " \"start_time\": 6,\n", " \"end_time\": 7,\n", " \"altitude\": \"0\",\n", " \"country\": \"NO\",\n", " \"standard_deviation\": \"NaN\",\n", " \"flag\": \"0\",\n", " }\n", "csvreader_kwargs = {\"delimiter\": \",\"}\n", "\n", "ts = pyaro.open_timeseries('csv_timeseries',\n", " filename=file,\n", " columns=columns,\n", " csvreader_kwargs=csvreader_kwargs,\n", " filters=[])\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`ts` is now the handle to the data-source.\n", "\n", "* Accessing metadata in the datasource, like available variables and stations" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "dict_keys(['SOx', 'NOx'])\n", "{'station1': , 'station2': }\n" ] } ], "source": [ "print(ts.variables())\n", "print(ts.stations())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* The timeseries must be accessed per variable. It will be returned for all\n", "stations. The data-columns can be accessed by `keys()`:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('values', 'stations', 'latitudes', 'longitudes', 'altitudes', 'start_times', 'end_times', 'flags', 'standard_deviations')\n" ] }, { "data": { "text/plain": [ "array([44.377964 , 73.23672 , 66.83997 , 75.973015 , 54.252964 ,\n", " 95.51215 , 43.424374 , 14.8503275, 39.78734 , 84.14651 ,\n", " 2.3796806, 56.030033 , 90.70785 , 53.49256 , 33.27008 ,\n", " 19.200666 , 16.61291 , 95.239876 , 58.38857 , 25.010443 ,\n", " 49.31731 , 95.74444 , 35.146294 , 31.468204 , 70.109985 ,\n", " 46.82392 , 44.06993 , 15.679094 , 54.04226 , 42.6484 ,\n", " 21.370073 , 37.34375 , 14.086469 , 31.23552 , 12.328813 ,\n", " 85.39133 , 96.85262 , 68.06294 , 67.1648 , 27.18295 ,\n", " 28.523333 , 1.4397316, 74.56935 , 50.91362 , 34.764988 ,\n", " 4.5323606, 29.767143 , 16.157143 , 61.595753 , 57.319874 ,\n", " 63.740353 , 4.939785 , 5.5386314, 73.256615 , 18.165173 ,\n", " 96.29508 , 20.86049 , 60.049885 , 36.644806 , 70.943375 ,\n", " 9.295645 , 1.7138128, 56.983192 , 89.55616 , 13.375153 ,\n", " 49.939552 , 31.528936 , 78.00686 , 28.33076 , 16.8259 ,\n", " 73.02892 , 96.075714 , 19.514969 , 68.14331 , 21.966438 ,\n", " 62.26828 , 82.37647 , 26.558168 , 58.01865 , 56.723133 ,\n", " 10.252709 , 7.623141 , 33.05347 , 26.62592 , 41.58915 ,\n", " 27.843248 , 85.996025 , 74.1133 , 42.667347 , 43.756298 ,\n", " 10.930091 , 15.341663 , 44.52167 , 3.720179 , 88.960014 ,\n", " 61.212017 , 93.44711 , 19.978394 , 61.643723 , 85.183685 ,\n", " 93.348305 , 97.57919 , 19.217777 , 11.676097 ], dtype=float32)" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "var = 'SOx'\n", "ts_data = ts.data(var)\n", "print(ts_data.keys())\n", "ts_data.stations\n", "ts_data.start_times\n", "ts_data.end_times\n", "ts_data.latitudes\n", "ts_data.longitudes\n", "ts_data.altitudes\n", "ts_data.flags\n", "ts_data.values\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conversion to pandas\n", "\n", "For pandas users, the timeseries data can be converted to a dataframe:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
valuesstationslatitudeslongitudesaltitudesstart_timesend_timesflagsstandard_deviations
044.377964station110.5172.5000000.01997-01-011997-01-020NaN
173.236717station110.5172.5000000.01997-01-021997-01-030NaN
266.839973station110.5172.5000000.01997-01-031997-01-040NaN
375.973015station110.5172.5000000.01997-01-041997-01-050NaN
454.252964station110.5172.5000000.01997-01-051997-01-060NaN
..............................
9985.183685station245.5-103.1999970.01997-02-171997-02-180NaN
10093.348305station245.5-103.1999970.01997-02-181997-02-190NaN
10197.579193station245.5-103.1999970.01997-02-191997-02-200NaN
10219.217777station245.5-103.1999970.01997-02-201997-02-210NaN
10311.676097station245.5-103.1999970.01997-02-211997-02-220NaN
\n", "

104 rows × 9 columns

\n", "
" ], "text/plain": [ " values stations latitudes longitudes altitudes start_times \\\n", "0 44.377964 station1 10.5 172.500000 0.0 1997-01-01 \n", "1 73.236717 station1 10.5 172.500000 0.0 1997-01-02 \n", "2 66.839973 station1 10.5 172.500000 0.0 1997-01-03 \n", "3 75.973015 station1 10.5 172.500000 0.0 1997-01-04 \n", "4 54.252964 station1 10.5 172.500000 0.0 1997-01-05 \n", ".. ... ... ... ... ... ... \n", "99 85.183685 station2 45.5 -103.199997 0.0 1997-02-17 \n", "100 93.348305 station2 45.5 -103.199997 0.0 1997-02-18 \n", "101 97.579193 station2 45.5 -103.199997 0.0 1997-02-19 \n", "102 19.217777 station2 45.5 -103.199997 0.0 1997-02-20 \n", "103 11.676097 station2 45.5 -103.199997 0.0 1997-02-21 \n", "\n", " end_times flags standard_deviations \n", "0 1997-01-02 0 NaN \n", "1 1997-01-03 0 NaN \n", "2 1997-01-04 0 NaN \n", "3 1997-01-05 0 NaN \n", "4 1997-01-06 0 NaN \n", ".. ... ... ... \n", "99 1997-02-18 0 NaN \n", "100 1997-02-19 0 NaN \n", "101 1997-02-20 0 NaN \n", "102 1997-02-21 0 NaN \n", "103 1997-02-22 0 NaN \n", "\n", "[104 rows x 9 columns]" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pyaro.timeseries_data_to_pd(ts_data)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 2 }