{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Data preparation and read postprocessed data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Do you remember from the [Download and read raw data](download_and_read_raw_data.ipynb) tutorial, that the ONI had a wired time axis? As a little reminder:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " SEAS YR TOTAL ANOM\n", "0 DJF 1950 24.72 -1.53\n", "1 JFM 1950 25.17 -1.34\n", "2 FMA 1950 25.75 -1.16\n", "3 MAM 1950 26.12 -1.18\n", "4 AMJ 1950 26.32 -1.07\n", " date Volume Anomaly\n", "0 198001 2.605404e+15 7.657363e+13\n", "1 198002 2.564434e+15 7.004931e+13\n", "2 198003 2.514065e+15 5.240853e+13\n", "3 198004 2.468250e+15 4.008869e+13\n", "4 198005 2.439852e+15 4.020975e+13\n" ] } ], "source": [ "from ninolearn.IO.read_raw import oni, wwv_anom\n", "\n", "ONI = oni()\n", "print(ONI.head())\n", "\n", "WWV = wwv_anom()\n", "print(WWV.head())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This time axis is difficult to work with. For this NinoLearn contains a postprocessing method that fixes this for you. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Prepare ONI timeseries.\n", "Prepare WWV timeseries.\n", "Prepare IOD timeseries.\n" ] } ], "source": [ "from ninolearn.preprocess.prepare import prep_oni, prep_wwv, prep_iod\n", "\n", "prep_oni()\n", "prep_wwv()\n", "prep_iod()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All methods from the `postprocess` sub-package save the data directly to the data directory `postdir` that you need to specify in `ninolean.pathes`.\n", "\n", "Now, lets read this data using the data reader for postprocessed data." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "time\n", "1980-01-01 0.64\n", "1980-02-01 0.59\n", "1980-03-01 0.46\n", "1980-04-01 0.34\n", "1980-05-01 0.38\n", "Name: anom, dtype: float64\n", "time\n", "1980-01-01 7.657363e+13\n", "1980-02-01 7.004931e+13\n", "1980-03-01 5.240853e+13\n", "1980-04-01 4.008869e+13\n", "1980-05-01 4.020975e+13\n", "Name: anom, dtype: float64\n", "time\n", "1980-01-01 0.025\n", "1980-02-01 -0.021\n", "1980-03-01 -0.251\n", "1980-04-01 0.103\n", "1980-05-01 0.148\n", "Name: anom, dtype: float64\n" ] } ], "source": [ "from ninolearn.IO.read_processed import data_reader\n", "\n", "reader = data_reader()\n", "reader = data_reader(startdate='1980-01', enddate='2017-12')\n", "\n", "# read from a output csv and choose the anomaly (processed='anom') data\n", "oni_anom_postprocessed = reader.read_csv('oni', processed='anom')\n", "print(oni_anom_postprocessed.head())\n", "\n", "wwv_anom_postprocessed = reader.read_csv('wwv', processed='anom')\n", "print(wwv_anom_postprocessed.head())\n", "\n", "iod_anom_postprocessed = reader.read_csv('iod', processed='anom')\n", "print(iod_anom_postprocessed.head())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, the data comes in a clean format. Note that the dates to which seasonal value are assigend are the first day of the last month of the three-month season (e.g. JFM 1950 becomes 1950-03-01). This is because throughout NinoLearn only monthly data is used and all monthly data is assigned to the first date of the month. Seasonal data is assigned to the last month of a season to ensure that prediction schemes do NOT accidently include data from future periods.\n", "\n", "Further preparation methods are available in the `ninolearn.postprocess.prepare` module for other raw data sets." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }