Data preparation and read postprocessed data

Do you remember from the Download and read raw data tutorial that the ONI had a weird time axis? As a little reminder:

[1]:
from ninolearn.IO.read_raw import oni, wwv_anom

ONI = oni()
print(ONI.head())

WWV = wwv_anom()
print(WWV.head())
  SEAS    YR  TOTAL  ANOM
0  DJF  1950  24.72 -1.53
1  JFM  1950  25.17 -1.34
2  FMA  1950  25.75 -1.16
3  MAM  1950  26.12 -1.18
4  AMJ  1950  26.32 -1.07
     date        Volume       Anomaly
0  198001  2.605404e+15  7.657363e+13
1  198002  2.564434e+15  7.004931e+13
2  198003  2.514065e+15  5.240853e+13
3  198004  2.468250e+15  4.008869e+13
4  198005  2.439852e+15  4.020975e+13

This time axis is difficult to work with. NinoLearn therefore provides preparation methods that fix this for you.

[2]:
from ninolearn.preprocess.prepare import prep_oni, prep_wwv, prep_iod

prep_oni()
prep_wwv()
prep_iod()
Prepare ONI timeseries.
Prepare WWV  timeseries.
Prepare IOD timeseries.
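To make the idea of fixing the time axis concrete, here is a minimal pandas sketch of how the integer `date` column of the raw WWV table (e.g. `198001`) could be turned into a proper monthly time index. It is only an illustration and not necessarily how `prep_wwv` is implemented. The frame `raw` is a hand-built stand-in that copies three rows from the output above.

```python
import pandas as pd

# Hand-built stand-in for the raw WWV table (values copied from the output above).
raw = pd.DataFrame({
    "date": [198001, 198002, 198003],
    "Anomaly": [7.657363e13, 7.004931e13, 5.240853e13],
})

# Parse the YYYYMM integers into timestamps on the first day of each month.
time = pd.to_datetime(raw["date"].astype(str), format="%Y%m")

# Build a series indexed by time; the names 'time' and 'anom' mirror the
# processed output shown in this tutorial.
wwv = pd.Series(
    raw["Anomaly"].to_numpy(),
    index=pd.DatetimeIndex(time, name="time"),
    name="anom",
)
print(wwv)
```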

All methods from the preprocess sub-package save their output directly to the data directory postdir that you need to specify in ninolearn.pathes.

Now, let's read this data using the data reader for postprocessed data.

[4]:
from ninolearn.IO.read_processed import data_reader

# restrict the reader to the period from 1980-01 to 2017-12
reader = data_reader(startdate='1980-01', enddate='2017-12')

# read from an output CSV file and choose the anomaly data (processed='anom')
oni_anom_postprocessed = reader.read_csv('oni', processed='anom')
print(oni_anom_postprocessed.head())

wwv_anom_postprocessed = reader.read_csv('wwv', processed='anom')
print(wwv_anom_postprocessed.head())

iod_anom_postprocessed = reader.read_csv('iod', processed='anom')
print(iod_anom_postprocessed.head())
time
1980-01-01    0.64
1980-02-01    0.59
1980-03-01    0.46
1980-04-01    0.34
1980-05-01    0.38
Name: anom, dtype: float64
time
1980-01-01    7.657363e+13
1980-02-01    7.004931e+13
1980-03-01    5.240853e+13
1980-04-01    4.008869e+13
1980-05-01    4.020975e+13
Name: anom, dtype: float64
time
1980-01-01    0.025
1980-02-01   -0.021
1980-03-01   -0.251
1980-04-01    0.103
1980-05-01    0.148
Name: anom, dtype: float64

Now, the data comes in a clean format. Note that each seasonal value is assigned to the first day of the last month of its three-month season (e.g. JFM 1950 becomes 1950-03-01). This is because NinoLearn uses only monthly data throughout, and all monthly data is assigned to the first day of the month. Seasonal data is assigned to the last month of a season to ensure that prediction schemes do NOT accidentally include data from future periods.
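The assignment rule can be sketched in a few lines of plain Python. This is an illustration only, not NinoLearn's own code. The helper `season_to_date` is hypothetical, and it assumes that the `YR` column gives the year of the season's center month, so that NDJ 1950 ends in January 1951.

```python
from datetime import date

# Overlapping three-month seasons, ordered by their center month (DJF -> January).
SEASONS = ["DJF", "JFM", "FMA", "MAM", "AMJ", "MJJ",
           "JJA", "JAS", "ASO", "SON", "OND", "NDJ"]


def season_to_date(season, year):
    """Return the first day of the last month of a three-month season.

    Assumption: `year` is the year of the season's center month.
    """
    center_month = SEASONS.index(season) + 1
    if center_month == 12:
        # NDJ is centered on December, so its last month is January of the next year.
        return date(year + 1, 1, 1)
    return date(year, center_month + 1, 1)


print(season_to_date("JFM", 1950))  # 1950-03-01
```

Because the JFM value is stamped with March rather than January or February, a model that only uses data up to a given month never sees a seasonal mean that already contains later months.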

Further preparation methods for other raw data sets are available in the ninolearn.preprocess.prepare module.