Download and read raw data
Download
Before you download data, make sure that you specify the path to the raw data directory (rawdir) in ninolearn.pathes.
In this tutorial, we download the monthly Oceanic Niño Index (ONI), the Warm Water Volume (WWV), the Dipole Mode Index (DMI) of the Indian Ocean Dipole (IOD), sea surface temperatures (SST) from the ERSSTv5 data set, and monthly mean surface air temperatures (SAT) from the NCEP data set.
[1]:
# import the necessary methods and classes
from ninolearn.download import download, sources
NOTE: If the data was already downloaded, it won’t be downloaded again.
[8]:
download(sources.SST_ERSSTv5)
download(sources.ONI)
download(sources.IOD)
download(sources.WWV)
download(sources.SAT_monthly_NCEP)
sst.mnmean.nc already downloaded
oni.txt already downloaded
iod.txt already downloaded
wwv.dat already downloaded
Download air.mon.mean.nc
Each source is a dictionary whose keys specify how the file is downloaded:
[3]:
print(sources.SST_ERSSTv5)
{'downloadType': 'ftp', 'filename': 'sst.mnmean.nc', 'host': 'ftp.cdc.noaa.gov', 'location': '/Datasets/noaa.ersst.v5/'}
[4]:
print(sources.ONI)
{'downloadType': 'http', 'url': 'https://www.cpc.ncep.noaa.gov/data/indices/oni.ascii.txt', 'filename': 'oni.txt'}
You can see that the two sources above have different downloadType entries. The SST is downloaded from an FTP server, whereas the ONI is downloaded via HTTP.
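The dispatch on downloadType, together with the skip-if-present behaviour mentioned in the note above, can be sketched as follows. This is a minimal illustration and not ninolearn's actual implementation: the helpers `source_url` and `needs_download` and the `rawdir` argument are hypothetical, and only the dictionary keys are taken from the printed sources.

```python
from pathlib import Path


def source_url(source):
    """Build the full URL for a source dictionary (hypothetical helper)."""
    kind = source["downloadType"]
    if kind == "ftp":
        # FTP sources store host, directory and filename separately.
        return f"ftp://{source['host']}{source['location']}{source['filename']}"
    if kind == "http":
        # HTTP sources already carry the complete URL.
        return source["url"]
    raise ValueError(f"unknown downloadType: {kind!r}")


def needs_download(source, rawdir):
    """Return True if the target file is not yet present in rawdir."""
    return not (Path(rawdir) / source["filename"]).exists()


# The two dictionaries printed above, copied verbatim.
sst = {"downloadType": "ftp", "filename": "sst.mnmean.nc",
       "host": "ftp.cdc.noaa.gov", "location": "/Datasets/noaa.ersst.v5/"}
oni = {"downloadType": "http",
       "url": "https://www.cpc.ncep.noaa.gov/data/indices/oni.ascii.txt",
       "filename": "oni.txt"}

print(source_url(sst))
print(source_url(oni))
```

Keeping the transport details inside the dictionary means that a new data set can be described without touching the download logic, as long as it uses one of the known downloadType values.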
Read raw data
The ninolearn.IO.read_raw module provides routines that read the raw data as it is.
[10]:
from ninolearn.IO.read_raw import oni, sst_ERSSTv5, sat, wwv_anom, iod
ONI = oni()
SST = sst_ERSSTv5()
SAT = sat(mean='monthly')
WWV = wwv_anom()
Let's have a look at what the raw data looks like!
[11]:
ONI.head()
[11]:
| | SEAS | YR | TOTAL | ANOM |
|---|---|---|---|---|
| 0 | DJF | 1950 | 24.72 | -1.53 |
| 1 | JFM | 1950 | 25.17 | -1.34 |
| 2 | FMA | 1950 | 25.75 | -1.16 |
| 3 | MAM | 1950 | 26.12 | -1.18 |
| 4 | AMJ | 1950 | 26.32 | -1.07 |
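The raw ONI table has no date column, only a three-month season label and a year. One way to put it on a monthly time axis is to map each season to its center month. The sketch below uses only the standard library; the helper `season_center` is hypothetical, and it assumes that YR is the year of the center month, so that DJF 1950 maps to January 1950.

```python
from datetime import date

# Overlapping three-month seasons, ordered by their center month (Jan..Dec).
SEASONS = ["DJF", "JFM", "FMA", "MAM", "AMJ", "MJJ",
           "JJA", "JAS", "ASO", "SON", "OND", "NDJ"]


def season_center(seas, year):
    """Return the first day of the center month of a season (hypothetical helper).

    Assumption: `year` is the year of the center month.
    """
    month = SEASONS.index(seas) + 1
    return date(year, month, 1)


print(season_center("DJF", 1950))  # 1950-01-01
print(season_center("AMJ", 1950))  # 1950-05-01
```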
[5]:
WWV.head()
[5]:
| | date | Volume | Anomaly |
|---|---|---|---|
| 0 | 198001 | 2.605404e+15 | 7.657363e+13 |
| 1 | 198002 | 2.564434e+15 | 7.004931e+13 |
| 2 | 198003 | 2.514065e+15 | 5.240853e+13 |
| 3 | 198004 | 2.468250e+15 | 4.008869e+13 |
| 4 | 198005 | 2.439852e+15 | 4.020975e+13 |
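The date column of the raw WWV table stores year and month as a single number in YYYYMM form. A minimal sketch of how such values could be turned into proper dates is shown below; `parse_yyyymm` is a hypothetical helper and not part of ninolearn.

```python
from datetime import date


def parse_yyyymm(value):
    """Convert a number such as 198001 into the first day of that month."""
    year, month = divmod(int(value), 100)
    return date(year, month, 1)


# The first three entries of the date column shown above.
dates = [parse_yyyymm(v) for v in (198001, 198002, 198003)]
print(dates[0])   # 1980-01-01
print(dates[-1])  # 1980-03-01
```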
[20]:
print(SST)
<xarray.DataArray 'sst' (time: 1982, lat: 89, lon: 180)>
[31751640 values with dtype=float32]
Coordinates:
* lat (lat) float32 88.0 86.0 84.0 82.0 80.0 ... -82.0 -84.0 -86.0 -88.0
* lon (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 350.0 352.0 354.0 356.0 358.0
* time (time) datetime64[ns] 1854-01-01 1854-02-01 ... 2019-02-01
Attributes:
long_name: Monthly Means of Sea Surface Temperature
units: degC
var_desc: Sea Surface Temperature
level_desc: Surface
statistic: Mean
dataset: ERSSTv5
parent_stat: Individual Values
actual_range: [-1.8 42.32636]
valid_range: [-1.8 45. ]
[21]:
print(SAT)
<xarray.DataArray 'air' (time: 854, lat: 73, lon: 144)>
[8977248 values with dtype=float32]
Coordinates:
* lat (lat) float32 90.0 87.5 85.0 82.5 80.0 ... -82.5 -85.0 -87.5 -90.0
* lon (lon) float32 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
* time (time) datetime64[ns] 1948-01-01 1948-02-01 ... 2019-02-01
Attributes:
long_name: Monthly Mean Air Temperature at sigma level 0.995
valid_range: [-2000. 2000.]
units: degC
precision: 1
var_desc: Air Temperature
level_desc: Surface
statistic: Mean
parent_stat: Individual Obs
dataset: NCEP
actual_range: [-73.78001 42.14595]
As you can see, the data sets do not have a common time axis, are available for different time periods and on different grids.
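The mismatch can be made concrete with a little arithmetic on the values printed above. The sketch below is only an illustration using the standard library; the helpers `n_months` and `n_points` are hypothetical. It reproduces the array shapes from the first and last coordinate values and the grid spacing, and then finds the latest start date among the four data sets. The ONI start date assumes that DJF 1950 corresponds to January 1950.

```python
def n_months(start, end):
    """Number of monthly steps between two (year, month) pairs, both inclusive."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1]) + 1


def n_points(first, last, step):
    """Number of regularly spaced grid points from first to last, both inclusive."""
    return int(round(abs(last - first) / step)) + 1


# ERSSTv5: 2.0 degree grid, 1854-01 to 2019-02
sst_shape = (n_months((1854, 1), (2019, 2)),
             n_points(88.0, -88.0, 2.0),
             n_points(0.0, 358.0, 2.0))

# NCEP air temperature: 2.5 degree grid, 1948-01 to 2019-02
sat_shape = (n_months((1948, 1), (2019, 2)),
             n_points(90.0, -90.0, 2.5),
             n_points(0.0, 357.5, 2.5))

print(sst_shape)  # (1982, 89, 180)
print(sat_shape)  # (854, 73, 144)

# A common time axis cannot start before the latest first record.
starts = {"SST": (1854, 1), "SAT": (1948, 1), "ONI": (1950, 1), "WWV": (1980, 1)}
common_start = max(starts.values())
print(common_start)  # (1980, 1)
```

In this set the WWV record is the limiting one: any analysis that uses all four variables together can start in January 1980 at the earliest.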
To bring the different data sets into a common, standardized shape, check out the tutorials on preparing and postprocessing the data.