Postprocess¶

Data preparation¶

This module is a collection of methods that convert the raw data to a common format.

ninolearn.postprocess.prepare.calc_warm_pool_edge()[source]¶: calculate the warm pool edge

ninolearn.postprocess.prepare.prep_K_index()[source]¶: function that edits the Kirimati index from Bunge and Clarke (2014)

ninolearn.postprocess.prepare.prep_iod()[source]¶: Prepare the IOD index dataframe

ninolearn.postprocess.prepare.prep_nino_month(index='3.4', detrend=False)[source]¶: Add a time axis corresponding to the first day of the central month.

ninolearn.postprocess.prepare.prep_oni()[source]¶: Add a time axis corresponding to the first day of the central month of a 3-month season. For example, DJF 2019 becomes 2019-01-01. In addition, some axes are renamed.

ninolearn.postprocess.prepare.prep_other_forecasts()[source]¶: Bring the forecasts from other sources into a consistent format.

ninolearn.postprocess.prepare.prep_wwv(cardinal_direction='')[source]¶: Add a time axis corresponding to the first day of the central month of a 3-month season. For example, DJF 2019 becomes 2019-01-01. In addition, some axes are renamed.

ninolearn.postprocess.prepare.prep_wwv_proxy()[source]¶: Make a WWV proxy index that uses the K-index from Bunge and Clarke (2014) for the period between 1955 and 1979.

ninolearn.postprocess.prepare.season_shift_year(season)[source]¶

When the function season_to_month() is applied, the year related to NDJ needs to be shifted by 1.

Parameters: season (string) – Season represented by three letters such as ‘DJF’

ninolearn.postprocess.prepare.season_to_month(season)[source]¶

Translates a 3-month season string to the integer of the last month of that season. The last month is used so that no future information is included when predictions are later made with this data.

Parameters: season (string) – Season represented by three letters such as ‘DJF’
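The interplay of the two season helpers can be pictured with a minimal sketch. It is not the library's implementation: the names `SEASONS`, `last_month`, `year_shift` and `season_to_date` are hypothetical, and the sketch assumes that a season is labelled with the year of its central month, so that only NDJ ends in the following calendar year.

```python
from datetime import date

# The twelve overlapping 3-month seasons, ordered by their central month
# (DJF is centred on January, NDJ on December).
SEASONS = ["DJF", "JFM", "FMA", "MAM", "AMJ", "MJJ",
           "JJA", "JAS", "ASO", "SON", "OND", "NDJ"]


def last_month(season):
    """Return the calendar month (1-12) of the last month of the season."""
    central = SEASONS.index(season) + 1   # DJF -> 1, ..., NDJ -> 12
    return central % 12 + 1               # DJF -> 2, ..., NDJ -> 1


def year_shift(season):
    """Return 1 if the last month falls in the year after the label year."""
    return 1 if season == "NDJ" else 0


def season_to_date(season, year):
    """First day of the last month of a season labelled with `year`."""
    return date(year + year_shift(season), last_month(season), 1)
```

Under these assumptions, `season_to_date("NDJ", 2018)` gives 2019-01-01, because January is the last month of NDJ and belongs to the next year, while `season_to_date("DJF", 2019)` gives 2019-02-01 without any shift.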

This module contains methods to compute seasonal anomalies.

Currently the reference period is 1981-2010.

ninolearn.postprocess.anomaly.computeAnomaly(data)[source]¶: Remove the seasonality

ninolearn.postprocess.anomaly.computeMeanClimatology(data)[source]¶: Monthly means

ninolearn.postprocess.anomaly.computeNormAnomaly(data)[source]¶: Remove the seasonality

ninolearn.postprocess.anomaly.computeStdClimatology(data)[source]¶: Monthly stds
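The idea behind the climatology and anomaly functions can be sketched for a plain 1-D monthly series with NumPy. This is only an illustration: the library works on xarray data, the function names below are hypothetical, and dividing by the monthly standard deviation for the normalized anomaly is an assumption based on the function names, not something the descriptions above state.

```python
import numpy as np


def monthly_climatology(values, months, years, ref=(1981, 2010)):
    """Mean and standard deviation per calendar month over the reference period."""
    in_ref = (years >= ref[0]) & (years <= ref[1])
    mean = np.full(12, np.nan)
    std = np.full(12, np.nan)
    for m in range(1, 13):
        sel = in_ref & (months == m)
        mean[m - 1] = values[sel].mean()
        std[m - 1] = values[sel].std()
    return mean, std


def anomaly(values, months, mean):
    """Subtract the monthly mean climatology (removes the seasonal cycle)."""
    return values - mean[months - 1]


def normalized_anomaly(values, months, mean, std):
    """Anomaly divided by the monthly standard deviation (assumed definition)."""
    return (values - mean[months - 1]) / std[months - 1]


# Synthetic monthly series for 1981-2010: a seasonal cycle plus a +/-1 signal.
years = np.repeat(np.arange(1981, 2011), 12)
months = np.tile(np.arange(1, 13), 30)
values = months.astype(float) + np.where(years % 2 == 0, 1.0, -1.0)

mean, std = monthly_climatology(values, months, years)
anom = anomaly(values, months, mean)
norm_anom = normalized_anomaly(values, months, mean, std)
```

In this synthetic case the seasonal cycle is removed completely, so `anom` contains only the +/-1 signal.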

ninolearn.postprocess.anomaly.postprocess(data, new=False, ref_period=True)[source]¶

Combine all the postprocessing functions in one data routine.

Parameters

data – xarray data array
new – compute the statistics again (default = False)

ninolearn.postprocess.anomaly.saveAnomaly(data, new, compute=True)[source]¶: Save the deviation to the postdir.

ninolearn.postprocess.anomaly.saveNormAnomaly(data, new)[source]¶: Save the deviation to the postdir.

ninolearn.postprocess.anomaly.toPostDir(data, new)[source]¶: Save the basic data to the postdir.

ninolearn.postprocess.regrid.to2_5x2_5(data)[source]¶

Regrids data to the 2.5x2.5 grid of the NCEP reanalysis data set.

Parameters: data – An xarray DataArray or Dataset with dimensions named ‘lat’ and ‘lon’.
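What regridding to a coarser regular grid involves can be sketched with plain NumPy. The sketch below uses separable linear (bilinear) interpolation through two passes of `np.interp`. This is a swapped-in technique for illustration and not necessarily the method the library uses, and the function name `regrid_linear` and the small example grids are hypothetical.

```python
import numpy as np


def regrid_linear(field, lat, lon, new_lat, new_lon):
    """Bilinear regridding of a (lat, lon) field by two passes of np.interp.

    `lat` and `lon` must be increasing. Target coordinates should lie inside
    the source range, because np.interp clamps values outside of it.
    """
    # Pass 1: interpolate every latitude row onto the new longitudes.
    tmp = np.array([np.interp(new_lon, lon, row) for row in field])
    # Pass 2: interpolate every longitude column onto the new latitudes.
    out = np.array([np.interp(new_lat, lat, col) for col in tmp.T]).T
    return out


# Source grid at 1 degree, target grid at 2.5 degrees.
lat = np.arange(-10.0, 11.0, 1.0)
lon = np.arange(100.0, 121.0, 1.0)
new_lat = np.arange(-10.0, 10.1, 2.5)
new_lon = np.arange(100.0, 120.1, 2.5)

# A field that is linear in both coordinates is reproduced exactly.
field = 2.0 * lat[:, None] + 3.0 * lon[None, :]
regridded = regrid_linear(field, lat, lon, new_lat, new_lon)
```

The sketch ignores the periodicity of longitude and any area weighting, both of which matter for real global fields.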

Evolving complex networks¶

class ninolearn.postprocess.network.climateNetwork(*args, **kwds)[source]¶

Child object of the igraph.Graph class for the construction of a complex climate network.

cluster_fraction(size=2)[source]¶

Returns the fraction of the nodes that are part of a cluster of the given size (default: size=2).

Parameters: size (int) – Size of the cluster. Default:2.

corrected_hamming_distance(other_adjacency)[source]¶

Compute the Hamming distance of the climate Network to the provided other Network (by supplying the adjacency of the other Network). Computation is done as described in Radebach et al. (2013).

Parameters: other_adjacency – The adjacency of the other climate Network.

classmethod from_adjacency(adjacency)[source]¶

Generate an igraph network from an adjacency matrix.

Parameters: adjacency (np.ndarray) – The NxN adjacency array.

classmethod from_correalation_matrix(correalation_matrix, threshold=None, edge_density=None)[source]¶

Generate an igraph network from a correlation matrix.

Parameters

correalation_matrix – The NxN correlation matrix that should be used to generate the network.
threshold – If not None but a float between 0 and 1, a network with a fixed global threshold is generated.
edge_density – If not None but a float between 0 and 1, a network with a fixed edge density is generated, in which the strongest links are part of the network.

NOTE: EITHER the threshold OR the edge density method can be used!
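The difference between the threshold method and the edge density method can be sketched on a bare adjacency matrix. The function name `adjacency_from_correlation` is hypothetical, and using the absolute value of the correlation as link strength is an assumption; the class itself returns an igraph network rather than an array.

```python
import numpy as np


def adjacency_from_correlation(corr, threshold=None, edge_density=None):
    """Build a symmetric 0/1 adjacency matrix from an NxN correlation matrix."""
    if (threshold is None) == (edge_density is None):
        raise ValueError("Specify exactly one of threshold or edge_density.")

    n = corr.shape[0]
    strength = np.abs(corr)                 # assumption: link strength is |r|
    rows, cols = np.triu_indices(n, k=1)    # each node pair once, no self-links
    pair_strength = strength[rows, cols]

    if threshold is not None:
        # Fixed global threshold: the number of links varies.
        keep = pair_strength >= threshold
    else:
        # Fixed edge density: keep the strongest share of all possible links.
        n_links = int(round(edge_density * pair_strength.size))
        keep = np.zeros(pair_strength.size, dtype=bool)
        strongest = np.argsort(pair_strength)[::-1][:n_links]
        keep[strongest] = True

    adjacency = np.zeros((n, n), dtype=int)
    adjacency[rows[keep], cols[keep]] = 1
    return adjacency + adjacency.T
```

With a fixed threshold the number of links changes from network to network, whereas a fixed edge density keeps the number of links constant and lets the implied threshold vary.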

giant_fraction()[source]¶: Returns the fraction of the nodes that are part of the giant component.
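Both the giant component fraction and the cluster fraction depend on the sizes of the connected components. The following standalone sketch computes them from an adjacency array without igraph. The function names are hypothetical, and reading "cluster of the given size" as a connected component with exactly that many nodes is an assumption.

```python
import numpy as np


def component_sizes(adjacency):
    """Sizes of the connected components of an undirected network."""
    n = adjacency.shape[0]
    seen = np.zeros(n, dtype=bool)
    sizes = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search over the component that contains `start`.
        stack, size = [start], 0
        seen[start] = True
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in np.flatnonzero(adjacency[node]):
                if not seen[neighbour]:
                    seen[neighbour] = True
                    stack.append(neighbour)
        sizes.append(size)
    return sizes


def giant_component_fraction(adjacency):
    """Share of all nodes that belong to the largest component."""
    return max(component_sizes(adjacency)) / adjacency.shape[0]


def fraction_in_clusters_of_size(adjacency, size=2):
    """Share of all nodes that belong to components with exactly `size` nodes."""
    sizes = component_sizes(adjacency)
    return sum(s for s in sizes if s == size) / adjacency.shape[0]
```

For a network of six nodes with one chain of three nodes, one linked pair and one isolated node, the giant component fraction is 3/6 and the fraction in clusters of size 2 is 2/6.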

hamming_distance(other_adjacency)[source]¶

Compute the Hamming distance of the climate Network to the provided other Network (by supplying the adjacency of the other Network).

Parameters: other_adjacency – The adjacency of the other climate Network.
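A plain Hamming distance between two networks on the same set of nodes can be sketched as follows. The normalization by the number of node pairs is an assumption, the function name is hypothetical, and the correction described in Radebach et al. (2013) is not covered here.

```python
import numpy as np


def hamming_distance_sketch(adjacency_a, adjacency_b):
    """Fraction of node pairs whose link status differs between two networks."""
    a = np.asarray(adjacency_a, dtype=bool)
    b = np.asarray(adjacency_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("Both adjacency matrices must have the same shape.")
    # Compare each undirected node pair once (upper triangle, no diagonal).
    rows, cols = np.triu_indices(a.shape[0], k=1)
    return float(np.mean(a[rows, cols] != b[rows, cols]))
```

A value of 0 means that both networks have identical links, and a value of 1 means that every possible link is present in exactly one of the two networks.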

class ninolearn.postprocess.network.networkMetricsSeries(variable, dataset, processed='anom', threshold=None, edge_density=None, startyear=1948, endyear=2018, window_size=12, lon_min=120, lon_max=260, lat_min=-30, lat_max=30, verbose=0)[source]¶

Class for the computation of network metrics time series

Parameters

variable (str) – The variable for which the network time series should be computed.
dataset (str) – The dataset that should be used to build the network.
processed (str) – Either ‘’, ‘anom’ or ‘normanom’.
threshold (float) – The threshold that the correlation coefficient between two grid points must reach for them to be considered connected.
startyear – The first year for which the network analysis should be done.
endyear – The last year for which the network analysis should be done.
window_size – The size of the window for which the network metrics are computed.
lon_min,lon_max – The minimum and the maximum values of the longitude grid for which the metrics shall be computed (from 0 to 360 degrees east).
lat_min,lat_max – The minimum and the maximum values of the latitude grid for which the metrics shall be computed (from -90 to 90 degrees north).

computeNetworkMetrics(corrcoef)[source]¶

Computes network metrics from a correlation matrix in combination with the threshold that was already given.

Parameters: corrcoef – The correlation matrix.

computeTimeSeries()[source]¶

Compute the evolving complex network time series and the corresponding metrics, and save the results to a csv-file in the data directory.

NOTE: Specify the data directory as ‘datadir’ in the ninolearn.private module, which you must not push to the public repository.

initalizeSeries()[source]¶: Initializes the pandas Series and the array that stores the adjacency of the network from the previous time step.
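The notion of an evolving network can be sketched as a sliding window over a (time, nodes) array: each window yields a correlation matrix, a thresholded network and one metric value. The sketch is an illustration under several assumptions: the window advances by one time step, link strength is the absolute correlation, and edge density stands in for the metrics that the class computes. The function name is hypothetical.

```python
import numpy as np


def edge_density_series(data, window_size=12, threshold=0.9):
    """Edge density of a thresholded correlation network in each window.

    data: array of shape (time, nodes), one column per grid point.
    """
    n_time, n_nodes = data.shape
    rows, cols = np.triu_indices(n_nodes, k=1)
    densities = []
    for start in range(n_time - window_size + 1):
        window = data[start:start + window_size]
        # Correlation between the columns (grid points) within this window.
        corr = np.corrcoef(window, rowvar=False)
        linked = np.abs(corr[rows, cols]) >= threshold
        densities.append(linked.mean())
    return np.array(densities)
```

Metrics that compare consecutive networks, such as the Hamming distance, additionally require the adjacency of the previous window to be kept between iterations.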

Principal Component Analysis¶

class ninolearn.postprocess.pca.pca(n_components=None, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', random_state=None)[source]¶

This class extends the PCA class of the sklearn.decomposition.pca module. It facilitates the loading of the data from the postprocessed directory, wraps the fit function of the PCA class, has a saving routine for the computed pca component and can plot the EOF to get more insight into the results.

component_map_(eof=1)[source]¶

Returns the components as a map.

Parameters: eof – The leading eof (default:1).

compute_pca()[source]¶: Simple wrapper around the PCA.fit() method.

load_data(variable, dataset, processed='anom', startyear=1949, endyear=2018, lon_min=120, lon_max=280, lat_min=-30, lat_max=30)[source]¶

Load data for PCA analysis from the desired postprocessed data set.

Parameters

variable (str) – The variable for which the PCA will be done.
dataset (str) – The data set that should be used for the PCA.
processed (str) – Either ‘’, ‘anom’ or ‘normanom’.
startyear – The start year for the time series for which the PCA is done.
endyear – The last year for the time series for which the PCA is done.
lon_min,lon_max – The minimum and the maximum values of the longitude grid for which the PCA shall be computed (from 0 to 360 degrees east).
lat_min,lat_max – The minimum and the maximum values of the latitude grid for which the PCA shall be computed (from -90 to 90 degrees north).

pc_projection(eof=1)[source]¶

Returns the amplitude timeseries of the specified eof.

Parameters: eof – The nth leading eof (default:1).

plot_eof()[source]¶: Make a plot for the first leading EOFs.

save(extension='', filename=None)[source]¶: Saves the first three pca components to a csv-file.

set_eof_array(data)[source]¶: Generates the array that will be analyzed with the EOF.
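The steps that the class wraps (flattening the field, fitting the decomposition, mapping components back to the grid and projecting onto them) can be sketched with a plain NumPy SVD. This deliberately swaps scikit-learn's PCA for `np.linalg.svd` to keep the example self-contained. The function name `eof_analysis` is hypothetical, and missing values such as land points are not handled.

```python
import numpy as np


def eof_analysis(field, n_components=3):
    """EOF analysis of a (time, lat, lon) field via SVD.

    Returns the EOF maps, shape (n_components, lat, lon), and the amplitude
    time series (principal components), shape (time, n_components).
    """
    n_time, n_lat, n_lon = field.shape
    # Flatten the spatial dimensions: one row per time step.
    flat = field.reshape(n_time, n_lat * n_lon)
    flat = flat - flat.mean(axis=0)              # centre every grid point
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    # Rows of vt are the spatial patterns; reshape them back to maps.
    eofs = vt[:n_components].reshape(n_components, n_lat, n_lon)
    # Amplitude time series: projection of the data onto each pattern.
    pcs = u[:, :n_components] * s[:n_components]
    return eofs, pcs
```

Multiplying the first amplitude time series with the first EOF map reconstructs the part of the centred field that is explained by the leading mode.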