Preprocess¶
Data preparation¶
This module is a collection of methods needed to convert the raw data to a common format.
-
ninolearn.preprocess.prepare.
prep_K_index
()[source]¶ Edits the Kirimati index from Bunge and Clarke (2014).
-
ninolearn.preprocess.prepare.
prep_nino_month
(index='3.4', detrend=False)[source]¶ Add a time axis corresponding to the first day of the central month.
-
ninolearn.preprocess.prepare.
prep_oni
()[source]¶ Add a time axis corresponding to the first day of the central month of a 3-month season. For example: DJF 2019 becomes 2019-01-01. Also rename some axes.
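The time-axis convention described above can be sketched as follows. This is a minimal illustration, not the ninolearn implementation: the helper `central_month_date` and the lookup table are hypothetical, and the sketch assumes that the year in a season label refers to the year of the central month.

```python
from datetime import date

# Hypothetical lookup: overlapping 3-month seasons ordered by central month.
SEASONS = ["DJF", "JFM", "FMA", "MAM", "AMJ", "MJJ",
           "JJA", "JAS", "ASO", "SON", "OND", "NDJ"]
CENTRAL_MONTH = {season: i + 1 for i, season in enumerate(SEASONS)}


def central_month_date(season: str, year: int) -> date:
    """Return the first day of the central month of a 3-month season.

    Assumes `year` is the year of the central month (an assumption of this
    sketch, not a documented property of the raw data).
    """
    return date(year, CENTRAL_MONTH[season], 1)


print(central_month_date("DJF", 2019))  # 2019-01-01
```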
-
ninolearn.preprocess.prepare.
prep_other_forecasts
()[source]¶ Convert other forecasts into a consistent format.
-
ninolearn.preprocess.prepare.
prep_wwv
(cardinal_direction='')[source]¶ Add a time axis corresponding to the first day of the central month of a 3-month season. For example: DJF 2019 becomes 2019-01-01. Also rename some axes.
-
ninolearn.preprocess.prepare.
prep_wwv_proxy
()[source]¶ Make a WWV proxy index that uses the K-index from Bunge and Clarke (2014) for the period between 1955 and 1979.
-
ninolearn.preprocess.prepare.
season_shift_year
(season)[source]¶ When the function season_to_month() is applied, the year related to NDJ needs to be shifted by 1.
- Parameters
season (string) – Season represented by three letters such as ‘DJF’
-
ninolearn.preprocess.prepare.
season_to_month
(season)[source]¶ Translates a 3-month season string to the integer of the last month of the season (so that no future information is included when predictions are made later with this data).
- Parameters
season (string) – Season represented by three letters such as ‘DJF’
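The two season helpers above can be illustrated with a small sketch. The names `last_month_of_season` and `year_offset` are hypothetical stand-ins, not the ninolearn functions, and the sketch assumes the year in a season label is the year of the central month, so that only NDJ ends in the following calendar year.

```python
# Overlapping 3-month seasons ordered by central month (January = 1).
SEASONS = ["DJF", "JFM", "FMA", "MAM", "AMJ", "MJJ",
           "JJA", "JAS", "ASO", "SON", "OND", "NDJ"]


def last_month_of_season(season: str) -> int:
    """Return the month number (1-12) of the last month of the season."""
    central = SEASONS.index(season) + 1
    return central % 12 + 1  # month after the central month, wrapping Dec -> Jan


def year_offset(season: str) -> int:
    """Return 1 if the last month falls in the year after the label year."""
    return 1 if season == "NDJ" else 0


for season in ("DJF", "OND", "NDJ"):
    print(season, last_month_of_season(season), year_offset(season))
```

Under these assumptions, NDJ 2018 maps to January 2019, while DJF 2019 maps to February 2019 with no shift.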
Regrid data¶
Compute anomalies¶
This module contains methods to compute seasonal anomalies.
Currently the reference period is 1981-2010.
-
ninolearn.preprocess.anomaly.
postprocess
(data, new=False, ref_period=True)[source]¶ Combine all the postprocessing functions in one data routine.
- Parameters
data – xarray data array
new – compute the statistics again (default = False)
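To make the idea of an anomaly relative to a reference period concrete, here is a minimal NumPy sketch. It is not the ninolearn routine: the function `monthly_anomaly` is hypothetical, it works on a plain 1-D monthly series rather than an xarray data array, and it assumes the climatology is a per-calendar-month mean over 1981-2010.

```python
import numpy as np


def monthly_anomaly(values, start_year, ref_start=1981, ref_end=2010):
    """Subtract the monthly climatology of the reference period.

    `values` is a 1-D monthly series whose first entry is January of
    `start_year` and whose length is a multiple of 12.
    """
    by_year = np.asarray(values, dtype=float).reshape(-1, 12)  # (years, months)
    years = start_year + np.arange(by_year.shape[0])
    in_ref = (years >= ref_start) & (years <= ref_end)
    climatology = by_year[in_ref].mean(axis=0)  # one mean per calendar month
    return (by_year - climatology).ravel()


# A pure seasonal cycle has zero anomaly everywhere.
seasonal_cycle = np.tile(np.arange(12.0), 32)  # 1980 through 2011
anom = monthly_anomaly(seasonal_cycle, start_year=1980)
print(np.allclose(anom, 0.0))  # True
```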
Evolving complex networks¶
-
class
ninolearn.preprocess.network.
climateNetwork
(*args, **kwds)[source]¶ Child object of the igraph.Graph class for the construction of a complex climate network.
-
cluster_fraction
(size=2)[source]¶ Returns the fraction of the nodes that are part of a cluster of the given size (default: size=2).
- Parameters
size (int) – Size of the cluster. Default: 2.
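The quantity described above can be sketched in plain Python. This illustration assumes that a "cluster of the given size" means a connected component with exactly that many nodes; the function name `cluster_fraction_sketch` is hypothetical and does not use igraph.

```python
from collections import deque


def cluster_fraction_sketch(adjacency, size=2):
    """Fraction of nodes in connected components with exactly `size` nodes."""
    n = len(adjacency)
    seen = [False] * n
    nodes_in_clusters = 0
    for start in range(n):
        if seen[start]:
            continue
        # Breadth-first search collects one connected component.
        seen[start] = True
        queue = deque([start])
        component_size = 0
        while queue:
            node = queue.popleft()
            component_size += 1
            for neighbor, linked in enumerate(adjacency[node]):
                if linked and not seen[neighbor]:
                    seen[neighbor] = True
                    queue.append(neighbor)
        if component_size == size:
            nodes_in_clusters += component_size
    return nodes_in_clusters / n


# Five nodes: one pair (0-1) and one chain of three (2-3-4).
adjacency = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
print(cluster_fraction_sketch(adjacency))  # 0.4
```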
-
corrected_hamming_distance
(other_adjacency)[source]¶ Compute the Hamming distance of the climate network to another network, given the adjacency matrix of the other network. The computation follows Radebach et al. (2013).
- Parameters
other_adjacency – The adjacency matrix of the other climate network.
-
classmethod
from_adjacency
(adjacency)[source]¶ Generate an igraph network from an adjacency matrix.
- Parameters
adjacency (np.ndarray) – The NxN adjacency array.
-
classmethod
from_correalation_matrix
(correalation_matrix, threshold=None, edge_density=None)[source]¶ Generate an igraph network from a correlation matrix.
- Parameters
correalation_matrix – The NxN correlation matrix that should be used to generate the network.
threshold – If not None but a float between 0 and 1, a network with a fixed global threshold is generated.
edge_density – If not None but a float between 0 and 1, a network with a fixed edge density is generated, in which only the strongest links are part of the network.
NOTE: Either the threshold or the edge density method can be used, not both.
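The difference between the two construction methods can be sketched with NumPy. This is an illustration under stated assumptions, not the ninolearn code: the function `adjacency_from_correlation` is hypothetical, link strength is taken as the absolute correlation, and the result is a plain 0/1 array rather than an igraph object.

```python
import numpy as np


def adjacency_from_correlation(corr, threshold=None, edge_density=None):
    """Build a symmetric 0/1 adjacency matrix from an NxN correlation matrix."""
    if (threshold is None) == (edge_density is None):
        raise ValueError("Provide exactly one of threshold or edge_density.")
    strength = np.abs(np.asarray(corr, dtype=float))  # assumption: use |r|
    if edge_density is not None:
        # Pick the cut-off so that the strongest links make up the requested
        # fraction of all node pairs (ties may add a few extra links).
        upper = strength[np.triu_indices_from(strength, k=1)]
        n_edges = int(round(edge_density * upper.size))
        if n_edges == 0:
            return np.zeros(strength.shape, dtype=int)
        cutoff = np.sort(upper)[-n_edges]
        adjacency = (strength >= cutoff).astype(int)
    else:
        adjacency = (strength > threshold).astype(int)
    np.fill_diagonal(adjacency, 0)  # no self-loops
    return adjacency


corr = np.array([[1.0, 0.9, 0.2],
                 [0.9, 1.0, 0.5],
                 [0.2, 0.5, 1.0]])
print(adjacency_from_correlation(corr, threshold=0.4))
print(adjacency_from_correlation(corr, edge_density=1 / 3))
```

With a global threshold the number of links varies with the data, whereas a fixed edge density keeps the number of links constant, which makes networks from different time windows easier to compare.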
-
-
class
ninolearn.preprocess.network.
networkMetricsSeries
(variable, dataset, processed='anom', threshold=None, edge_density=None, startyear=1948, endyear=2018, window_size=12, lon_min=120, lon_max=260, lat_min=-30, lat_max=30, verbose=0)[source]¶ Class for the computation of network metrics time series
- Parameters
variable (str) – The variable for which the network time series should be computed.
dataset (str) – The dataset that should be used to build the network.
processed (str) – Either ‘’, ‘anom’ or ‘normanom’.
threshold (float) – The threshold that the correlation coefficient between two grid points must exceed for them to be considered connected.
startyear – The first year for which the network analysis should be done.
endyear – The last year for which the network analysis should be done.
window_size – The size of the window for which the network metrics are computed.
lon_min, lon_max – The minimum and maximum values of the longitude grid for which the metrics shall be computed (from 0 to 360 degrees east).
lat_min, lat_max – The minimum and maximum values of the latitude grid for which the metrics shall be computed (from -90 to 90 degrees north).
-
computeNetworkMetrics
(corrcoef)[source]¶ Computes network metrics from a correlation matrix in combination with the already given threshold.
- Parameters
corrcoef – The correlation matrix.
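How a series of correlation matrices might be obtained from a sliding window can be sketched as follows. The generator `windowed_correlations` is hypothetical; it assumes the data have already been flattened to shape (time, grid points) and that the window advances by one time step, which may differ from what the class does internally.

```python
import numpy as np


def windowed_correlations(data, window_size=12):
    """Yield (start_index, correlation matrix) for each sliding window.

    `data` has shape (time, grid_points).
    """
    data = np.asarray(data, dtype=float)
    for start in range(data.shape[0] - window_size + 1):
        window = data[start:start + window_size]
        # rowvar=False: columns (grid points) are the variables.
        yield start, np.corrcoef(window, rowvar=False)


rng = np.random.default_rng(0)
data = rng.standard_normal((36, 4))  # 36 time steps, 4 grid points
results = list(windowed_correlations(data, window_size=12))
print(len(results), results[0][1].shape)  # 25 (4, 4)
```

Each matrix could then be turned into a network and summarized by a metric, giving one value per window.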
Principal Component Analysis¶
-
class
ninolearn.preprocess.pca.
pca
(n_components=None, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', random_state=None)[source]¶ This class extends the PCA class of the sklearn.decomposition.pca module. It facilitates loading the data from the postprocessed directory, wraps the fit function of the PCA class, has a saving routine for the computed PCA components, and can plot the EOFs to give more insight into the results.
-
component_map_
(eof=1)[source]¶ Returns the components as a map.
- Parameters
eof – The number of the EOF to return (default: 1).
-
load_data
(variable, dataset, processed='anom', startyear=1949, endyear=2018, lon_min=120, lon_max=280, lat_min=-30, lat_max=30)[source]¶ Load data for PCA analysis from the desired postprocessed data set.
- Parameters
variable (str) – The variable for which the PCA will be done.
dataset (str) – The data set that should be used for the PCA.
processed (str) – Either ‘’, ‘anom’ or ‘normanom’.
startyear – The start year for the time series for which the PCA is done.
endyear – The last year for the time series for which the PCA is done.
lon_min, lon_max – The minimum and maximum values of the longitude grid for which the PCA shall be computed (from 0 to 360 degrees east).
lat_min, lat_max – The minimum and maximum values of the latitude grid for which the PCA shall be computed (from -90 to 90 degrees north).
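The relationship between a gridded field, its principal components, and the EOF maps can be sketched with NumPy. This sketch does not use the class documented here or scikit-learn; it computes the PCA directly through a singular value decomposition, and the function `eof_analysis` is hypothetical. It also assumes a field without missing values.

```python
import numpy as np


def eof_analysis(field, n_components=2):
    """PCA of a (time, lat, lon) field via SVD of the flattened anomalies."""
    field = np.asarray(field, dtype=float)
    n_time, n_lat, n_lon = field.shape
    flat = field.reshape(n_time, n_lat * n_lon)
    flat = flat - flat.mean(axis=0)  # center each grid point in time
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    # Rows of vt are spatial patterns; reshape them back to maps.
    eofs = vt[:n_components].reshape(n_components, n_lat, n_lon)
    pcs = u[:, :n_components] * s[:n_components]  # time series per component
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return eofs, pcs, explained


# Synthetic field: one spatial pattern modulated in time, plus weak noise.
rng = np.random.default_rng(0)
pattern = rng.standard_normal((3, 4))
signal = np.sin(np.linspace(0, 6, 50))
field = signal[:, None, None] * pattern + 0.01 * rng.standard_normal((50, 3, 4))

eofs, pcs, explained = eof_analysis(field)
print(eofs.shape, pcs.shape)  # (2, 3, 4) (50, 2)
```

Reshaping a component back to the latitude-longitude grid is what makes it plottable as a map.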
-