dingo.core.nn package

Submodules

dingo.core.nn.cfnets module

class dingo.core.nn.cfnets.ContinuousFlow(continuous_flow_net: Module, context_embedding_net: Module = Identity(), theta_embedding_net: Module = Identity(), context_with_glu: bool = False, theta_with_glu: bool = False)

Bases: Module

A continuous normalizing flow network. It defines a time-dependent vector field on the parameter space (score or flow), which optionally depends on additional context information.

v = v(f(t, theta), g(context))

This class combines the network v for the continuous flow itself with the embedding networks f and g for the parameters and the context, respectively.

The parameters and context can optionally be provided as gated linear unit (GLU) context to the main network, rather than as the main input to the network. For a DenseResidualNet, this context is input repeatedly via GLUs, for each residual block.

Parameters:

continuous_flow_net (nn.Module) – Main network for the continuous flow.
context_embedding_net (nn.Module = torch.nn.Identity()) – Embedding network for the context information (e.g., observed data).
theta_embedding_net (nn.Module = torch.nn.Identity()) – Embedding network for the parameters.
context_with_glu (bool = False) – Whether to provide context as GLU or main input to the continuous_flow_net.
theta_with_glu (bool = False) – Whether to provide theta (and t) as GLU or main input to the continuous_flow_net.
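The effect of the two GLU flags can be illustrated with a minimal sketch. It assumes that plain Python lists stand in for tensors and that list concatenation stands in for concatenation along the feature dimension. The helper name route_inputs and the concatenation order are illustrative assumptions, not the dingo implementation.

```python
def route_inputs(t_theta_emb, context_emb, context_with_glu=False, theta_with_glu=False):
    """Split embedded inputs into (main_input, glu_context) for the flow network.

    Hypothetical helper: lists stand in for tensors, and list concatenation
    stands in for concatenation along the feature dimension.
    """
    main_input, glu_context = [], []
    # Each embedding goes either to the GLU context or to the main input.
    (glu_context if theta_with_glu else main_input).extend(t_theta_emb)
    (glu_context if context_with_glu else main_input).extend(context_emb)
    # An empty GLU context is represented by None.
    return main_input, (glu_context or None)


t_theta_emb = [0.5, 1.0, 2.0]   # f(t, theta): embedded time and parameters
context_emb = [10.0, 20.0]      # g(context): embedded observed data

both_main = route_inputs(t_theta_emb, context_emb)
ctx_glu = route_inputs(t_theta_emb, context_emb, context_with_glu=True)
```

With both flags False, everything is concatenated into the main input; with context_with_glu=True, only the embedded (t, theta) remains as main input and the embedded context is passed as GLU context.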

forward(t, theta, *context)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

property use_cache

class dingo.core.nn.cfnets.PositionalEncoding(nr_frequencies, encode_all=True, base_freq=6.283185307179586)

Bases: Module

Implements positional encoding as commonly used in transformer architectures.

Positional encoding introduces a way to inject information about the order of the input data (e.g., sequence positions) into a neural network that otherwise lacks a sense of position due to its permutation-invariant nature. This class computes sinusoidal encodings based on the position of each element in the input and concatenates them with the original input features.

frequencies

A tensor containing the frequencies used to calculate the sinusoidal components. The frequencies are powers of 2, scaled by the base frequency.

Type: torch.Tensor

encode_all

Determines whether the positional encoding is applied to all features of the input or only the first feature (e.g., time component).

Type: bool

base_freq

The base frequency used to scale the sinusoidal components, defaulting to 2 * pi.

Type: float

Parameters:

nr_frequencies (int) – The number of sinusoidal frequencies to compute. This determines the dimensionality of the positional encoding for each input feature.
encode_all (bool, optional (default=True)) – If True, the positional encoding is computed for all features in the input. Otherwise, it is computed only for the first feature (e.g., the time dimension).
base_freq (float, optional (default=2 * np.pi)) – The base frequency used for sinusoidal encoding.

forward(t_theta): Computes the positional encoding for the input tensor t_theta and concatenates it with the original input features. - If encode_all is True, the positional encoding is computed for all features. - If encode_all is False, the positional encoding is applied only to the first

feature, such as time, while other features remain unchanged.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(t_theta)

Computes and concatenates positional encodings with the input tensor.

Parameters:

t_theta (torch.Tensor) – Input tensor of shape (batch_size, input_dim), where input_dim is the dimensionality of the input features.

Returns:

A tensor containing the input features concatenated with the positional encodings. The output shape is:

- (batch_size, input_dim + 2 * nr_frequencies * input_dim) if encode_all is True, since every input feature is encoded.
- (batch_size, input_dim + 2 * nr_frequencies) if encode_all is False, since positional encodings are computed only for the first input feature.

Return type:

torch.Tensor
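The encoding can be sketched in pure Python for a single sample (no batch dimension). This is an illustration under stated assumptions, not the dingo code: each encoded feature is assumed to contribute one cosine and one sine term per frequency, and the ordering of the appended terms (all cosines, then all sines) is a choice made for this sketch.

```python
import math


def positional_encoding(t_theta, nr_frequencies, encode_all=True, base_freq=2 * math.pi):
    """Return t_theta followed by cosine and sine encodings (one sample, no batch)."""
    # Frequencies are powers of 2 scaled by the base frequency.
    frequencies = [base_freq * 2**k for k in range(nr_frequencies)]
    # Encode every feature, or only the first one (e.g. time).
    encoded = t_theta if encode_all else t_theta[:1]
    cos_enc = [math.cos(x * f) for x in encoded for f in frequencies]
    sin_enc = [math.sin(x * f) for x in encoded for f in frequencies]
    return list(t_theta) + cos_enc + sin_enc


sample = [0.25, 1.0, -0.5]  # (t, theta_1, theta_2)
enc_all = positional_encoding(sample, nr_frequencies=2, encode_all=True)
enc_first = positional_encoding(sample, nr_frequencies=2, encode_all=False)
```

Under these assumptions, enc_all has 3 + 2 * 2 * 3 entries and enc_first has 3 + 2 * 2 entries; in both cases the original features come first.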

dingo.core.nn.cfnets.create_cf(posterior_kwargs: dict, embedding_kwargs: dict | None = None, initial_weights: dict | None = None)

Build a continuous flow based on settings dictionaries.

Parameters:

posterior_kwargs (dict) – Settings for the flow. This includes the settings for the parameter embedding.
embedding_kwargs (dict) – Settings for the context embedding network.
initial_weights (dict) – Initial weights for the embedding network (of SVD projection type).

Returns:

Neural network for the continuous flow.

Return type:

nn.Module

dingo.core.nn.cfnets.get_dim_positional_embedding(encoding: dict, input_dim: int)

dingo.core.nn.cfnets.get_theta_embedding_net(embedding_kwargs: dict, input_dim)

dingo.core.nn.enets module

Implementation of embedding networks.

class dingo.core.nn.enets.DenseResidualNet(input_dim: int, output_dim: int, hidden_dims: ~typing.Tuple, activation: ~typing.Callable = <function elu>, dropout: float = 0.0, batch_norm: bool = True, context_features: int | None = None)

Bases: Module

A nn.Module consisting of a sequence of dense residual blocks. This is used to embed high dimensional input to a compressed output. Linear resizing layers are used for resizing the input and output to match the first and last hidden dimension, respectively.

Module specs

input dimension: (batch_size, input_dim)
output dimension: (batch_size, output_dim)

Parameters:

input_dim (int) – dimension of the input to this module
output_dim (int) – output dimension of this module
hidden_dims (tuple) – tuple with dimensions of hidden layers of this module
activation (callable) – activation function used in residual blocks
dropout (float) – dropout probability for residual blocks, used for regularization
batch_norm (bool) – flag that specifies whether to use batch normalization
context_features (int) – Number of additional context features, which are provided to the residual blocks via gated linear units. If None, no additional context is expected.
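How context can enter a residual block through a gated linear unit is sketched below in pure Python. The sketch assumes the gating form GLU(a, b) = a * sigmoid(b) and a single dense layer per block; activation, dropout and batch normalization are omitted. All function names are hypothetical and the block structure is a simplification, not the layer layout used by DenseResidualNet.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear(x, weight, bias):
    """Dense layer: weight has one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def residual_block_with_glu(x, weight, bias, ctx_weight, ctx_bias, context):
    """Compute x + GLU(linear(x), linear(context)) with GLU(a, b) = a * sigmoid(b)."""
    hidden = linear(x, weight, bias)
    gate = linear(context, ctx_weight, ctx_bias)
    gated = [h * sigmoid(g) for h, g in zip(hidden, gate)]
    # Skip connection: the block learns a correction to its input.
    return [xi + gi for xi, gi in zip(x, gated)]


x = [1.0, 2.0]
identity_weight = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
# Zero context weights give a gate of sigmoid(0) = 0.5 for every unit.
out = residual_block_with_glu(x, identity_weight, zero_bias, [[0.0], [0.0]], zero_bias, [3.0])
```

Because the context only modulates the gate, the same context vector can be fed to every residual block without changing the width of the main path.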

forward(x, context=None)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class dingo.core.nn.enets.LinearProjectionRB(input_dims: List[int], n_rb: int, V_rb_list: Tuple | None)

Bases: Module

A compression layer that reduces the input dimensionality via projection onto a reduced basis. The input data is of shape (batch_size, num_blocks, num_channels, num_bins). Each of the num_blocks blocks (for GW use case: block=detector) is treated independently.

A single block consists of 1D data with num_bins bins (e.g. GW use case: num_bins=number of frequency bins). It has num_channels >= 2 different channels; channels 0 and 1 store the real and imaginary parts of the signal. Channels with index >= 2 are used for auxiliary signals (such as PSD for GW use case).

This layer compresses the complex signal in channels 0 and 1 to n_rb reduced-basis (rb) components. This is achieved by initializing the weights of this layer with the rb matrix V, such that the (2*n_rb) dimensional output of each block is the concatenation of the real and imaginary part of the reduced basis projection of the complex signal in channel 0 and 1. The projection of the auxiliary channels with index >=2 onto these components is initialized with 0.

Module specs

input dimension: (batch_size, num_blocks, num_channels, num_bins)
output dimension: (batch_size, 2 * n_rb * num_blocks)

Parameters:

input_dims (list) – dimensions of input batch, omitting batch dimension; input_dims = [num_blocks, num_channels, num_bins]
n_rb (int) – number of reduced basis elements used for projection; the output dimension of the layer is 2 * n_rb * num_blocks
V_rb_list (tuple of np.arrays, or None) – tuple with V matrices of the reduced basis SVD projection, convention for SVD matrix decomposition: U @ s @ V^h; if None, the layer is not initialized with the reduced basis projection, which is useful when loading a saved model
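The projection of a single block at initialization can be sketched with Python's built-in complex numbers. This is a sketch under assumptions: the projection coefficients are taken to be signal @ V restricted to the first n_rb columns, and the helper name rb_project is hypothetical. Auxiliary channels are left out because their weights are initialized with 0 and therefore do not contribute at initialization.

```python
def rb_project(signal, V, n_rb):
    """Project a complex signal onto the first n_rb columns of V.

    signal: list of num_bins complex numbers (channel 0 + 1j * channel 1).
    V: num_bins x num_el matrix (list of rows) with num_el >= n_rb.
    Returns 2 * n_rb real numbers: all real parts, then all imaginary parts.
    """
    coeffs = [sum(signal[i] * V[i][k] for i in range(len(signal))) for k in range(n_rb)]
    return [c.real for c in coeffs] + [c.imag for c in coeffs]


# One block with num_bins = 3; V has num_el = 2 columns.
signal = [1 + 2j, 0 + 1j, 3 + 0j]
V = [[1, 0], [0, 1j], [0, 0]]
out = rb_project(signal, V, n_rb=2)
```

Applying this to each of the num_blocks blocks independently and concatenating the results gives the 2 * n_rb * num_blocks output features per sample.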

forward(x, **_): RB projection. Additional kwargs (like context) are ignored.

init_layers(V_rb_list): Loop through layers and initialize them individually with the corresponding rb projection. V_rb_list is a list that contains the rb matrix V for each block. Each matrix V in V_rb_list is represented with a numpy array of shape (self.num_bins, num_el), where num_el >= self.n_rb.

property input_dim

property output_dim

test_dimensions(V_rb_list): Test if input dimensions to this layer are consistent with each other, and the reduced basis matrices V.

class dingo.core.nn.enets.ModuleMerger(module_list: Tuple)

Bases: Module

This is a wrapper used to process multiple different kinds of context information collected in x = (x_0, x_1, …). For each kind of context information x_i, an individual embedding network is provided in enets = (enet_0, enet_1, …). The embedded output of the forward method is the concatenation of the individual embeddings enet_i(x_i).

In the GW use case, this wrapper can be used to embed the high-dimensional signal input into a lower dimensional feature vector with a large embedding network, while applying an identity embedding to the time shifts.

Module specs

input dimension: (batch_size, …), (batch_size, …), …
output dimension: (batch_size, ?)

Parameters:

module_list (tuple) – nn.Modules for embedding networks, use torch.nn.Identity for identity mappings
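The merging behaviour can be sketched with plain functions and lists. The names merge_embeddings, compress and identity are hypothetical, and compress is only a toy stand-in for a large embedding network.

```python
def merge_embeddings(enets, x):
    """Apply enets[i] to x[i] and concatenate the individual embeddings."""
    if len(enets) != len(x):
        raise ValueError("need exactly one embedding network per input")
    merged = []
    for enet, x_i in zip(enets, x):
        merged.extend(enet(x_i))
    return merged


def compress(signal):
    """Toy stand-in for a large embedding network: keep the mean and the max."""
    return [sum(signal) / len(signal), max(signal)]


def identity(z):
    return z


strain = [1.0, 2.0, 3.0, 6.0]   # high-dimensional signal input
time_shifts = [0.01, -0.02]     # passed through unchanged
features = merge_embeddings((compress, identity), (strain, time_shifts))
```

The output length is the sum of the individual embedding sizes, which is why the output dimension in the module specs is not fixed in advance.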

forward(*x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

dingo.core.nn.enets.create_enet_with_projection_layer_and_dense_resnet(input_dims: List[int], V_rb_list: Tuple | None, output_dim: int, hidden_dims: Tuple, svd: dict, activation: str = 'elu', dropout: float = 0.0, batch_norm: bool = True, added_context: bool = False)

Builder function for 2-stage embedding network for 1D data with multiple blocks and channels. Module 1 is a linear layer initialized as the projection of the complex signal onto reduced basis components via the LinearProjectionRB, where the blocks are kept separate. See docstring of LinearProjectionRB for details. Module 2 is a sequence of dense residual layers, that is used to further reduce the dimensionality.

The projection requires the complex signal to be represented via the real part in channel 0 and the imaginary part in channel 1. Auxiliary signals may be contained in channels with indices >= 2. In the GW use case, a block corresponds to a detector and channel 2 is used for ASD information.

If added_context = True, the 2-stage embedding network described above is merged with an identity mapping via ModuleMerger. Then, the expected input is not x with x.shape = (batch_size, num_blocks, num_channels, num_bins), but rather the tuple (x, z), where z is additional context information. The output of the full module is then the concatenation of enet(x) and z. In the GW use case, this is used to concatenate the applied time shifts z to the embedded feature vector of the strain data enet(x).

Module specs

For added_context == False:

input dimension: (batch_size, num_blocks, num_channels, num_bins)
output dimension: (batch_size, output_dim)

For added_context == True:

input dimension: (batch_size, num_blocks, num_channels, num_bins), (batch_size, N)
output dimension: (batch_size, output_dim + N)

Parameters:

input_dims (list) – dimensions of input batch, omitting batch dimension; input_dims = (num_blocks, num_channels, num_bins)
n_rb (int) – number of reduced basis elements used for projection; the output dimension of the layer is 2 * n_rb * num_blocks
V_rb_list (tuple of np.arrays, or None) – tuple with V matrices of the reduced basis SVD projection, convention for SVD matrix decomposition: U @ s @ V^h; if None, the layer is not initialized with the reduced basis projection, which is useful when loading a saved model
output_dim (int) – output dimension of the full module
hidden_dims (tuple) – tuple with dimensions of hidden layers of module 2
activation (str) – str that specifies the activation function used in residual blocks
dropout (float) – dropout probability for residual blocks, used for regularization
batch_norm (bool) – flag that specifies whether to use batch normalization
added_context (bool) – if set to True, additional context z is concatenated to the embedded feature vector enet(x); note that in this case, the expected input is a tuple with 2 elements, input = (x, z), rather than just the tensor x.

Return type:

nn.Module
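The shape bookkeeping of the two stages can be checked with a small helper. This is a hypothetical function written for illustration; it takes n_rb as an explicit argument and does not build any network.

```python
def embedding_output_shapes(batch_size, input_dims, n_rb, output_dim, context_dim=None):
    """Return the tensor shapes after each stage of the 2-stage embedding."""
    num_blocks, num_channels, num_bins = input_dims
    if num_channels < 2:
        raise ValueError("channels 0 and 1 must hold the real and imaginary parts")
    shapes = {
        "input": (batch_size, num_blocks, num_channels, num_bins),
        # Module 1: each block is projected to 2 * n_rb real components.
        "after_projection": (batch_size, 2 * n_rb * num_blocks),
        # Module 2: the dense residual network reduces to output_dim.
        "after_resnet": (batch_size, output_dim),
    }
    # With added context z of shape (batch_size, N), z is concatenated at the end.
    if context_dim is not None:
        shapes["output"] = (batch_size, output_dim + context_dim)
    else:
        shapes["output"] = shapes["after_resnet"]
    return shapes


shapes = embedding_output_shapes(4, [2, 3, 8000], n_rb=200, output_dim=128, context_dim=2)
```

The example values (2 blocks, 3 channels, 8000 bins, n_rb = 200) are made up for the illustration.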

dingo.core.nn.nsf module

Implementation of the neural spline flow (NSF). Most of this code is adapted from the uci.py example from https://github.com/bayesiains/nsf.

class dingo.core.nn.nsf.FlowWrapper(flow: Flow, embedding_net: Module | None = None)

Bases: Module

This class wraps the neural spline flow. It is required for two reasons: (i) some embedding networks take tuples as input, which is not supported by the nflows package; (ii) parallelization across multiple GPUs requires a forward method, but the relevant flow method for training is log_prob.

Parameters:

flow – flows.base.Flow
embedding_net – nn.Module
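The wrapping pattern can be sketched without any deep learning library. ToyFlow and WrappedFlow are hypothetical stand-ins written for this illustration; in particular, how the wrapper behaves when no embedding network is given is an assumption of the sketch.

```python
import math


class ToyFlow:
    """Hypothetical stand-in for a flow: unit normal centred on the context."""

    def log_prob(self, y, context):
        return [
            -0.5 * (yi - ci) ** 2 - 0.5 * math.log(2 * math.pi)
            for yi, ci in zip(y, context)
        ]


class WrappedFlow:
    """Embeds a tuple of inputs, then exposes log_prob through forward."""

    def __init__(self, flow, embedding_net=None):
        self.flow = flow
        self.embedding_net = embedding_net

    def log_prob(self, y, *x):
        # The embedding net receives the unpacked tuple; the flow sees one context.
        context = self.embedding_net(*x) if self.embedding_net is not None else x[0]
        return self.flow.log_prob(y, context)

    def forward(self, y, *x):
        # Data-parallel wrappers dispatch through forward, so delegate to log_prob.
        return self.log_prob(y, *x)


def add_inputs(a, b):
    """Toy embedding network that takes two inputs."""
    return [ai + bi for ai, bi in zip(a, b)]


wrapped = WrappedFlow(ToyFlow(), embedding_net=add_inputs)
log_p = wrapped.forward([1.0], [0.5], [0.5])
```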

forward(y, *x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

log_prob(y, *x)

sample(*x, num_samples=1)

sample_and_log_prob(*x, num_samples=1)

dingo.core.nn.nsf.create_base_transform(i: int, param_dim: int, context_dim: int | None = None, hidden_dim: int = 512, num_transform_blocks: int = 2, activation: str = 'relu', dropout_probability: float = 0.0, batch_norm: bool = False, num_bins: int = 8, tail_bound: float = 1.0, apply_unconditional_transform: bool = False, base_transform_type: str = 'rq-coupling')

Build a base NSF transform of y, conditioned on x.

This uses the PiecewiseRationalQuadraticCoupling transform or the MaskedPiecewiseRationalQuadraticAutoregressiveTransform, as described in the Neural Spline Flow paper (https://arxiv.org/abs/1906.04032).

Code is adapted from the uci.py example from https://github.com/bayesiains/nsf.

A coupling flow fixes half the components of y, and applies a transform to the remaining components, conditioned on the fixed components. This is a restricted form of an autoregressive transform, with a single split into fixed/transformed components.

The transform here is a neural spline flow, where the flow is parametrized by a residual neural network that depends on y_fixed and x. The residual network consists of a sequence of two-layer fully-connected blocks.

Parameters:

i (int) – index of transform in sequence
param_dim (int) – dimensionality of y
context_dim (int = None) – dimensionality of x
hidden_dim (int = 512) – number of hidden units per layer
num_transform_blocks (int = 2) – number of transform blocks comprising the transform
activation (str = ‘relu’) – activation function
dropout_probability (float = 0.0) – dropout probability for regularization
batch_norm (bool = False) – whether to use batch normalization
num_bins (int = 8) – number of bins for the spline
tail_bound (float = 1.0)
apply_unconditional_transform (bool = False) – whether to apply an unconditional transform to fixed components
base_transform_type (str = ‘rq-coupling’) – type of base transform, one of {rq-coupling, rq-autoregressive}

Returns:

The NSF transform.

Return type:

Transform
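The coupling idea described for create_base_transform (fix half the components, transform the rest conditioned on the fixed half and on x) can be sketched in pure Python. To keep the example short, an affine map is swapped in for the rational-quadratic spline, and a hand-written toy conditioner replaces the residual network; neither is what the NSF transform actually uses.

```python
import math


def conditioner(y_fixed, x):
    """Toy conditioner: maps fixed components and context to (log_scale, shift)."""
    s = 0.1 * sum(y_fixed) + 0.05 * sum(x)
    return math.tanh(s), s


def coupling_forward(y, x):
    """Keep the first half of y, transform the second half; return (z, log|det|)."""
    half = len(y) // 2
    y_fixed, y_rest = y[:half], y[half:]
    log_scale, shift = conditioner(y_fixed, x)
    z_rest = [yi * math.exp(log_scale) + shift for yi in y_rest]
    # The Jacobian is triangular, so log|det| is the sum of the log scales.
    return y_fixed + z_rest, log_scale * len(y_rest)


def coupling_inverse(z, x):
    """Invert coupling_forward; the fixed half makes the conditioner reproducible."""
    half = len(z) // 2
    z_fixed, z_rest = z[:half], z[half:]
    log_scale, shift = conditioner(z_fixed, x)
    return z_fixed + [(zi - shift) * math.exp(-log_scale) for zi in z_rest]


y = [0.3, -1.2, 0.7, 2.0]
x = [1.0, 0.5]
z, log_det = coupling_forward(y, x)
y_rec = coupling_inverse(z, x)
```

Because the fixed half passes through unchanged, the inverse can recompute the same conditioner output, which is what makes the transform cheap to invert. A single coupling layer never changes the fixed half, so such layers are usually interleaved with maps that mix or reorder the components.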

dingo.core.nn.nsf.create_linear_transform(param_dim: int)

Create the composite linear transform PLU.

Parameters:

param_dim (int) – dimension of the parameter space

Returns:

The linear transform PLU.

Return type:

nde.Transform
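A sketch of what a PLU-style linear transform computes, assuming that it consists of a permutation P followed by a linear map parametrized by a unit lower-triangular matrix L and an upper-triangular matrix U. The function name and the exact composition order are assumptions of this sketch, and a bias term is omitted.

```python
import math


def plu_transform(y, permutation, lower, upper):
    """Apply z = L @ U @ P(y) and return (z, log|det|).

    lower: unit lower-triangular matrix, upper: upper-triangular matrix,
    permutation: list of source indices, so P(y)[i] = y[permutation[i]].
    """
    n = len(y)
    permuted = [y[j] for j in permutation]
    u_out = [sum(upper[i][j] * permuted[j] for j in range(i, n)) for i in range(n)]
    z = [sum(lower[i][j] * u_out[j] for j in range(i + 1)) for i in range(n)]
    # P and unit-diagonal L have |det| = 1, so only diag(U) contributes.
    log_abs_det = sum(math.log(abs(upper[i][i])) for i in range(n))
    return z, log_abs_det


permutation = [2, 0, 1]
lower = [[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]
upper = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.5]]
z, log_abs_det = plu_transform([1.0, 2.0, 3.0], permutation, lower, upper)
```

The appeal of this factorization is that the log-determinant costs only a sum over the diagonal of U instead of a full determinant.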

dingo.core.nn.nsf.create_nsf_model(input_dim: int, context_dim: int, num_flow_steps: int, base_transform_kwargs: dict, embedding_net_builder: Callable | str | None = None, embedding_kwargs: dict | None = None)

Build NSF model. This models the posterior distribution p(y|x).

The model consists of

a base distribution (StandardNormal, dim(y))
a sequence of transforms, each conditioned on x

Parameters:

input_dim – int, dimensionality of y
context_dim – int, dimensionality of the (embedded) context
num_flow_steps – int, number of sequential transforms
base_transform_kwargs – dict, hyperparameters for transform steps
embedding_net_builder – Callable=None, build function for embedding network TODO
embedding_kwargs – dict=None, hyperparameters for embedding network

Returns:

The NSF (posterior model).

Return type:

Flow
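How a base distribution and a sequence of conditional transforms define log p(y|x) can be sketched with the change-of-variables formula. The sketch swaps in toy affine steps for the NSF transforms; all names are hypothetical.

```python
import math


def standard_normal_log_prob(u):
    """Log density of a standard normal with dim(u) independent components."""
    return sum(-0.5 * ui**2 - 0.5 * math.log(2 * math.pi) for ui in u)


def make_affine_step(scale, shift_weight):
    """Toy conditional transform y -> u with u = (y - shift_weight * x) / scale."""

    def step(y, x):
        u = [(yi - shift_weight * x) / scale for yi in y]
        return u, -len(y) * math.log(abs(scale))

    return step


def flow_log_prob(y, x, steps):
    """log p(y|x) = log N(u; 0, I) + sum of log|det| over all steps."""
    total_log_det = 0.0
    for step in steps:
        y, log_abs_det = step(y, x)
        total_log_det += log_abs_det
    return standard_normal_log_prob(y) + total_log_det


steps = [make_affine_step(2.0, 1.0), make_affine_step(0.5, 0.0)]
log_p = flow_log_prob([3.0], x=1.0, steps=steps)
```

In this example the two steps map y = 3 to u = 2 and their log-determinants cancel, so log_p equals the standard normal log density at 2.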

dingo.core.nn.nsf.create_nsf_with_rb_projection_embedding_net(posterior_kwargs: dict, embedding_kwargs: dict, initial_weights: dict | None = None)

Builds a neural spline flow with an embedding network that consists of a reduced basis projection followed by a residual network. Optionally initializes the embedding network weights.

Parameters:

posterior_kwargs (dict) – kwargs for neural spline flow
embedding_kwargs (dict) – kwargs for embedding network
initial_weights (dict) – Dictionary containing the initial weights for the SVD projection. This should have one key ‘V_rb_list’, with value a list of SVD V matrices (one for each detector).

Returns:

Neural spline flow model

Return type:

nn.Module

dingo.core.nn.nsf.create_nsf_wrapped(**kwargs)

Wraps the NSF model in a FlowWrapper. This is required for parallel training, and wraps the log_prob method as a forward method.

dingo.core.nn.nsf.create_transform(num_flow_steps: int, param_dim: int, context_dim: int, base_transform_kwargs: dict)

Build a sequence of NSF transforms, which maps parameters y into the base distribution u (noise). Transforms are conditioned on context data x.

Note that the forward map is f^{-1}(y, x).

Each step in the sequence consists of

A linear transform of y, which in particular permutes components
A NSF transform of y, conditioned on x.

There is one final linear transform at the end.

Parameters:

num_flow_steps – int, number of transforms in sequence
param_dim – int, dimensionality of parameter space (y)
context_dim – int, dimensionality of context (x)
base_transform_kwargs – dict, hyperparameters for NSF step

Returns:

The NSF transform sequence.

Return type:

Transform
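The layout of the sequence (a linear transform and an NSF transform per step, plus one final linear transform) can be written down as a list of labels. The helper and its labels are purely illustrative and do not construct any transforms.

```python
def transform_sequence_layout(num_flow_steps):
    """List the order of transforms for a given number of flow steps."""
    layout = []
    for i in range(num_flow_steps):
        layout.append(f"linear_{i}")  # linear transform, permutes components
        layout.append(f"nsf_{i}")     # NSF transform, conditioned on x
    layout.append("linear_final")     # one final linear transform
    return layout


layout = transform_sequence_layout(2)
```

A sequence with num_flow_steps steps therefore contains 2 * num_flow_steps + 1 transforms in total.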
