dingo.core.nn package

Submodules

dingo.core.nn.cfnets module

class dingo.core.nn.cfnets.ContinuousFlow(continuous_flow_net: Module, context_embedding_net: Module = Identity(), theta_embedding_net: Module = Identity(), context_with_glu: bool = False, theta_with_glu: bool = False)

Bases: Module

A continuous normalizing flow network. It defines a time-dependent vector field on the parameter space (score or flow), which optionally depends on additional context information.

v = v(f(t, theta), g(context))

This class combines the network v for the continuous flow itself with the embedding networks f and g for the parameters and context, respectively.

The parameters and context can optionally be provided as gated linear unit (GLU) context to the main network, rather than as the main input to the network. For a DenseResidualNet, this context is input repeatedly via GLUs, for each residual block.

Parameters:
  • continuous_flow_net (nn.Module) – Main network for the continuous flow.

  • context_embedding_net (nn.Module = torch.nn.Identity()) – Embedding network for the context information (e.g., observed data).

  • theta_embedding_net (nn.Module = torch.nn.Identity()) – Embedding network for the parameters.

  • context_with_glu (bool = False) – Whether to provide context as GLU or main input to the continuous_flow_net.

  • theta_with_glu (bool = False) – Whether to provide theta (and t) as GLU or main input to the continuous_flow_net.
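The composition v = v(f(t, theta), g(context)) can be sketched with plain PyTorch modules. This is a minimal illustration, not dingo's implementation: ToyContinuousFlow, the layer sizes, and the assumption that t has shape (batch_size,) are all hypothetical, and the GLU options are not shown.

```python
import torch
from torch import nn


class ToyContinuousFlow(nn.Module):
    """Illustrative stand-in for v = v(f(t, theta), g(context))."""

    def __init__(self, vector_field_net, theta_embedding_net=None,
                 context_embedding_net=None):
        super().__init__()
        self.vector_field_net = vector_field_net
        # Default to identity embeddings, mirroring the documented defaults.
        self.theta_embedding_net = (
            nn.Identity() if theta_embedding_net is None else theta_embedding_net
        )
        self.context_embedding_net = (
            nn.Identity() if context_embedding_net is None else context_embedding_net
        )

    def forward(self, t, theta, context):
        # f(t, theta): stack time and parameters, then embed them.
        t_theta = torch.cat((t.unsqueeze(1), theta), dim=1)
        f = self.theta_embedding_net(t_theta)
        # g(context): embed the conditioning data.
        g = self.context_embedding_net(context)
        # v(f, g): both embeddings enter as the main input (no GLU here).
        return self.vector_field_net(torch.cat((f, g), dim=1))


theta_dim, context_dim, batch_size = 3, 5, 8
net = nn.Sequential(
    nn.Linear(1 + theta_dim + context_dim, 32),
    nn.ELU(),
    nn.Linear(32, theta_dim),
)
flow = ToyContinuousFlow(net)

t = torch.rand(batch_size)
theta = torch.randn(batch_size, theta_dim)
context = torch.randn(batch_size, context_dim)
v = flow(t, theta, context)  # one vector of size theta_dim per sample
```

The output has the same shape as theta, since the vector field lives on the parameter space.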

forward(t, theta, *context)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

property use_cache

class dingo.core.nn.cfnets.PositionalEncoding(nr_frequencies, encode_all=True, base_freq=6.283185307179586)

Bases: Module

Implements positional encoding as commonly used in transformer architectures.

Positional encoding introduces a way to inject information about the order of the input data (e.g., sequence positions) into a neural network that otherwise lacks a sense of position due to its permutation-invariant nature. This class computes sinusoidal encodings based on the position of each element in the input and concatenates them with the original input features.

frequencies

A tensor containing the frequencies used to calculate the sinusoidal components. The frequencies are powers of 2, scaled by the base frequency.

Type:

torch.Tensor

encode_all

Determines whether the positional encoding is applied to all features of the input or only the first feature (e.g., time component).

Type:

bool

base_freq

The base frequency used to scale the sinusoidal components, defaulting to 2 * pi.

Type:

float

Parameters:
  • nr_frequencies (int) – The number of sinusoidal frequencies to compute. This determines the dimensionality of the positional encoding for each input feature.

  • encode_all (bool, optional (default=True)) – If True, the positional encoding is computed for all features in the input. Otherwise, it is computed only for the first feature (e.g., the time dimension).

  • base_freq (float, optional (default=2 * np.pi)) – The base frequency used for sinusoidal encoding.

forward(t_theta)

Computes the positional encoding for the input tensor t_theta and concatenates it with the original input features.
  • If encode_all is True, the positional encoding is computed for all features.

  • If encode_all is False, the positional encoding is applied only to the first feature, such as time, while other features remain unchanged.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(t_theta)

Computes and concatenates positional encodings with the input tensor.

Parameters:

t_theta (torch.Tensor) – Input tensor of shape (batch_size, input_dim), where input_dim is the dimensionality of the input features.

Returns:

A tensor containing the input features concatenated with the positional encodings. The output shape is:
  • (batch_size, input_dim + 2 * nr_frequencies * input_dim) if encode_all is True, since every input feature is encoded.

  • (batch_size, input_dim + 2 * nr_frequencies) if encode_all is False, since positional encodings are computed only for the first input feature.

Return type:

torch.Tensor
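The encoding can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions: the function name positional_encoding is hypothetical, each encoded feature is assumed to contribute nr_frequencies cosine and nr_frequencies sine terms at frequencies base_freq * 2**k, and the ordering of the concatenated terms is a choice made here for illustration.

```python
import numpy as np


def positional_encoding(t_theta, nr_frequencies, encode_all=True,
                        base_freq=2 * np.pi):
    """Append cos/sin features at frequencies base_freq * 2**k."""
    frequencies = base_freq * 2.0 ** np.arange(nr_frequencies)
    # Encode every feature, or only the first one (e.g. time).
    x = t_theta if encode_all else t_theta[:, :1]
    # Shape (batch, n_encoded_features, nr_frequencies), flattened per sample.
    phases = (x[:, :, None] * frequencies).reshape(len(t_theta), -1)
    return np.concatenate((t_theta, np.cos(phases), np.sin(phases)), axis=1)


t_theta = np.random.rand(4, 3)  # batch of 4 with features (t, theta_1, theta_2)
enc_all = positional_encoding(t_theta, nr_frequencies=5)
enc_first = positional_encoding(t_theta, nr_frequencies=5, encode_all=False)
print(enc_all.shape, enc_first.shape)  # (4, 33) (4, 13)
```

With three input features and five frequencies, encoding all features adds 2 * 5 * 3 = 30 columns, while encoding only the first adds 2 * 5 = 10.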

dingo.core.nn.cfnets.create_cf(posterior_kwargs: dict, embedding_kwargs: dict | None = None, initial_weights: dict | None = None)

Build a continuous flow based on settings dictionaries.

Parameters:
  • posterior_kwargs (dict) – Settings for the flow. This includes the settings for the parameter embedding.

  • embedding_kwargs (dict) – Settings for the context embedding network.

  • initial_weights (dict) – Initial weights for the embedding network (of SVD projection type).

Returns:

Neural network for the continuous flow.

Return type:

nn.Module

dingo.core.nn.cfnets.get_dim_positional_embedding(encoding: dict, input_dim: int)

dingo.core.nn.cfnets.get_theta_embedding_net(embedding_kwargs: dict, input_dim)

dingo.core.nn.enets module

Implementation of embedding networks.

class dingo.core.nn.enets.DenseResidualNet(input_dim: int, output_dim: int, hidden_dims: ~typing.Tuple, activation: ~typing.Callable = <function elu>, dropout: float = 0.0, batch_norm: bool = True, context_features: int | None = None)

Bases: Module

An nn.Module consisting of a sequence of dense residual blocks. This is used to embed high-dimensional input into a compressed output. Linear resizing layers are used for resizing the input and output to match the first and last hidden dimension, respectively.

Module specs

input dimension: (batch_size, input_dim)
output dimension: (batch_size, output_dim)

Parameters:
  • input_dim (int) – dimension of the input to this module

  • output_dim (int) – output dimension of this module

  • hidden_dims (tuple) – tuple with dimensions of hidden layers of this module

  • activation (callable) – activation function used in residual blocks

  • dropout (float) – dropout probability for residual blocks, used for regularization

  • batch_norm (bool) – flag that specifies whether to use batch normalization

  • context_features (int) – Number of additional context features, which are provided to the residual blocks via gated linear units. If None, no additional context is expected.
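The structure described above (input resize, residual blocks with optional GLU context, output resize) can be sketched as follows. This is a simplified illustration rather than dingo's implementation: the Toy* class names are hypothetical, dropout and batch normalization are omitted, and the way context enters each block (gating the block output with a sigmoid of a linear projection) is an assumption.

```python
import torch
from torch import nn
import torch.nn.functional as F


class ToyResidualBlock(nn.Module):
    """Two-layer residual block with optional GLU-gated context."""

    def __init__(self, features, context_features=None):
        super().__init__()
        self.linear_1 = nn.Linear(features, features)
        self.linear_2 = nn.Linear(features, features)
        self.context_layer = None
        if context_features is not None:
            self.context_layer = nn.Linear(context_features, features)

    def forward(self, x, context=None):
        h = self.linear_2(F.elu(self.linear_1(F.elu(x))))
        if self.context_layer is not None:
            # F.glu splits its input in half: h * sigmoid(projected context).
            h = F.glu(torch.cat((h, self.context_layer(context)), dim=1), dim=1)
        return x + h


class ToyDenseResidualNet(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_dims, context_features=None):
        super().__init__()
        self.initial_layer = nn.Linear(input_dim, hidden_dims[0])
        self.blocks = nn.ModuleList(
            [ToyResidualBlock(d, context_features) for d in hidden_dims]
        )
        # Linear resize layers between consecutive blocks of different width.
        self.resize_layers = nn.ModuleList(
            [nn.Linear(a, b) if a != b else nn.Identity()
             for a, b in zip(hidden_dims[:-1], hidden_dims[1:])]
        )
        self.final_layer = nn.Linear(hidden_dims[-1], output_dim)

    def forward(self, x, context=None):
        x = self.initial_layer(x)
        for i, block in enumerate(self.blocks):
            x = block(x, context)  # the same context is fed to every block
            if i < len(self.resize_layers):
                x = self.resize_layers[i](x)
        return self.final_layer(x)


net = ToyDenseResidualNet(100, 16, hidden_dims=(64, 64, 32), context_features=2)
x = torch.randn(8, 100)
context = torch.randn(8, 2)
out = net(x, context)  # shape (batch_size, output_dim)
```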

forward(x, context=None)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class dingo.core.nn.enets.LinearProjectionRB(input_dims: List[int], n_rb: int, V_rb_list: Tuple | None)

Bases: Module

A compression layer that reduces the input dimensionality via projection onto a reduced basis. The input data is of shape (batch_size, num_blocks, num_channels, num_bins). Each of the num_blocks blocks (for GW use case: block=detector) is treated independently.

A single block consists of 1D data with num_bins bins (e.g. GW use case: num_bins=number of frequency bins). It has num_channels>=2 different channels, channel 0 and 1 store the real and imaginary part of the signal. Channels with index >=2 are used for auxiliary signals (such as PSD for GW use case).

This layer compresses the complex signal in channels 0 and 1 to n_rb reduced-basis (rb) components. This is achieved by initializing the weights of this layer with the rb matrix V, such that the (2*n_rb) dimensional output of each block is the concatenation of the real and imaginary part of the reduced basis projection of the complex signal in channel 0 and 1. The projection of the auxiliary channels with index >=2 onto these components is initialized with 0.

Module specs

input dimension: (batch_size, num_blocks, num_channels, num_bins)
output dimension: (batch_size, 2 * n_rb * num_blocks)

Parameters:
  • input_dims (list) – dimensions of input batch, omitting batch dimension: input_dims = [num_blocks, num_channels, num_bins]

  • n_rb (int) – number of reduced basis elements used for projection; the output dimension of the layer is 2 * n_rb * num_blocks

  • V_rb_list (tuple of np.arrays, or None) – tuple with V matrices of the reduced basis SVD projection, using the SVD convention U @ s @ V^h; if None, the layer is not initialized with the reduced basis projection, which is useful when loading a saved model
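The initialization described above can be checked numerically for a single block. The sketch below builds a real-valued weight matrix from a complex basis V and verifies that it reproduces the complex projection. It is a minimal sketch under stated assumptions: the random stand-in data, the weight layout, and the choice to project with V rather than its conjugate are illustrative and may differ from dingo's internal conventions.

```python
import numpy as np

rng = np.random.default_rng(0)
num_bins, n_rb = 64, 8

# Stand-in training data; V follows the SVD convention U @ s @ V^h.
waveforms = rng.normal(size=(200, num_bins)) + 1j * rng.normal(size=(200, num_bins))
_, _, Vh = np.linalg.svd(waveforms, full_matrices=False)
V = Vh.conj().T[:, :n_rb]  # keep the first n_rb basis elements

# One block with channels (real part, imaginary part, auxiliary).
x = rng.normal(size=(3, num_bins))
signal = x[0] + 1j * x[1]
coefficients = signal @ V  # complex reduced-basis coefficients, shape (n_rb,)

# Equivalent real-valued linear layer acting on the flattened block.
weight = np.zeros((2 * n_rb, 3 * num_bins))
weight[:n_rb, :num_bins] = V.real.T                # real output from real input
weight[:n_rb, num_bins:2 * num_bins] = -V.imag.T   # real output from imag input
weight[n_rb:, :num_bins] = V.imag.T                # imag output from real input
weight[n_rb:, num_bins:2 * num_bins] = V.real.T    # imag output from imag input
# Columns for the auxiliary channel stay zero at initialization.

output = weight @ x.reshape(-1)
expected = np.concatenate((coefficients.real, coefficients.imag))
assert np.allclose(output, expected)
```

Because the auxiliary columns start at zero, the layer initially ignores channel 2 and only learns to use it during training.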

forward(x, **_)

RB projection. Additional kwargs (like context) are ignored.

init_layers(V_rb_list)

Loop through layers and initialize them individually with the corresponding rb projection. V_rb_list is a list that contains the rb matrix V for each block. Each matrix V in V_rb_list is represented with a numpy array of shape (self.num_bins, num_el), where num_el >= self.n_rb.

property input_dim

property output_dim

test_dimensions(V_rb_list)

Test if input dimensions to this layer are consistent with each other, and the reduced basis matrices V.

class dingo.core.nn.enets.ModuleMerger(module_list: Tuple)

Bases: Module

This is a wrapper used to process multiple different kinds of context information collected in x = (x_0, x_1, …). For each kind of context information x_i, an individual embedding network is provided in enets = (enet_0, enet_1, …). The embedded output of the forward method is the concatenation of the individual embeddings enet_i(x_i).

In the GW use case, this wrapper can be used to embed the high-dimensional signal input into a lower dimensional feature vector with a large embedding network, while applying an identity embedding to the time shifts.

Module specs

input dimension: (batch_size, …), (batch_size, …), …
output dimension: (batch_size, ?)

Parameters:
  • module_list (tuple) – nn.Modules for embedding networks, use torch.nn.Identity for identity mappings
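The merging behaviour can be sketched in a few lines of PyTorch. ToyModuleMerger is a hypothetical stand-in written for illustration, and concatenation along dimension 1 is an assumption based on the (batch_size, …) shapes given above.

```python
import torch
from torch import nn


class ToyModuleMerger(nn.Module):
    """Apply one embedding network per input and concatenate the results."""

    def __init__(self, module_list):
        super().__init__()
        self.enets = nn.ModuleList(module_list)

    def forward(self, *x):
        if len(x) != len(self.enets):
            raise ValueError("Need exactly one input per embedding network.")
        return torch.cat([enet(x_i) for enet, x_i in zip(self.enets, x)], dim=1)


# A large input is compressed, while a small one passes through unchanged.
merger = ToyModuleMerger((nn.Linear(100, 16), nn.Identity()))
signal = torch.randn(8, 100)
time_shifts = torch.randn(8, 2)
embedded = merger(signal, time_shifts)  # shape (8, 16 + 2)
```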

forward(*x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

dingo.core.nn.enets.create_enet_with_projection_layer_and_dense_resnet(input_dims: List[int], V_rb_list: Tuple | None, output_dim: int, hidden_dims: Tuple, svd: dict, activation: str = 'elu', dropout: float = 0.0, batch_norm: bool = True, added_context: bool = False)

Builder function for 2-stage embedding network for 1D data with multiple blocks and channels. Module 1 is a linear layer initialized as the projection of the complex signal onto reduced basis components via the LinearProjectionRB, where the blocks are kept separate. See docstring of LinearProjectionRB for details. Module 2 is a sequence of dense residual layers, that is used to further reduce the dimensionality.

The projection requires the complex signal to be represented via the real part in channel 0 and the imaginary part in channel 1. Auxiliary signals may be contained in channels with indices >= 2. In the GW use case, a block corresponds to a detector and channel 2 is used for ASD information.

If added_context = True, the 2-stage embedding network described above is merged with an identity mapping via ModuleMerger. Then, the expected input is not x with x.shape = (batch_size, num_blocks, num_channels, num_bins), but rather the tuple *(x, z), where z is additional context information. The output of the full module is then the concatenation of enet(x) and z. In GW use case, this is used to concatenate the applied time shifts z to the embedded feature vector of the strain data enet(x).

Module specs

For added_context == False:

input dimension: (batch_size, num_blocks, num_channels, num_bins)
output dimension: (batch_size, output_dim)

For added_context == True:

input dimension: (batch_size, num_blocks, num_channels, num_bins), (batch_size, N)
output dimension: (batch_size, output_dim + N)

Parameters:
  • input_dims (list) – dimensions of input batch, omitting batch dimension: input_dims = (num_blocks, num_channels, num_bins)

  • n_rb (int) – number of reduced basis elements used for projection; the output dimension of the layer is 2 * n_rb * num_blocks

  • V_rb_list (tuple of np.arrays, or None) – tuple with V matrices of the reduced basis SVD projection, using the SVD convention U @ s @ V^h; if None, the layer is not initialized with the reduced basis projection, which is useful when loading a saved model

  • output_dim (int) – output dimension of the full module

  • hidden_dims (tuple) – tuple with dimensions of hidden layers of module 2

  • activation (str) – string that specifies the activation function used in residual blocks

  • dropout (float) – dropout probability for residual blocks, used for regularization

  • batch_norm (bool) – flag that specifies whether to use batch normalization

  • added_context (bool) – if set to True, additional context z is concatenated to the embedded feature vector enet(x); note that in this case, the expected input is a tuple with 2 elements, input = (x, z), rather than just the tensor x.

Return type:

nn.Module

dingo.core.nn.nsf module

Implementation of the neural spline flow (NSF). Most of this code is adapted from the uci.py example from https://github.com/bayesiains/nsf.

class dingo.core.nn.nsf.FlowWrapper(flow: Flow, embedding_net: Module | None = None)

Bases: Module

This class wraps the neural spline flow. It is required for two reasons: (i) some embedding networks take tuples as input, which is not supported by the nflows package; (ii) parallelization across multiple GPUs requires a forward method, but the relevant flow method for training is log_prob.

Parameters:
  • flow (flows.base.Flow) – the neural spline flow to wrap

  • embedding_net (nn.Module | None = None) – optional embedding network
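The wrapping pattern can be sketched as follows. Both classes are hypothetical stand-ins: ToyConditionalGaussian replaces the actual flow with a unit-variance Gaussian so the example runs without nflows, and ToyFlowWrapper only illustrates the idea of exposing log_prob through forward. How dingo's wrapper passes the embedded context to the flow is not shown in this reference, so that part is an assumption.

```python
import math

import torch
from torch import nn


class ToyConditionalGaussian(nn.Module):
    """Stand-in for a flow: unit-variance Gaussian with context-dependent mean."""

    def __init__(self, param_dim, context_dim):
        super().__init__()
        self.mean_net = nn.Linear(context_dim, param_dim)

    def log_prob(self, y, context):
        residual = y - self.mean_net(context)
        norm = 0.5 * y.shape[1] * math.log(2 * math.pi)
        return -0.5 * (residual ** 2).sum(dim=1) - norm


class ToyFlowWrapper(nn.Module):
    def __init__(self, flow, embedding_net=None):
        super().__init__()
        self.flow = flow
        self.embedding_net = embedding_net

    def log_prob(self, y, *x):
        # The embedding network may take several inputs; the flow gets one tensor.
        context = self.embedding_net(*x) if self.embedding_net is not None else x[0]
        return self.flow.log_prob(y, context)

    def forward(self, y, *x):
        # Exposing log_prob as forward lets nn.DataParallel dispatch the call.
        return self.log_prob(y, *x)


wrapper = ToyFlowWrapper(ToyConditionalGaussian(3, 16), embedding_net=nn.Linear(100, 16))
y = torch.randn(8, 3)
x = torch.randn(8, 100)
loss = -wrapper(y, x).mean()  # negative log-likelihood over the batch
```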

forward(y, *x)

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

log_prob(y, *x)

sample(*x, num_samples=1)

sample_and_log_prob(*x, num_samples=1)

dingo.core.nn.nsf.create_base_transform(i: int, param_dim: int, context_dim: int | None = None, hidden_dim: int = 512, num_transform_blocks: int = 2, activation: str = 'relu', dropout_probability: float = 0.0, batch_norm: bool = False, num_bins: int = 8, tail_bound: float = 1.0, apply_unconditional_transform: bool = False, base_transform_type: str = 'rq-coupling')

Build a base NSF transform of y, conditioned on x.

This uses the PiecewiseRationalQuadraticCoupling transform or the MaskedPiecewiseRationalQuadraticAutoregressiveTransform, as described in the Neural Spline Flow paper (https://arxiv.org/abs/1906.04032).

Code is adapted from the uci.py example from https://github.com/bayesiains/nsf.

A coupling flow fixes half the components of y, and applies a transform to the remaining components, conditioned on the fixed components. This is a restricted form of an autoregressive transform, with a single split into fixed/transformed components.

The transform here is a neural spline flow, where the flow is parametrized by a residual neural network that depends on y_fixed and x. The residual network consists of a sequence of two-layer fully-connected blocks.

Parameters:
  • i (int) – index of transform in sequence

  • param_dim (int) – dimensionality of y

  • context_dim (int = None) – dimensionality of x

  • hidden_dim (int = 512) – number of hidden units per layer

  • num_transform_blocks (int = 2) – number of transform blocks comprising the transform

  • activation (str = ‘relu’) – activation function

  • dropout_probability (float = 0.0) – dropout probability for regularization

  • batch_norm (bool = False) – whether to use batch normalization

  • num_bins (int = 8) – number of bins for the spline

  • tail_bound (float = 1.0)

  • apply_unconditional_transform (bool = False) – whether to apply an unconditional transform to fixed components

  • base_transform_type (str = ‘rq-coupling’) – type of base transform, one of {rq-coupling, rq-autoregressive}

Returns:

The NSF transform.

Return type:

Transform
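The coupling idea (fix half of y, transform the rest conditioned on the fixed half and on x) can be illustrated in NumPy. Note the substitution: this sketch uses a simple affine coupling in place of the rational-quadratic spline, and a fixed random linear map in place of the residual network, so it shows the structure of the transform rather than the NSF itself. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
param_dim, context_dim, batch_size = 4, 3, 5
n_fixed = param_dim // 2

# Toy conditioner: linear map from (y_fixed, x) to shift and log-scale.
W = rng.normal(size=(n_fixed + context_dim, 2 * (param_dim - n_fixed)))


def conditioner(y_fixed, x):
    out = np.concatenate((y_fixed, x), axis=1) @ W
    shift, log_scale = np.split(out, 2, axis=1)
    return shift, np.tanh(log_scale)  # bounded log-scale for stability


def coupling_forward(y, x):
    y_fixed, y_rest = y[:, :n_fixed], y[:, n_fixed:]
    shift, log_scale = conditioner(y_fixed, x)
    u_rest = y_rest * np.exp(log_scale) + shift
    log_det = log_scale.sum(axis=1)  # log |det Jacobian| per sample
    return np.concatenate((y_fixed, u_rest), axis=1), log_det


def coupling_inverse(u, x):
    # The fixed components pass through unchanged, so the conditioner
    # can be re-evaluated exactly and the transform inverted in closed form.
    y_fixed, u_rest = u[:, :n_fixed], u[:, n_fixed:]
    shift, log_scale = conditioner(y_fixed, x)
    y_rest = (u_rest - shift) * np.exp(-log_scale)
    return np.concatenate((y_fixed, y_rest), axis=1)


y = rng.normal(size=(batch_size, param_dim))
x = rng.normal(size=(batch_size, context_dim))
u, log_det = coupling_forward(y, x)
assert np.allclose(coupling_inverse(u, x), y)
```

Because only half of the components change per step, several such steps with permutations in between are needed for every component to be transformed.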

dingo.core.nn.nsf.create_linear_transform(param_dim: int)

Create the composite linear transform PLU.

Parameters:

param_dim (int) – dimension of the parameter space

Returns:

The linear transform PLU.

Return type:

nde.Transform

dingo.core.nn.nsf.create_nsf_model(input_dim: int, context_dim: int, num_flow_steps: int, base_transform_kwargs: dict, embedding_net_builder: Callable | str | None = None, embedding_kwargs: dict | None = None)

Build NSF model. This models the posterior distribution p(y|x).

The model consists of
  • a base distribution (StandardNormal, dim(y))

  • a sequence of transforms, each conditioned on x

Parameters:
  • input_dim (int) – dimensionality of y

  • context_dim (int) – dimensionality of the (embedded) context

  • num_flow_steps (int) – number of sequential transforms

  • base_transform_kwargs (dict) – hyperparameters for transform steps

  • embedding_net_builder (Callable | str | None = None) – build function for the embedding network

  • embedding_kwargs (dict | None = None) – hyperparameters for the embedding network

Returns:

The NSF (posterior model).

Return type:

Flow

dingo.core.nn.nsf.create_nsf_with_rb_projection_embedding_net(posterior_kwargs: dict, embedding_kwargs: dict, initial_weights: dict | None = None)

Builds a neural spline flow with an embedding network that consists of a reduced basis projection followed by a residual network. Optionally initializes the embedding network weights.

Parameters:
  • posterior_kwargs (dict) – kwargs for neural spline flow

  • embedding_kwargs (dict) – kwargs for embedding network

  • initial_weights (dict) – Dictionary containing the initial weights for the SVD projection. This should have one key ‘V_rb_list’, with value a list of SVD V matrices (one for each detector).

Returns:

Neural spline flow model

Return type:

nn.Module

dingo.core.nn.nsf.create_nsf_wrapped(**kwargs)

Wraps the NSF model in a FlowWrapper. This is required for parallel training, and wraps the log_prob method as a forward method.

dingo.core.nn.nsf.create_transform(num_flow_steps: int, param_dim: int, context_dim: int, base_transform_kwargs: dict)

Build a sequence of NSF transforms, which maps parameters y into the base distribution u (noise). Transforms are conditioned on context data x.

Note that the forward map is f^{-1}(y, x).

Each step in the sequence consists of
  • A linear transform of y, which in particular permutes components

  • An NSF transform of y, conditioned on x.

There is one final linear transform at the end.

Parameters:
  • num_flow_steps (int) – number of transforms in sequence

  • param_dim (int) – dimensionality of parameter space (y)

  • context_dim (int) – dimensionality of context (x)

  • base_transform_kwargs (dict) – hyperparameters for NSF step

Returns:

The NSF transform sequence.

Return type:

Transform
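The ordering of layers described above can be made explicit with a small pure-Python sketch. The function name and the string labels are hypothetical placeholders for the actual transform objects. The point is only that num_flow_steps steps yield 2 * num_flow_steps + 1 layers.

```python
def transform_sequence(num_flow_steps):
    """Return labels for the layers in the order they are composed."""
    layers = []
    for i in range(num_flow_steps):
        layers.append(f"linear_{i}")  # linear transform (includes a permutation)
        layers.append(f"nsf_{i}")     # spline transform conditioned on x
    layers.append(f"linear_{num_flow_steps}")  # final linear transform
    return layers


print(transform_sequence(2))
# ['linear_0', 'nsf_0', 'linear_1', 'nsf_1', 'linear_2']
```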

Module contents