Dingo
Dingo (Deep Inference for Gravitational-wave Observations) is a Python program for analyzing gravitational wave data using neural posterior estimation. It dramatically speeds up inference of astrophysical source parameters from data measured at gravitational-wave observatories. Dingo aims to enable the routine use of the most advanced theoretical models in analysing data, to make rapid predictions for multi-messenger counterparts, and to do so in the context of sensitive detectors with high event rates.
The basic approach of Dingo is to train a neural network to represent the Bayesian posterior, conditioned on data. This enables amortized inference: when new data are observed, they can be plugged in and results obtained in a small amount of time. Tasks handled by Dingo include
training normalizing flows to estimate the posterior density;
performing inference on real or simulated data; and
verifying and correcting model results using importance sampling.
As training a network from scratch can be expensive, we intend to also distribute trained networks that can be used directly for inference. These can be used with dingo_pipe to automate analysis of gravitational wave events.
Installation
Standard
Pip
To install using pip, run the following within a suitable virtual environment:
pip install dingo-gw
This will install Dingo as well as all of its requirements, which are listed in pyproject.toml.
Conda
Dingo is also available from the conda-forge repository. To install using conda, first activate a conda environment, and then run
conda install -c conda-forge dingo-gw
Development
If you would like to make changes to Dingo, or to contribute to its development, you should install Dingo from source. To do so, first clone this repository:
git clone git@github.com:dingo-gw/dingo.git
Next create a virtual environment for Dingo, e.g.,
python3 -m venv dingo-venv
source dingo-venv/bin/activate
This creates and activates a venv for Dingo called dingo-venv. In this virtual environment, install Dingo:
cd dingo
pip install -e ."[dev]"
This command installs an editable version of Dingo, meaning that any changes to the Dingo source are reflected immediately in the installation. The inclusion of dev installs extra packages needed for development (code formatting, compiling documentation, etc.).
Documentation
To build the documentation, first generate the API documentation using autodoc:
cd docs
sphinx-apidoc -o source ../dingo
This will create dingo.*.rst and modules.rst files in source/. These correspond to the various modules and are constructed from docstrings.
To finally compile the documentation, run
make html
This creates a directory build/ containing HTML documentation. The main index is at build/html/index.html.
To use the autodoc feature, which works for PyCharm and numpy docstrings, insert in a .rst file, e.g.,
.. autofunction:: dingo.core.utils.trainutils.write_history
This will render as
- dingo.core.utils.trainutils.write_history(log_dir, epoch, train_loss, test_loss, learning_rates, aux=None, filename='history.txt')
Writes losses and learning rate history to csv file.
- Parameters:
log_dir (str) – directory containing the history file
epoch (int) – epoch
train_loss (float) – train_loss of epoch
test_loss (float) – test_loss of epoch
learning_rates (list) – list of learning rates in epoch
aux (list = []) – list of auxiliary information to be logged
filename (str = 'history.txt') – name of history file
Cleanup
To remove generated docs, execute
make clean
rm source/dingo.* source/modules.rst
Overview
Dingo performs gravitational-wave (GW) parameter estimation using neural posterior estimation. The basic idea is to train a neural network (a normalizing flow) to represent the Bayesian posterior distribution \(p(\theta|d)\) for GW parameters \(\theta\) given observed data \(d\). Training can take some time (typically, a week for a production-level model) but once trained, inference is very fast (just a few seconds).
Basic workflow
The basic workflow for using Dingo is as follows:
Prepare training data. This consists of pairs of intrinsic parameters and waveform polarizations, as well as noise PSDs. Training parameters are drawn from the prior distribution, and waveforms are simulated using a waveform model.
Train a model. Build a neural network and simulate data sets (noisy waveforms in detectors). Train the model to infer parameters based on the data.
Perform inference on new data using the trained model.
In many cases, a user may have downloaded a pre-trained model. If so, there is no need to carry out the first two steps, and one may instead skip to step 3.
Command-line interface
In most cases, we expect Dingo to be called from the command line. Dingo commands begin with the prefix dingo_. Many tasks have a large number of configuration options, so rather than specifying all settings as arguments, Dingo commands take a single YAML or INI file containing all settings. As described in the quickstart tutorial, it is best to begin with the settings files provided in the examples/ folder, modifying them as necessary.
Summary of commands
Here we provide a list of key user commands along with brief descriptions. The commands for carrying out the main tasks above are
| Command | Description |
|---|---|
| dingo_generate_dataset | Generate a training dataset of waveform polarizations. |
| dingo_generate_asd_dataset | Generate a training dataset of detector noise ASDs. |
| dingo_train | Build and train a neural network. |
| dingo_pipe | Perform inference on data (real or simulated), starting from an INI file. |
Building a training dataset and training a model can be very expensive tasks. We therefore expect these to frequently be run on clusters, and for this reason we provide HTCondor versions of these commands (note that dingo_pipe is already HTCondor-compatible):
| Command | Description |
|---|---|
| dingo_generate_dataset_dag | HTCondor version of dingo_generate_dataset. |
| dingo_train_condor | HTCondor version of dingo_train. |
Finally, there are several utility commands that are useful for working with Dingo-produced files:
| Command | Description |
|---|---|
| dingo_ls | Inspect a file produced by Dingo and print a summary. |
|  | Modify the training plan of a model checkpoint. |
| dingo_pt_to_hdf5 | Convert a trained Dingo model from a PyTorch pickle .pt file to HDF5. |
Hint
The dingo_ls command is very useful for inspecting Dingo files. It will print all settings that went into producing the file, as well as some derived quantities.
File types
As noted above, most Dingo commands take a YAML file to specify configuration options (except for dingo_pipe, which uses an INI file, as is standard for LVK parameter estimation). When run, these commands generate data, which is usually stored in HDF5 files. One exception is training a neural network, which saves the network weights using the PyTorch .pt format. However, primarily for LVK use, dingo_pt_to_hdf5 can convert the weights of a trained model to an HDF5 file.
Important
In all cases, Dingo will save the YAML file settings within the final output file. This is needed for downstream tasks and for maintaining reproducibility.
GNPE
A slightly more complicated workflow occurs when using GNPE. GNPE is an algorithm that combines physical symmetries with Gibbs sampling to significantly improve results. When using GNPE, however, it is necessary to train two networks: one main (conditional) network that is repeatedly sampled during Gibbs sampling, and one smaller network used to initialize the Gibbs sampler. At inference time, dingo_pipe must be pointed to both of these networks. See the section on GNPE usage for further details.
Quickstart tutorial
To learn to use Dingo, we recommend starting with the examples provided in the examples/ folder. The YAML files contained in this directory (and subdirectories) contain configuration settings for the various Dingo tasks (constructing training data, training networks, and performing inference). These files should be provided as input to the command-line scripts, which then run Dingo and save output files. These output files contain as metadata the settings in the YAML files, and they may usually be inspected by running dingo_ls.
After configuring the settings files, the scripts may be used as follows, assuming the Dingo venv is active.
Generate training data
Waveforms
To generate a waveform dataset for training, execute
dingo_generate_dataset --settings_file waveform_dataset_settings.yaml --num_processes N --out_file waveform_dataset.hdf5
where N is the number of processes you would like to use to generate the waveforms in parallel. This saves the dataset of waveform polarizations in the file waveform_dataset.hdf5 (typically compressed using SVD, depending on configuration).
One can use dingo_generate_dataset_dag to set up a condor DAG for generating waveforms on a cluster. This is typically useful for slower waveform models.
Noise ASDs
Training also requires a dataset of noise ASDs, which are sampled randomly for each training sample. To generate this dataset based on noise observed during a run, execute
dingo_generate_ASD_dataset --data_dir data_dir --settings_file asd_dataset_settings.yaml
This will download data from GWOSC and create a /tmp directory, in which the estimated PSDs are stored. Subsequently, these are collected together into a final .hdf5 ASD dataset.
If no settings_file is passed, the script will attempt to use the default one, data_dir/asd_dataset_settings.yaml.
Training
With a waveform dataset and ASD dataset(s), one can train a neural network. Configure the train_settings.yaml file to point to these datasets, and run
dingo_train --settings_file train_settings.yaml --train_dir train_dir
This will configure the network, train it, and store checkpoints, a record of the history, and the final network in the directory train_dir. Alternatively, to resume training from a checkpoint file, run
dingo_train --checkpoint model.pt --train_dir train_dir
If using CUDA on a machine with several GPUs, be sure to first select the desired GPU number using the CUDA_VISIBLE_DEVICES environment variable. If using a cluster, Dingo can be trained using dingo_train_condor.
Example training files can be found under examples/training. The files train_settings_toy.yaml and train_settings_production.yaml train a flow to estimate the full posterior of the event, conditioned on the time of coalescence in the detectors. The "toy" label indicates that these settings should NOT be used for production, but rather to get a feel for the Dingo pipeline. The production settings contain tested settings; note that depending on the waveform model and event, these may occasionally need to be tuned. The files train_settings_init_toy.yaml and train_settings_init_production.yaml train flows to estimate the time of coalescence in the individual detectors. These two networks are needed to use GNPE, which is the preferred and most tested way of using Dingo.
Alternatively, train_settings_no_gnpe_toy.yaml and train_settings_no_gnpe_production.yaml contain settings to train a network without the GNPE step. Note the lack of a data/gnpe_time_shifts option. While this is not recommended for production, it is still pedagogically useful and is good for prototyping new ideas or doing a less expensive training.
Inference
Once a Dingo model is trained, inference for real events can be performed using dingo_pipe. There are three main inference steps: downloading the data, running Dingo on the data, and finally running importance sampling. The basic idea is to create a .ini file which contains the file paths of the Dingo networks trained above and the segment of data to analyze. An example .ini file can be found under examples/pipe/GW150914.ini.
To do inference, cd into the directory with the .ini file and run
dingo_pipe GW150914.ini
Toy Example
The goal of the following tutorial is to take a user from start to finish analyzing GW150914 using Dingo.
Caution
This is only a toy example, useful for testing on a local machine. It is NOT meant to be used for production gravitational wave analyses.
There are 4 main steps:
Generate the waveform dataset
Generate the ASD dataset
Train the network
Do inference
In this tutorial, as well as in the NPE model and GNPE model tutorials, the following file structure will be employed:
toy_npe_model/
# config files
waveform_dataset_settings.yaml
asd_dataset_settings.yaml
train_settings.yaml
GW150914.ini
training_data/
waveform_dataset.hdf5
asd_dataset/ # Contains the asd_dataset.hdf5 and also temp files for asd generation
training/
model_050.pt
model_stage_0.pt
model_latest.pt
history.txt
# etc...
outdir_GW150914/
# dingo_pipe output
The config files, which are the only files that need to be edited, are contained in the top-level directory. In the next few sections these config files will be explained. To download sample config files, please visit https://github.com/dingo-gw/dingo/tree/main/examples. In this tutorial the toy_npe_model folder will be used.
Step 1 Generating a waveform dataset
After downloading the files for the tutorial first run
cd toy_npe_model/
mkdir training_data
mkdir training
to set up the file structure. Then run
dingo_generate_dataset --settings waveform_dataset_settings.yaml --out_file training_data/waveform_dataset.hdf5
which will generate waveforms using a dingo.gw.waveform_generator.waveform_generator.WaveformGenerator object and store the resulting dataset at the location provided with --out_file. For convenience, here is the settings file:
domain:
type: FrequencyDomain
f_min: 20.0
f_max: 1024.0
delta_f: 0.25 # Expressions like 1.0/8.0 would require eval and are not supported
waveform_generator:
approximant: IMRPhenomD
f_ref: 20.0
# f_start: 15.0 # Optional setting useful for EOB waveforms. Overrides f_min when generating waveforms.
# Dataset only samples over intrinsic parameters. Extrinsic parameters are chosen at train time.
intrinsic_prior:
mass_1: bilby.core.prior.Constraint(minimum=10.0, maximum=80.0)
mass_2: bilby.core.prior.Constraint(minimum=10.0, maximum=80.0)
chirp_mass: bilby.gw.prior.UniformInComponentsChirpMass(minimum=15.0, maximum=100.0)
mass_ratio: bilby.gw.prior.UniformInComponentsMassRatio(minimum=0.125, maximum=1.0)
phase: default
chi_1: bilby.gw.prior.AlignedSpin(name='chi_1', a_prior=Uniform(minimum=0, maximum=0.9))
chi_2: bilby.gw.prior.AlignedSpin(name='chi_2', a_prior=Uniform(minimum=0, maximum=0.9))
theta_jn: default
# Reference values for fixed (extrinsic) parameters. These are needed to generate a waveform.
luminosity_distance: 100.0 # Mpc
geocent_time: 0.0 # s
# Dataset size
num_samples: 10000
compression: None
The file waveform_dataset_settings.yaml contains four sections: domain, waveform_generator, intrinsic_prior, and compression.
The domain section defines the settings for storing the waveform. Note the type attribute; this does not refer to the native domain of the waveform model, but rather to the internal dingo.gw.domains.Domain class. This allows the use of time domain waveform models, which are transformed into Fourier domain before being passed to the network. Currently, only the dingo.gw.domains.FrequencyDomain class is supported for training the network. It is sometimes advisable to generate waveforms with a higher f_max and then truncate them at a lower f_max for training, due to issues with generating short waveforms for some of the waveform models implemented in LALSuite's LALSimulation package (https://lscsoft.docs.ligo.org/lalsuite/lalsimulation/).
The waveform_generator section specifies the approximant attribute. At present, any waveform model (aka approximant) that is callable through LALSimulation's SimInspiralFD API can be used to generate waveforms for Dingo via the dingo.gw.waveform_generator.waveform_generator.WaveformGenerator module (see generating_waveforms).
The intrinsic_prior section is based on Bilby's prior module. Default values can be found in dingo.gw.prior. Two priors to note are the chirp_mass and mass_ratio, whose minimum values are set to 15.0 and 0.125, respectively. Extending these priors towards lower chirp masses or more extreme mass ratios may lead to poor performance of the embedding network and normalizing flow during training, and would require changes to the network setup. Note that the luminosity_distance and geocent_time are defined as constants to generate the waveform at a fixed reference point.
The compression section can be set to None for testing purposes. For a practical example of how it is used, see the next tutorial.
Step 2 Generating the Amplitude Spectral Density (ASD) dataset
To generate an ASD dataset run
dingo_generate_asd_dataset --settings_file asd_dataset_settings.yaml --data_dir training_data/asd_dataset
This command will generate a dingo.gw.noise.asd_dataset.ASDDataset object in the form of an .hdf5 file, which will be used later for training. The reason for specifying a folder instead of a file, as in the waveform dataset example, is that some temporary data is downloaded to create Welch estimates of the ASD. This data can be removed later, but it is sometimes useful for understanding how the ASDs were estimated. For convenience, here is a copy of the asd_dataset_settings.yaml file.
dataset_settings:
f_s: 4096
time_psd: 1024
T: 4
window:
roll_off: 0.4
type: tukey
time_gap: 0 # specifies the time skipped between two consecutive PSD estimates. If set < 0, the time segments overlap
num_psds_max: 1 # if set > 0, only a subset of all available PSDs will be used
detectors:
- H1
- L1
observing_run: O1
The asd_dataset_settings.yaml file includes several attributes. f_s is the sampling frequency in Hz, time_psd is the length of time used for an ASD estimate, and T is the duration of each ASD segment. Thus, the value of time_psd/T gives the number of segments analyzed to estimate one ASD. To avoid spectral leakage, a window is applied to each segment. We use the standard window used in LVK analyses, a Tukey window with a roll-off of \(\alpha=0.4\). The next attribute, num_psds_max=1, defines the number of ASDs stored in the ASD dataset. For now, we will use only one. See the next tutorial for a more advanced setup.
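As a quick illustration of these numbers (not part of the Dingo configuration itself), the segment count and window can be computed directly; the scipy Tukey window below is only a stand-in for whatever windowing Dingo applies internally.
from scipy.signal.windows import tukey
# With time_psd = 1024 s and segments of T = 4 s, one ASD estimate averages 256 segments.
f_s, time_psd, T, roll_off = 4096, 1024, 4, 0.4
num_segments = time_psd // T
# Tukey window applied to each 4 s segment (T * f_s = 16384 samples, roll-off alpha = 0.4).
window = tukey(T * f_s, alpha=roll_off)
print(num_segments, len(window))  # 256 16384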
Step 3 Training the network
To train the network, the paths to the correct datasets must first be specified in train_settings.yaml. Then run
dingo_train --settings_file train_settings.yaml --train_dir training
While this file contains numerous settings that are discussed in the training section, we will cover the most significant ones here. Again, here is the file:
data:
waveform_dataset_path: training_data/waveform_dataset.hdf5 # Contains intrinsic waveforms
train_fraction: 0.95
window: # Needed to calculate window factor for simulated data
type: tukey
f_s: 4096
T: 4.0
roll_off: 0.4
detectors:
- H1
- L1
extrinsic_prior: # Sampled at train time
dec: default
ra: default
geocent_time: bilby.core.prior.Uniform(minimum=-0.10, maximum=0.10)
psi: default
luminosity_distance: bilby.core.prior.Uniform(minimum=100.0, maximum=1000.0)
ref_time: 1126259462.391
inference_parameters:
- chirp_mass
- mass_ratio
- chi_1
- chi_2
- theta_jn
- dec
- ra
- geocent_time
- luminosity_distance
- psi
- phase
# Model architecture
model:
type: nsf+embedding
# kwargs for neural spline flow
nsf_kwargs:
num_flow_steps: 5
base_transform_kwargs:
hidden_dim: 64
num_transform_blocks: 5
activation: elu
dropout_probability: 0.0
batch_norm: True
num_bins: 8
base_transform_type: rq-coupling
# kwargs for embedding net
embedding_net_kwargs:
output_dim: 128
hidden_dims: [1024, 512, 256, 128]
activation: elu
dropout: 0.0
batch_norm: True
svd:
num_training_samples: 1000
num_validation_samples: 100
size: 50
# The first (and only) stage of training.
training:
stage_0:
epochs: 20
asd_dataset_path: training_data/asd_dataset/asds_O1.hdf5 # this should just contain a single fiducial ASD per detector for pretraining
freeze_rb_layer: True
optimizer:
type: adam
lr: 0.0001
scheduler:
type: cosine
T_max: 20
batch_size: 64
# Local settings for training that have no impact on the final trained network.
local:
device: cpu # Change this to 'cuda' for training on a GPU.
num_workers: 6 # num_workers >0 does not work on Mac, see https://stackoverflow.com/questions/64772335/pytorch-w-parallelnative-cpp206
runtime_limits:
max_time_per_run: 36000
max_epochs_per_run: 30
checkpoint_epochs: 15
For training, several extrinsic_prior settings are specified. The waveforms generated in step 1 are projected onto the detector network using extrinsic parameters sampled from these priors at train time. This is considerably cheaper than generating waveforms sampled from the full intrinsic plus extrinsic prior in step 1.
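To make the projection step concrete, here is a minimal numpy sketch of what applying extrinsic parameters to the stored polarizations involves. The antenna-pattern values F_plus, F_cross and the time shift dt are placeholder numbers; in Dingo they are computed from the sampled ra, dec, psi, geocent_time and the detector geometry.
import numpy as np
# Toy frequency grid and polarizations standing in for a stored training waveform.
f = np.linspace(20.0, 1024.0, 4017)
h_plus = f ** (-7.0 / 6.0) * np.exp(-1j * 2 * np.pi * f * 0.01)
h_cross = 1j * h_plus
# Placeholder extrinsic-dependent quantities (would follow from ra, dec, psi, time).
F_plus, F_cross, dt = 0.3, -0.6, 0.002
# Detector strain: linear combination of polarizations, then a frequency-domain time shift.
h_detector = (F_plus * h_plus + F_cross * h_cross) * np.exp(-2j * np.pi * f * dt)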
Another crucial setting is inference_parameters. By default, all the parameters described in dingo.gw.prior are inferred. If a parameter should be marginalized over, it can be omitted from inference_parameters.
Essential settings for the model architecture (the neural spline flow and the embedding network) are as follows: nsf_kwargs.num_flow_steps sets the number of flow transforms from the base distribution to the final distribution, while embedding_net_kwargs.hidden_dims defines the dimensions of the hidden layers of the embedding network, which selects the most important data features. Finally, embedding_net_kwargs.svd describes the settings of the SVD used as a pre-processing step before passing data vectors to the embedding network. For a production network, these values should be much higher than those used in this tutorial.
Next, we turn to the training section. Here we only employ a single stage of training, with settings provided under the stage_0 attribute. This stage trains on the dataset generated in step 1 for 20 epochs. We also specify the asd_dataset_path here, which points to the dataset created in step 2.
Finally, the local settings section affects only parallelization during training and the device used. An important setting here is num_workers, which determines how many PyTorch dataloader processes are spawned during training. If training is too slow, a potential cause is a lack of workers to load data into the network. This can be identified if the dataloader times in the dingo_train output exceed 100 ms. The solution is generally to increase the number of workers.
Step 4 Doing Inference
The final step is to do inference, for example on GW150914. To do this we will use dingo_pipe. For a local run execute:
dingo_pipe GW150914.ini
This calls dingo_pipe on an INI file that specifies the event to run on:
################################################################################
## Job submission arguments
################################################################################
local = True
accounting = dingo
request-cpus-importance-sampling = 2
################################################################################
## Sampler arguments
################################################################################
model = training/model_latest.pt
device = 'cpu'
num-samples = 5000
batch-size = 5000
recover-log-prob = false
importance-sample = false
################################################################################
## Data generation arguments
################################################################################
trigger-time = GW150914
label = GW150914
outdir = outdir_GW150914
channel-dict = {H1:GWOSC, L1:GWOSC}
psd-length = 128
# sampling-frequency = 2048.0
# importance-sampling-updates = {'duration': 4.0}
################################################################################
## Plotting arguments
################################################################################
plot-corner = true
plot-weights = true
plot-log-probs = true
This will generate files which are described in dingo_pipe. To see the results, take a look in outdir_GW150914. We set the flag importance-sample = false in the INI file, which disables importance sampling for this simple example. Generally one would omit this setting (it defaults to true).
We can load and manipulate the data with the following code. For example, here we create a corner plot:
from dingo.gw.result import Result
result = Result(file_name="outdir_GW150914/result/GW150914_data0_1126259462-4_sampling.hdf5")
result.plot_corner()
Notice that the results don't look very promising, but this is expected, as the settings used in this example are not sufficient for convergence. Dingo should also automatically generate a corner plot, which will be saved under outdir_GW150914.
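Assuming the Result object exposes the posterior samples as a pandas DataFrame (result.samples), they can also be inspected directly, e.g.
# Summary statistics for a few parameters; the column names follow inference_parameters.
print(result.samples[["chirp_mass", "mass_ratio", "luminosity_distance"]].describe())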
NPE Model (production)
We will now do a tutorial with settings closer to production quality. Note these are not the full production settings, since we are not using GNPE, but they should lead to decent results; go to the GNPE tutorial for the full production network. The steps are essentially the same as in the toy example, but with larger settings. It is recommended to run this on a cluster or a machine with a GPU.
We can repeat the same first few steps from the previous tutorial, with a couple of differences. The file structure is mostly the same, but now there is an additional asd_dataset_fiducial directory, which will be explained below.
npe_model/
# config files
waveform_dataset_settings.yaml
asd_dataset_settings.yaml
asd_dataset_settings_fiducial.yaml
train_settings.yaml
GW150914.ini
training_data/
waveform_dataset.hdf5
asd_dataset_fiducial/ # Contains the asd_dataset.hdf5 and also temp files for asd generation
asd_dataset/ # Contains the asd_dataset.hdf5 and also temp files for asd generation
training/
model_050.pt
model_stage_0.pt
model_latest.pt
history.txt
# etc...
outdir_GW150914/
# dingo_pipe output
Step 1 Generating a Waveform Dataset
Again the first step is to generate the necessary folders
cd npe_model
mkdir training_data
mkdir training
As before, we run dingo_generate_dataset:
dingo_generate_dataset --settings waveform_dataset_settings.yaml --out_file training_data/waveform_dataset.hdf5
The waveform_dataset_settings.yaml settings file now includes a new attribute, compression. This creates a truncated singular value decomposition (SVD) of the waveform polarizations, which is stored on disk as a compressed representation of the dataset. The size attribute refers to the number of basis vectors included in the expansion of the waveform; this can later be changed during training. When the compression phase is finished, the log will display the mismatch between the decompressed waveform and the generated waveform. You can also access these mismatches by running dingo_ls on a generated waveform_dataset.hdf5 file. It will show multiple mismatches, corresponding to the number of basis vectors used to decompress the waveform. It is up to the user what mismatch is acceptable; typically a maximum mismatch of \(10^{-3}-10^{-4}\) is recommended.
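For intuition, here is a minimal sketch of a (flat-PSD) mismatch between an original frequency-domain waveform and its reconstruction. Dingo's internal mismatch computation may differ in details such as PSD weighting, so treat this only as an illustration of the quantity being quoted.
import numpy as np

def mismatch(h1, h2):
    # Mismatch = 1 - normalized overlap, with a simple unweighted inner product.
    inner = lambda a, b: np.real(np.sum(a * np.conj(b)))
    return 1.0 - inner(h1, h2) / np.sqrt(inner(h1, h1) * inner(h2, h2))

# Toy waveform and a slightly perturbed "reconstruction".
f = np.linspace(20.0, 1024.0, 4017)
h_original = f ** (-7.0 / 6.0) * np.exp(-1j * 2 * np.pi * f * 0.01)
h_reconstructed = h_original * (1 + 1e-3 * np.random.randn(len(f)))
print(mismatch(h_original, h_reconstructed))  # far below the 1e-3 guideline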
We could also generate the waveform dataset using a condor DAG on a cluster. To do this run
dingo_generate_dataset_dag --settings_file waveform_dataset_settings.yaml --out_file training_data/waveform_dataset.hdf5 --env_path $DINGO_VENV_PATH --num_jobs 4 --request_cpus 64 --request_memory 128000 --request_memory_high 256000
and then submit the generated DAG
condor_submit_dag condor/submit/dingo_generate_dataset_dagman_DATE.submit
where DATE is specified in the filename of the .submit file that was generated.
Step 2 Generating an ASD dataset
To generate an ASD dataset we can run the same command as in the previous tutorial.
dingo_generate_asd_dataset --settings_file asd_dataset_settings_fiducial.yaml --data_dir training_data/asd_dataset_fiducial -out_name training_data/asd_dataset_fiducial/asds_O1_fiducial.hdf5
However, this time we will need two sets of ASDs during training. The first is fixed during the initial training stage; this is the fiducial dataset generated above, and it contains only a single ASD. The second ASD dataset contains many ASDs and is used during the fine-tuning stage. The reason to use just one ASD during the first stage is to let the network train in an easier inference setting: it should learn how to infer parameters in the presence of that one ASD. During inference, however, the ASD will vary, so in the second stage many ASDs are used, allowing Dingo to learn the distribution of ASDs from the observing run. We find this split leads to an improvement in overall performance. To generate this second dataset, run
dingo_generate_asd_dataset --settings_file asd_dataset_settings.yaml --data_dir training_data/asd_dataset -out_name training_data/asd_dataset/asds_O1.hdf5
We can see that in asd_dataset_settings.yaml the num_psds_max attribute is set to 0, indicating that all available ASDs will be downloaded. If you want to decrease this, make sure that there are enough ASDs in the training set to represent any possible data the Dingo network will see. Typically this should be at least 1000, but of course more is better.
Step 3 Training the network
Now we are ready for training. The command is analogous to the previous tutorial but the settings are increased to production values. To run the training do
dingo_train --settings_file train_settings.yaml --train_dir training
Tip
If running on a machine with multiple GPUs, make sure to specify the GPU by running
export CUDA_VISIBLE_DEVICES=GPU_NUM
before running dingo_train.
The main differences from the toy example in the network architecture are the size of the embedding network, described by model.embedding_net_kwargs.hidden_dims, and the number of neural spline flow transforms, described by model.nsf_kwargs.num_flow_steps. These increase the depth of the flow and the number/size of the layers in the embedding network.
Notice that we are not inferring the phase parameter here, as it is not listed under inference_parameters. However, we do recover the phase in post-processing. To see why and how this is done, see synthetic phase.
Also notice there are now two training stages, stage_0 and stage_1. In stage_0 a fixed ASD is used and the reduced-basis layer is frozen. Then in stage_1 all ASDs are used and the reduced-basis layer is unfrozen.
The main difference in the local settings is that the device is set to cuda. As noted in the tip above, if you have multiple GPUs on the machine, you can select which GPU to use with the CUDA_VISIBLE_DEVICES environment variable.
Important
It is recommended to have at least 40 GB of GPU memory on the device. If there is not enough memory on the machine, first try halving the batch_size. In this case, one should also multiply the learning rate, lr, by \(\frac{1}{\sqrt{2}}\). If there is still not enough memory, consider reducing the number of hidden dimensions.
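As a worked example of this rule of thumb (with illustrative numbers, not the values from any settings file):
import math
batch_size, lr = 4096, 5e-4                            # illustrative starting values
batch_size, lr = batch_size // 2, lr / math.sqrt(2)    # halve batch size, rescale lr
print(batch_size, lr)                                  # 2048, ~3.54e-4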
Step 4 Doing Inference
We can run inference with the same command as before
dingo_pipe GW150914.ini
There is just one difference from the previous example: it is possible to reweight the posterior to a new prior. Note, though, that the new prior must be a subset of the prior used for training. Otherwise, the new prior will include regions where the network has not been trained, so the proposal distribution generated by Dingo will not cover them, which will result in a low effective sample size and lead to poor results. As an example, see the prior-dict attribute in GW150914.ini.
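The effective sample size mentioned above follows the standard importance-sampling formula; a minimal sketch (not Dingo-specific code) is:
import numpy as np
# Effective sample size n_eff = (sum w)^2 / sum w^2 for importance weights w.
weights = np.random.gamma(shape=2.0, scale=1.0, size=5000)  # toy weights
n_eff = weights.sum() ** 2 / (weights ** 2).sum()
print(n_eff / len(weights))  # sample efficiency; small values indicate a poor proposal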
GNPE model (production)
This tutorial uses the most complete settings and is the one typically used for production. The main difference from the NPE tutorial is that here we are using GNPE (group neural posterior estimation). The data generation is exactly the same as in the previous tutorial, but we repeat it here for completeness.
The file structure is similar to the NPE example, except now there are two training sub-directories and two train_settings.yaml files.
gnpe_model/
# config files
waveform_dataset_settings.yaml
asd_dataset_settings_fiducial.yaml
asd_dataset_settings.yaml
train_settings_main.yaml
train_settings_init.yaml
GW150914.ini
training_data/
waveform_dataset.hdf5
asd_dataset.hdf5
asd_dataset_fiducial.hdf5
asd_dataset_fiducial/ # Contains the asd_dataset.hdf5 and also temp files for asd generation
asd_dataset/ # Contains the asd_dataset.hdf5 and also temp files for asd generation
training/
main_train_dir/
model_050.pt
model_stage_0.pt
model_latest.pt
history.txt
# etc...
init_train_dir/
model_050.pt
model_stage_0.pt
model_latest.pt
history.txt
# etc...
outdir_GW150914/
# dingo_pipe output
Step 1 Generating a Waveform Dataset
First generate the directory structure:
cd gnpe_model
mkdir training_data
mkdir training
mkdir training/main_train_dir
mkdir training/init_train_dir
Generate the waveform dataset:
dingo_generate_dataset --settings waveform_dataset_settings.yaml --out_file training_data/waveform_dataset.hdf5
or using condor:
dingo_generate_dataset_dag --settings_file waveform_dataset_settings.yaml --out_file training_data/waveform_dataset.hdf5 --env_path $DINGO_VENV_PATH --num_jobs 4 --request_cpus 16 --request_memory 1280000 --request_memory_high 256000
Step 2 Generating an ASD dataset
As before we generate a fiducial ASD dataset containing a single ASD:
dingo_generate_asd_dataset --settings_file asd_dataset_settings_fiducial.yaml --data_dir training_data/asd_dataset_fiducial -out_name training_data/asd_dataset_fiducial/asds_O1_fiducial.hdf5
and a large ASD dataset:
dingo_generate_asd_dataset --settings_file asd_dataset_settings.yaml --data_dir training_data/asd_dataset -out_name training_data/asd_dataset/asds_O1.hdf5
Step 3 Training the network
Now we are ready for training using GNPE. Here we need to train two networks: one which estimates the time of arrival in the detectors, and one which does the full inference task. A natural question is why we train two networks. The main idea is that if one can align (and thus standardize) the times of arrival in the detectors, the inference task becomes significantly easier. To do this, we first need to train an initialization network which estimates the time of arrival in the detectors:
dingo_train --settings_file train_settings_init.yaml --train_dir training/init_network
Notice that the inference parameters are only H1_time and L1_time. Also notice that the embedding_net is significantly smaller and the number of flow steps, num_flow_steps, is reduced. Next, train the main network:
dingo_train --settings_file train_settings_main.yaml --train_dir training/main_network
Notice the data.gnpe_time_shifts section. The kernel describes how much to blur the GNPE proxies and is specified in seconds. To read more about this, see GNPE.
Step 4 Doing Inference
Performing inference requires a few changes to the previous NPE setup. Most notably, since we are now using GNPE, we have to specify the file paths to both the initialization network and the main network. Another difference is the new attribute under the sampler arguments, num-gnpe-iterations, which indicates the number of GNPE steps to take. If the initialization network is not fully converged, or if the segment of data being analyzed is very long, it is recommended to increase this number.
dingo_pipe GW150914.ini
Inference on an injection
A simple example is creating an injection consistent with what the network was trained on, and then running Dingo on it. First, one can instantiate the dingo.gw.injection.Injection using the metadata from the dingo.core.models.posterior_model.PosteriorModel (the trained network). An ASD dataset also needs to be specified; one can take the fiducial ASD dataset the network was trained on.
from dingo.core.models import PosteriorModel
import dingo.gw.injection as injection
from dingo.gw.noise.asd_dataset import ASDDataset
main_pm = PosteriorModel(
device="cuda",
model_filename="/path/to/main_network",
load_training_info=False
)
init_pm = PosteriorModel(
device='cuda',
model_filename="/path/to/init_network",
load_training_info=False
)
injection_generator = injection.Injection.from_posterior_model_metadata(main_pm.metadata)
asd_fname = main_pm.metadata["train_settings"]["training"]["stage_0"]["asd_dataset_path"]
asd_dataset = ASDDataset(file_name=asd_fname)
injection_generator.asd = {k:v[0] for k,v in asd_dataset.asds.items()}
intrinsic_parameters = {
"chirp_mass": 35,
"mass_ratio": 0.5,
"a_1": .3,
"a_2": .5,
"tilt_1": 0.,
"tilt_2": 0.,
"phi_jl": 0.,
"phi_12": 0.
}
extrinsic_parameters = {
'phase': 0.,
'theta_jn': 2.3,
'geocent_time': 0.,
'luminosity_distance': 400.,
'ra': 0.,
'dec': 0.,
'psi': 0.,
}
theta = {**intrinsic_parameters, **extrinsic_parameters}
strain_data = injection_generator.injection(theta)
Then one can create injections and do inference on them.
from dingo.gw.inference.gw_samplers import GWSamplerGNPE, GWSampler
init_sampler = GWSampler(model=init_pm)
sampler = GWSamplerGNPE(model=main_pm, init_sampler=init_sampler, num_iterations=30)
sampler.context = strain_data
sampler.run_sampler(num_samples=50_000, batch_size=10_000)
result = sampler.to_result()
result.plot_corner()
Introduction to neural posterior estimation
In contrast to classical parameter estimation codes like Bilby and LALInference, Dingo uses simulation-based (or likelihood-free) inference. The basic idea is to train a neural network to represent the Bayesian posterior over source parameters given the observed data. Training is based on simulated data rather than likelihood evaluations. Neural posterior estimation (NPE) combines the ideas of simulation-based inference with conditional neural density estimators.
Normalizing flows
Normalizing flows provide a means to represent complicated probability distributions using neural networks, in a way that enables rapid sampling and density estimation. They represent the distribution in terms of a mapping (or flow) \(f: u \to \theta\) on the sample space from a much simpler "base" distribution, which we take to be standard normal (of the same dimension as the parameter space). If \(f\) is allowed to depend on observed data \(d\) (denoted \(f_d\)) then the flow describes a conditional probability distribution \(q(\theta | d)\). The PDF is given by the change of variables rule,
\[q(\theta|d) = \mathcal{N}(0,1)^D\!\left(f_d^{-1}(\theta)\right)\,\left|\det J_{f_d^{-1}}(\theta)\right|, \qquad (1)\]
where \(D\) is the dimensionality of the parameter space and \(J_{f_d^{-1}}\) is the Jacobian of the inverse flow.
A normalizing flow must satisfy the following properties:
Invertibility, so that one can evaluate \(f_d^{-1}(\theta)\) for any \(\theta\).
Simple Jacobian determinant, so that one can quickly evaluate \(\det J_{f_d^{-1}}(\theta)\).
With these properties, one can quickly evaluate the right-hand side of (1) to obtain the density. Various types of normalizing flow have been constructed to satisfy these properties, typically as a composition of relatively simple transforms \(f^{(j)}\). These relatively simple transforms are then parametrized by the output of a neural network. To sample \(\theta \sim q(\theta|d)\), one samples \(u \sim \mathcal N(0,1)^D\) and applies the flow in the forward direction.
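The following toy sketch illustrates the bookkeeping in equation (1) with a trivially invertible affine map in place of Dingo's conditional spline flow; mu and sigma stand in for data-dependent neural-network outputs.
import numpy as np

def f(u, mu, sigma):           # forward map u -> theta
    return mu + sigma * u

def f_inv(theta, mu, sigma):   # inverse map theta -> u
    return (theta - mu) / sigma

def log_q(theta, mu, sigma):
    # Change-of-variables density: standard-normal base density times |det J_{f^-1}|.
    u = f_inv(theta, mu, sigma)
    log_base = -0.5 * np.sum(u ** 2 + np.log(2 * np.pi), axis=-1)
    log_det_inv = -np.sum(np.log(sigma))
    return log_base + log_det_inv

mu, sigma = np.array([1.0, -2.0]), np.array([0.5, 2.0])  # "network outputs" for some d
u = np.random.randn(1000, 2)        # sample the base distribution
theta = f(u, mu, sigma)             # forward pass: samples from q(theta|d)
print(log_q(theta[:3], mu, sigma))  # density evaluation via the inverse pass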
For each flow step, Dingo uses a conditional coupling transform, meaning that half of the components are held fixed, and the other half transform elementwise, conditional on the untransformed components and the data,
\[\theta_i = u_i \quad \text{for } i \le D/2, \qquad \theta_i = f^{(j)}_i\!\left(u_i; u_{1:D/2}, d\right) \quad \text{for } i > D/2.\]
If the elementwise functions \(f^{(j)}_i\) are differentiable, then it follows automatically that we have a normalizing flow. We use a neural spline flow, meaning that the functions \(f^{(j)}_i\) are splines, which in turn are parametrized by neural network outputs (taking as input \((u_{1:D/2},d)\)). Between each of these transforms, the parameters are randomly permuted, ensuring that the full flow is sufficiently flexible. Dingo uses the implementation of this entire structure provided by nflows.
Training
The conditional neural density estimator \(q(\theta | d)\) is initialized randomly and must be trained to become a good approximation to the posterior \(p(\theta | d)\). To achieve this, one must specify a target loss function to minimize. A reasonable starting point is to minimize the Kullback-Leibler (KL) divergence of \(p\) from \(q\),
\[D_{\mathrm{KL}}(p\,\|\,q) = \int d\theta\, p(\theta|d) \log \frac{p(\theta|d)}{q(\theta|d)}.\]
This measures a deviation between the two distributions, and is notably not symmetric. (We take the so-called "forward" KL divergence, which is "mass-covering".) Taking the expectation over data samples \(d \sim p(d)\), and dropping the numerator from the \(\log\) term (since it is independent of the network parameters), we arrive at the loss function
\[\begin{aligned} L &= -\int dd\, p(d) \int d\theta\, p(\theta|d)\, \log q(\theta|d) \\ &= -\int d\theta\, p(\theta) \int dd\, p(d|\theta)\, \log q(\theta|d). \end{aligned} \qquad (2)\]
On the second line we used Bayes' theorem \(p(d) p(\theta | d) = p(\theta) p(d | \theta)\) to re-order the integrations. The loss may finally be approximated on a mini-batch of \(N\) samples,
\[L \approx -\frac{1}{N} \sum_{i=1}^{N} \log q\!\left(\theta^{(i)} \,\middle|\, d^{(i)}\right),\]
where the samples are drawn ancestrally in a two-step process:
Sample from the prior, \(\theta^{(i)} \sim p(\theta)\),
Simulate data, \(d^{(i)} \sim p(d | \theta^{(i)})\),
We then take the gradient of \(L\) with respect to network parameters and minimize using the Adam optimizer.
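The training procedure just described can be sketched in a few lines of PyTorch. Everything below (the prior, the simulator, and the Gaussian density estimator) is a stand-in for Dingo's actual components; only the structure of the loss and the optimization loop is the point.
import torch

def sample_prior(n):                     # theta^(i) ~ p(theta)
    return torch.rand(n, 2)

def simulate(theta):                     # d^(i) ~ p(d|theta^(i)): "signal" plus unit noise
    return theta.repeat(1, 5) + torch.randn(theta.shape[0], 10)

class ToyDensityEstimator(torch.nn.Module):
    """Gaussian q(theta|d) with a data-dependent mean; a normalizing flow would go here."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2)
        )
    def log_prob(self, theta, d):
        return -0.5 * ((theta - self.net(d)) ** 2).sum(dim=1)

q = ToyDensityEstimator()
optimizer = torch.optim.Adam(q.parameters(), lr=1e-4)

for _ in range(100):
    theta = sample_prior(64)             # 1. sample parameters from the prior
    d = simulate(theta)                  # 2. simulate data
    loss = -q.log_prob(theta, d).mean()  # mini-batch estimate of L
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()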
Importantly, the process to generate training samples incorporates the same information as a standard (likelihood-based) sampler would use. Namely, the prior is incorporated by sampling parameters from it, and the likelihood is incorporated by simulating data. Bayes’ theorem is incorporated in going from line 1 to line 2 in (2). For gravitational waves, the likelihood is taken to be the probability that the residual when subtracting a signal \(h(\theta)\) from \(d\) is stationary Gaussian noise (with the measured PSD \(S_{\text{n}}(f)\) in the detector). Likewise, to simulate data we generate a waveform \(h(\theta^{(i)})\) and add a random noise realization \(n \sim \mathcal N(0, S_\text{n}(f))\). Ultimately, however, the SBI approach is more flexible, since in principle one could add non-stationary or non-Gaussian noise, and train the network to reproduce the posterior, despite not having a tractable likelihood. See the section on training data for additional details of training for gravitational wave inference.
Intuitively, one way to understand NPE is simply that we are doing supervised deep learning—inferring parameter labels from examples—but allowing for the flexibility to produce a probabilistic answer. With this flexibility, the network learns to produce the Bayesian posterior.
Code design
Reproducibility
Generating reproducible results must be central to any deep learning code. Dingo attempts to achieve this in the following ways:
Settings
There are a large number of configuration options that must be selected when using Dingo. These include
Waveform and noise dataset settings,
Training settings, including pre-processing, neural network, and training strategy settings,
Inference settings, including event time or injection data.
The Dingo approach is to save all of these settings as nested dictionaries together with the outputs of the various tasks. In practice, this means specifying the settings as a .yaml file and passing this to a command-line script that runs some code and produces an output file (.hdf5 or .pt). The output file then contains the settings dictionary (possibly augmented by additional derived parameters). All output files can be inspected using the command-line script dingo_ls, which prints the stored settings and possibly additional information. The output from dingo_ls could (with a small amount of effort) be used to reproduce the exact results (modulo random seeds, to be implemented).
In addition to saving the user-provided settings at each step, Dingo also saves the settings from precursor steps. For example, when training a model on data from a given waveform dataset, the waveform dataset settings are also saved along with the model settings. This can be very useful at a later point, when only the trained model is available, not the training data. Beyond ensuring reproducibility, having these precursor settings available is needed for certain downstream tasks (e.g., combining the intrinsic prior from a waveform dataset with the extrinsic prior specified for training).
Random seeds
To-do
Implement this.
Unique identifiers for models
To-do
Implement this.
Code re-use
core and gw packages
Although the only current use case for Dingo is to analyze LVK data, we hope that it can be extended to other GW or astrophysical (or more general scientific) applications. To facilitate this, we follow the Bilby approach of partitioning code into core and gw components: gw contains GW-specific code (relating to waveforms, interferometers, etc.) whereas core contains generic network architectures, data structures, samplers, etc., that we expect could be used in other applications. As we find ways to write elements of code in more generic ways, we hope to migrate additional components from gw to core. We could then envision future packages, e.g., for LISA inference, GW populations, or cosmology.
Data transforms
We follow the PyTorch guidelines of pre-processing data using a sequence of transforms. Dingo includes transforms for tasks such as sampling extrinsic parameters, projecting waveform polarizations to detectors, and adding noise. The same transforms are re-used at inference time, where a similar (but always identical) sequence is required. Some transforms also behave differently at inference time, and thus have a flag to specify the mode.
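A minimal sketch of this transform-chain pattern is shown below. The transform names and the sample layout are purely illustrative (Dingo's actual transforms live in dingo.gw.transforms); the point is that each transform is a callable acting on a sample dictionary, and the same chain can be reassembled, with mode flags flipped, at inference time.
import numpy as np

class SampleExtrinsicParameters:
    # At train time this would draw ra, dec, psi, ... from the extrinsic prior.
    def __call__(self, sample):
        sample = dict(sample)
        sample["extrinsic_parameters"] = {"ra": 0.5, "dec": -0.3, "psi": 1.0}
        return sample

class AddWhiteNoise:
    # Toy stand-in for adding detector noise to the (whitened) waveform.
    def __init__(self, std):
        self.std = std
    def __call__(self, sample):
        sample = dict(sample)
        sample["waveform"] = sample["waveform"] + self.std * np.random.randn(*sample["waveform"].shape)
        return sample

class Compose:
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, sample):
        for transform in self.transforms:
            sample = transform(sample)
        return sample

transform = Compose([SampleExtrinsicParameters(), AddWhiteNoise(std=1.0)])
sample = transform({"parameters": {"chirp_mass": 35.0}, "waveform": np.zeros(100)})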
Data structures
Dingo uses several dataset classes, all of which inherit from dingo.core.dataset.DingoDataset. This provides common IO (to save/load from HDF5 as well as dictionaries). It also stores the settings dictionary as an attribute.
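For a by-hand look at such a file, the sketch below assumes the settings are stored as a serialized dictionary in the HDF5 attributes; dingo_ls is the supported way to view them, and the exact internal layout may differ between Dingo versions.
import h5py

with h5py.File("waveform_dataset.hdf5", "r") as f:
    print(list(f.keys()))        # datasets stored in the file
    print(list(f.attrs.keys()))  # attributes, which include the stored settings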
Command-line scripts
In general, Dingo is constructed around libraries and classes that are used to carry out various data processing tasks. There are a large number of configuration options, which are often passed as dictionaries, enabling the addition of new settings without breaking old code.
For very high-level tasks, such as generating a training dataset or training a network, we believe it is most straightforward to use a command-line interface. This is because these are end-user tasks that might be called by separate programs, or on a cluster, or because some of these (dataset generation and training) can be quite expensive.
A Dingo command-line script begins with the prefix dingo_ and is usually a thin wrapper around a function that could be called by other code if desired. It takes as input a .yaml file, passes it as a dictionary to the function, obtains a result, and saves it to disk. We hope that this balance between libraries and a command-line interface enables an extensible code going forward.
Generating waveforms
Training data for Dingo consist of pairs of parameters \(\theta\) and corresponding simulated strain data sets \(d_I\), where \(I\) runs over the GW interferometers (L1, H1, V1, etc.). Additionally, when conditioning on detector noise properties, data also include noise context (the PSD \(S_{\text{n},I}\)). Strain data sets are of the form
\[d_I = h_I(\theta) + n_I,\]
where \(h_I(\theta)\) is a signal waveform (provided by a waveform model) and \(n_I\) is a noise realization (stationary and Gaussian, consistent with \(S_{\text{n}, I}\)).
Data domain
At present, Dingo works entirely with frequency domain data. Although NPE is very flexible and could in principle learn to interpret data in any representation, FD data are especially convenient because (1) stationary Gaussian noise is independent in each frequency bin, so noise generation is straightforward, (2) time shifts take a simple form, enabling improved data augmentation, and (3) the noise context is already in FD. Other domains could be useful in the future, however, so the code is written in a way that the domain could be adapted.
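Point (2) above is worth spelling out: in frequency domain, a time shift by dt amounts to multiplying by a phase factor \(\exp(-2\pi i f\, dt)\) (this is exactly what FrequencyDomain.time_translate_data does, as documented below). A minimal numpy illustration:
import numpy as np

delta_f = 0.125
f = np.arange(0.0, 1024.0 + delta_f, delta_f)               # frequency grid from 0 to f_max
h = np.random.randn(len(f)) + 1j * np.random.randn(len(f))  # stand-in FD data
dt = 0.01                                                    # shift by 10 ms
h_shifted = h * np.exp(-2j * np.pi * f * dt)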
The domain is specified by instantiating a FrequencyDomain,
from dingo.gw.domains import FrequencyDomain
domain = FrequencyDomain(f_min=20.0, f_max=1024.0, delta_f=0.125)
Derived class properties include, e.g., the frequency grid. Frequency arrays run from 0 to f_max, as is standard for GW data analysis software.
domain.sample_frequencies
array([0.000000e+00, 1.250000e-01, 2.500000e-01, ..., 1.023750e+03,
1.023875e+03, 1.024000e+03], dtype=float32)
Note
The window factor \(w\) used when FFTing from time domain data is also stored within the domain, in domain.window_factor. This enters into the standard deviation of white noise in each frequency bin, domain.noise_std. In frequency domain, this is given by \(\sqrt{w/(4\delta f)}\).
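Evaluating the formula from the note for the domain defined above (delta_f = 0.125 Hz), and assuming a window factor of 1 for illustration:
import numpy as np

w, delta_f = 1.0, 0.125                 # assumed window factor; Tukey windowing gives w < 1
noise_std = np.sqrt(w / (4 * delta_f))  # per-bin standard deviation of whitened noise
print(noise_std)                        # ~1.414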
Various class methods also act on data, to perform operations such as zeroing below f_min, truncating above f_max, or applying a time shift:
- class dingo.gw.domains.FrequencyDomain(f_min: float, f_max: float, delta_f: float, window_factor: float | None = None)
Defines the physical domain on which the data of interest live.
The frequency bins are assumed to be uniform between [0, f_max] with spacing delta_f. Given a finite length of time domain data, the Fourier domain data starts at a frequency f_min and is zero below this frequency. window_kwargs specify windowing used for FFT to obtain FD data from TD data in practice.
- static add_phase(data, phase)
Add a (frequency-dependent) phase to a frequency series. Allows for batching, as well as additional channels (such as detectors). Accounts for the fact that the data could be a complex frequency series or real and imaginary parts.
Convention: the phase phi(f) is defined via exp(- 1j * phi(f)).
- Parameters:
data (Union[np.array, torch.Tensor]) –
phase (Union[np.array, torch.Tensor]) –
- Return type:
New array or tensor of the same shape as data.
- property delta_f: float
The frequency spacing of the uniform grid [Hz].
- property domain_dict
Enables to rebuild the domain via calling build_domain(domain_dict).
- property duration: float
Waveform duration in seconds.
- property f_max: float
The maximum frequency [Hz] is typically set to half the sampling rate.
- property f_min: float
The minimum frequency [Hz].
- property frequency_mask: ndarray
Mask which selects frequency bins greater than or equal to the starting frequency
- property frequency_mask_length: int
Number of samples in the subdomain domain[frequency_mask].
- get_sample_frequencies_astype(data)
Returns a 1D frequency array compatible with the last index of data array.
Decides whether array is numpy or torch tensor (and cuda vs cpu), and whether it contains the leading zeros below f_min.
- Parameters:
data (Union[np.array, torch.Tensor]) – Sample data
- Return type:
frequency array compatible with last index
- property noise_std: float
Standard deviation of the whitened noise distribution.
To have noise that comes from a multivariate unit normal distribution, you must divide by this factor. In practice, this means dividing the whitened waveforms by this.
TODO: This description makes some assumptions that need to be clarified. Windowing of TD data; tapering window has a slope -> reduces power only for noise, but not for the signal which is in the main part unaffected by the taper
- property sampling_rate: float
The sampling rate of the data [Hz].
- set_new_range(f_min: float | None = None, f_max: float | None = None)
Set a new range [f_min, f_max] for the domain. This is only allowed if the new range is contained within the old one.
- time_translate_data(data, dt)
Time translate frequency-domain data by dt. Time translation corresponds (in frequency domain) to multiplication by
\[\exp(-2 \pi i \, f \, dt).\]This method allows for multiple batch dimensions. For torch.Tensor data, allow for either a complex or a (real, imag) representation.
- Parameters:
data (array-like (numpy, torch)) –
Shape (B, C, N), where
B corresponds to any dimension >= 0,
C is either absent (for complex data) or has dimension >= 2 (for data represented as real and imaginary parts), and
N is either len(self) or len(self)-self.min_idx (for truncated data).
dt (torch tensor, or scalar (if data is numpy)) – Shape (B)
- Return type:
Array-like of the same form as data.
- update(new_settings: dict)
Update the domain with new settings. This is only allowed if the new settings are “compatible” with the old ones. E.g., f_min should be larger than the existing f_min.
- Parameters:
new_settings (dict) – Settings dictionary. Must contain a subset of the keys contained in domain_dict.
- update_data(data: ndarray, axis: int = -1, low_value: float = 0.0)
Adjusts data to be compatible with the domain:
Below f_min, it sets the data to low_value (typically 0.0 for a waveform, but for a PSD this might be a large value).
Above f_max, it truncates the data array.
- Parameters:
data (np.ndarray) – Data array
axis (int) – Which data axis to apply the adjustment along.
low_value (float) – Below f_min, set the data to this value.
- Returns:
The new data array.
- Return type:
np.ndarray
Waveform generator
Waveforms are generated using the WaveformGenerator class (or its subclass NewInterfaceWaveformGenerator, for employing the new LIGO waveform interface, needed for some approximants). This depends on a Domain as well as a waveform approximant and a reference frequency f_ref. In the backend, the WaveformGenerator class calls LALSimulation functions (typically SimInspiralFD) via the SWIG-Python interface. For time domain waveforms, SimInspiralFD takes care of FFTing to frequency domain. The NewInterfaceWaveformGenerator class calls the gwsignal module, a Python interface recently implemented in LALSimulation, which is needed for employing some of the latest waveform approximants, such as SEOBNRv5HM and SEOBNRv5PHM.
from dingo.gw.waveform_generator import WaveformGenerator #, NewInterfaceWaveformGenerator
wfg = WaveformGenerator(approximant='IMRPhenomXPHM', domain=domain, f_ref=20.0)
# wfg = NewInterfaceWaveformGenerator(approximant='SEOBNRv5PHM', domain=domain, f_ref=20.0)
Setting spin_conversion_phase = None. Using phase parameter for conversion to cartesian spins.
To generate a waveform we first need to choose parameters. Here we sample parameters from a bilby.core.prior.PriorDict. We use the default Dingo intrinsic prior.
from bilby.core.prior import PriorDict
from dingo.gw.prior import default_intrinsic_dict
prior = PriorDict(default_intrinsic_dict)
prior
{'mass_1': Constraint(minimum=10.0, maximum=80.0, name=None, latex_label=None, unit=None),
'mass_2': Constraint(minimum=10.0, maximum=80.0, name=None, latex_label=None, unit=None),
'mass_ratio': bilby.gw.prior.UniformInComponentsMassRatio(minimum=0.125, maximum=1.0, name='mass_ratio', latex_label='$q$', unit=None, boundary=None, equal_mass=False),
'chirp_mass': bilby.gw.prior.UniformInComponentsChirpMass(minimum=25.0, maximum=100.0, name='chirp_mass', latex_label='$\\mathcal{M}$', unit=None, boundary=None),
'luminosity_distance': DeltaFunction(peak=1000.0, name=None, latex_label=None, unit=None),
'theta_jn': Sine(minimum=0.0, maximum=3.141592653589793, name=None, latex_label=None, unit=None, boundary=None),
'phase': Uniform(minimum=0.0, maximum=6.283185307179586, name=None, latex_label=None, unit=None, boundary='periodic'),
'a_1': Uniform(minimum=0.0, maximum=0.99, name=None, latex_label=None, unit=None, boundary=None),
'a_2': Uniform(minimum=0.0, maximum=0.99, name=None, latex_label=None, unit=None, boundary=None),
'tilt_1': Sine(minimum=0.0, maximum=3.141592653589793, name=None, latex_label=None, unit=None, boundary=None),
'tilt_2': Sine(minimum=0.0, maximum=3.141592653589793, name=None, latex_label=None, unit=None, boundary=None),
'phi_12': Uniform(minimum=0.0, maximum=6.283185307179586, name=None, latex_label=None, unit=None, boundary='periodic'),
'phi_jl': Uniform(minimum=0.0, maximum=6.283185307179586, name=None, latex_label=None, unit=None, boundary='periodic'),
'geocent_time': DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)}
p = prior.sample()
p
{'mass_ratio': 0.27516887747784635,
'chirp_mass': 75.96284973482983,
'luminosity_distance': 1000.0,
'theta_jn': 1.4424160368867687,
'phase': 3.5597919874340875,
'a_1': 0.6803566132772145,
'a_2': 0.1772403333232536,
'tilt_1': 2.4084579981751792,
'tilt_2': 1.5913639153680237,
'phi_12': 0.17224461804836214,
'phi_jl': 5.8646174013435814,
'geocent_time': 0.0}
Finally, we generate the waveform. This is returned as a dictionary, with entries for each polarization. This way of representing a sample is used throughout Dingo, and will be very convenient when applying transforms (to apply extrinsic parameters, add noise, etc.).
h = wfg.generate_hplus_hcross(p)
h
{'h_plus': array([0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j]),
'h_cross': array([0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j])}
import matplotlib.pyplot as plt
plt.plot(domain.sample_frequencies, h['h_plus'].real, label='real')
plt.plot(domain.sample_frequencies, h['h_plus'].imag, label='imag')
plt.xlim((10,1024))
plt.xscale('log')
plt.legend()
plt.xlabel('f')
plt.ylabel(r'$h_+$')
plt.show()

Note that the waveform is nonzero slightly below f_min
. This simply arises from the model implementation in LALSimulation
. When training networks, input data will be truncated below f_min
.
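As a quick illustration of that truncation (a sketch only; it assumes the FrequencyDomain object exposes an f_min attribute alongside the sample_frequencies used above), the content below f_min can be zeroed by hand:
import numpy as np

# Zero the waveform below f_min, mimicking the zeroing applied to training data.
h_plus_truncated = np.where(domain.sample_frequencies < domain.f_min, 0.0, h['h_plus'])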
The complete specification of the WaveformGenerator
class is given as
- class dingo.gw.waveform_generator.WaveformGenerator(approximant: str, domain: Domain, f_ref: float, f_start: float | None = None, mode_list: List[Tuple] | None = None, transform=None, spin_conversion_phase=None, **kwargs)
Generate polarizations using LALSimulation routines in the specified domain for a single GW coalescence given a set of waveform parameters.
- Parameters:
approximant (str) – Waveform “approximant” string understood by lalsimulation. This defines which waveform model is used.
domain (Domain) – Domain object that specifies on which physical domain the waveform polarizations will be generated, e.g. Fourier domain, time domain.
f_ref (float) – Reference frequency for the waveforms
f_start (float) – Starting frequency for waveform generation. This is optional, and if not included, the starting frequency will be set to f_min. This exists so that EOB waveforms can be generated starting from a lower frequency than f_min.
mode_list (List[Tuple]) – A list of waveform (ell, m) modes to include when generating the polarizations.
spin_conversion_phase (float = None) – Value for phiRef when computing cartesian spins from bilby spins via bilby_to_lalsimulation_spins. The common convention is to use the value of the phase parameter here, which is also used in the spherical harmonics when combining the different modes. If spin_conversion_phase = None, this default behavior is adopted. For dingo, this convention for the phase parameter makes it impossible to treat the phase as an extrinsic parameter, since we can only account for the change of phase in the spherical harmonics when changing the phase (in order to also change the cartesian spins – specifically, to rotate the spins by phase in the sx-sy plane – one would need to recompute the modes, which is expensive). By setting spin_conversion_phase != None, we impose the convention to always use phase = spin_conversion_phase when computing the cartesian spins.
- generate_FD_modes_LO(parameters)
Generate FD modes in the L0 frame.
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
hlm_fd (dict) – Dictionary with (l,m) as keys and the corresponding FD modes in lal format as values.
iota (float)
- generate_FD_waveform(parameters_lal: Tuple) → Dict[str, ndarray]
Generate Fourier domain GW polarizations (h_plus, h_cross).
- Parameters:
parameters_lal – A tuple of parameters for the lalsimulation waveform generator
- Returns:
A dictionary of generated waveform polarizations
- Return type:
pol_dict
- generate_TD_modes_L0(parameters)
Generate TD modes in the L0 frame.
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
hlm_td (dict) – Dictionary with (l,m) as keys and the corresponding TD modes in lal format as values.
iota (float)
- generate_TD_waveform(parameters_lal: Tuple) → Dict[str, ndarray]
Generate time domain GW polarizations (h_plus, h_cross)
- Parameters:
parameters_lal – A tuple of parameters for the lalsimulation waveform generator
- Returns:
A dictionary of generated waveform polarizations
- Return type:
pol_dict
- generate_hplus_hcross(parameters: Dict[str, float], catch_waveform_errors=True) → Dict[str, ndarray]
Generate GW polarizations (h_plus, h_cross).
If the generation of the lalsimulation waveform fails with an “Input domain error”, we return NaN polarizations.
Use the domain, approximant, and mode_list specified in the constructor along with the waveform parameters to generate the waveform polarizations.
- Parameters:
parameters (Dict[str, float]) –
A dictionary of parameter names and scalar values. The parameter dictionary must include the following keys. For masses, spins, and distance there are multiple options.
- Mass: (mass_1, mass_2) or a pair of quantities from
((chirp_mass, total_mass), (mass_ratio, symmetric_mass_ratio))
- Spin:
(a_1, a_2, tilt_1, tilt_2, phi_12, phi_jl) if precessing binary or (chi_1, chi_2) if the binary has aligned spins
- Reference frequency: f_ref at which spin vectors are defined
- Extrinsic:
Distance: one of (luminosity_distance, redshift, comoving_distance)
Inclination: theta_jn
Reference phase: phase
Geocentric time: geocent_time (GPS time)
- The following parameters are not required:
Sky location: ra, dec, Polarization angle: psi
- Units:
Masses should be given in units of solar masses. Distance should be given in megaparsecs (Mpc). Frequencies should be given in Hz and time in seconds. Spins should be dimensionless. Angles should be in radians.
catch_waveform_errors (bool) – Whether to catch lalsimulation errors
- Returns:
A dictionary of generated waveform polarizations
- Return type:
wf_dict
- generate_hplus_hcross_m(parameters: Dict[str, float]) → Dict[tuple, Dict[str, ndarray]]
Generate GW polarizations (h_plus, h_cross), separated into contributions from the different modes. This method is identical to self.generate_hplus_hcross, except that it generates the individual contributions of the modes to the polarizations and sorts these according to their transformation behavior (see below), instead of returning the overall sum.
This is useful in order to treat the phase as an extrinsic parameter. Instead of {“h_plus”: hp, “h_cross”: hc}, this method returns a dict in the form of {m: {“h_plus”: hp_m, “h_cross”: hc_m} for m in [-l_max,…,0,…,l_max]}. Each key m contains the contribution to the polarization that transforms according to exp(-1j * m * phase) under phase transformations (due to the spherical harmonics).
- Note:
pol_m[m] contains contributions of the m modes and the -m modes. This is because the frequency domain (FD) modes have a positive frequency part which transforms as exp(-1j * m * phase), while the negative frequency part transforms as exp(+1j * m * phase). Typically, one of these dominates [e.g., the (2,2) mode is dominated by the negative frequency part and the (2,-2) mode is dominated by the positive frequency part] such that the sum of (l,|m|) and (l,-|m|) modes transforms approximately as exp(1j * |m| * phase), which is e.g. used for phase marginalization in bilby/lalinference. However, this is not exact. In this method we account for this effect, such that each contribution pol_m[m] transforms exactly as exp(-1j * m * phase).
Phase shifts contribute in two ways: Firstly via the spherical harmonics, which we account for with the exp(-1j * m * phase) transformation. Secondly, the phase determines how the PE spins transform to cartesian spins, by rotating (sx,sy) by phase. This is not accounted for in this function. Instead, the phase for computing the cartesian spins is fixed to self.spin_conversion_phase (if not None). This effectively changes the PE parameters {phi_jl, phi_12} to parameters {phi_jl_prime, phi_12_prime}. For parameter estimation, a postprocessing operation can be applied to account for this, {phi_jl_prime, phi_12_prime} -> {phi_jl, phi_12}. See also documentation of __init__ method for more information on self.spin_conversion_phase.
Differences to self.generate_hplus_hcross: - We don’t catch errors yet TODO - We don’t apply transforms yet TODO
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
pol_m – Dictionary with contributions to h_plus and h_cross, sorted by their transformation behaviour under phase shifts: {m: {“h_plus”: hp_m, “h_cross”: hc_m} for m in [-l_max,…,0,…,l_max]} Each contribution h_m transforms as exp(-1j * m * phase) under phase shifts (for fixed self.spin_conversion_phase, see above).
- Return type:
dict
- setup_mode_array(mode_list: List[Tuple]) → Dict
Define a mode array to select waveform modes to include in the polarizations from a list of modes.
- Parameters:
mode_list (a list of (ell, m) modes) –
- Returns:
A lal parameter dictionary
- Return type:
lal_params
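As an illustration of the transformation property described above, here is a short sketch (hypothetical usage, with a made-up phase shift) of recombining the output of generate_hplus_hcross_m under an overall shift of the phase parameter:
import numpy as np

pol_m = wfg.generate_hplus_hcross_m(p)  # {m: {'h_plus': ..., 'h_cross': ...}}

delta_phase = 0.5  # hypothetical phase shift to apply
h_plus_shifted = sum(np.exp(-1j * m * delta_phase) * pol['h_plus']
                     for m, pol in pol_m.items())
Because the cartesian spins are computed with a fixed spin_conversion_phase, this recombination captures only the spherical-harmonic dependence on the phase, as discussed above.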
Waveform modes
Add later.
Building a waveform dataset
For training neural networks, the more training samples the better. With too little training data, one runs the risk of overfitting. Waveforms, however, can be expensive to generate and take up significant storage. Dingo adopts several strategies to mitigate these problems:
Dingo partitions parameters into two types—intrinsic and extrinsic—and builds a training set based only on the intrinsic parameters. This consists of waveform polarizations \(h_+\) and \(h_\times\). Extrinsic parameters are selected during training, and applied to generate the detector waveforms \(h_I\). This augments the training set to provide unlimited samples from the extrinsic parameters.
Saved waveforms are compressed using a singular value decomposition. Although this is lossy, waveform mismatches can be monitored to ensure that they fall below the intrinsic error in the waveform model.
The WaveformDataset
class
The WaveformDataset
is a storage container for waveform polarizations and parameters, which can be used to serve samples to a neural network during training:
- class dingo.gw.dataset.WaveformDataset(file_name=None, dictionary=None, transform=None, precision=None, domain_update=None, svd_size_update=None)
Bases:
DingoDataset
,Dataset
This class stores a dataset of waveforms (polarizations) and corresponding parameters.
It can load the dataset either from an HDF5 file or from a suitable dictionary.
Once a waveform data set is in memory, the waveform data are consumed through a __getitem__() call, optionally applying a chain of transformations, which are classes that implement a __call__() method.
For constructing, provide either file_name, or dictionary containing data and settings entries, or neither.
- Parameters:
file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The dictionary keys should be ‘settings’, ‘parameters’, and ‘polarizations’.
transform (Transform) – Transform to be applied to dataset samples when accessed through __getitem__
precision (str ('single', 'double')) – If provided, changes precision of loaded dataset.
domain_update (dict) – If provided, update domain from existing domain using new settings.
svd_size_update (int) – If provided, reduces the SVD size when decompressing (for speed).
- initialize_decompression(svd_size_update: int | None = None)
Sets up decompression transforms. These are applied to the raw dataset before self.transform. E.g., SVD decompression.
- Parameters:
svd_size_update (int) – If provided, reduces the SVD size when decompressing (for speed).
- load_supplemental(domain_update=None, svd_size_update=None)
Method called immediately after loading a dataset.
Creates (and possibly updates) domain, updates dtypes, and initializes any decompression transform. Also zeros data below f_min, and truncates above f_max.
- Parameters:
domain_update (dict) – If provided, update domain from existing domain using new settings.
svd_size_update (int) – If provided, reduces the SVD size when decompressing (for speed).
- update_domain(domain_update: dict | None = None)
Update the domain based on new configuration.
The waveform dataset provides waveform polarizations in a particular domain. In Frequency domain, this is [0, domain._f_max]. Furthermore, data is set to 0 below domain._f_min. In practice one may want to train a network based on slightly different domain settings, which corresponds to truncating the likelihood integral.
This method provides functionality for that. It truncates and/or zeroes the dataset to the range specified by the domain, by calling domain.update_data.
- Parameters:
domain_update (dict) – Settings dictionary. Must contain a subset of the keys contained in domain_dict.
WaveformDataset
subclasses dingo.core.dataset.DingoDataset
and torch.utils.data.Dataset
. The former provides generic functionality for saving and loading datasets as HDF5 files and dictionaries, and is used in several components of Dingo. The latter allows the WaveformDataset
to be used with a PyTorch DataLoader
. In general, we follow the PyTorch design framework for training, including Datasets, DataLoaders, and Transforms.
Generating a simple dataset
As described above, the WaveformDataset
class is just a container, and does not generate the contents itself. Dataset generation is instead carried out using functions in the dingo.gw.dataset.generate_dataset
module. Although in practice, datasets are likely to be generated from a settings file using the command line interface, here we describe how to generate one interactively.
A dataset is based on an intrinsic prior and a waveform generator, so we build these as described here.
from dingo.gw.waveform_generator import WaveformGenerator
from bilby.core.prior import PriorDict
from dingo.gw.prior import default_intrinsic_dict
from dingo.gw.domains import FrequencyDomain
domain = FrequencyDomain(f_min=20.0, f_max=1024.0, delta_f=0.125)
wfg = WaveformGenerator(approximant='IMRPhenomXPHM', domain=domain, f_ref=20.0)
prior = PriorDict(default_intrinsic_dict)
Setting spin_conversion_phase = None. Using phase parameter for conversion to cartesian spins.
We can use the following function to generate sets of parameters and associated waveforms:
from dingo.gw.dataset.generate_dataset import generate_parameters_and_polarizations
parameters, polarizations = generate_parameters_and_polarizations(wfg,
prior,
num_samples=100,
num_processes=1)
Generating dataset of size 100
parameters
 | mass_ratio | chirp_mass | luminosity_distance | theta_jn | phase | a_1 | a_2 | tilt_1 | tilt_2 | phi_12 | phi_jl | geocent_time
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.302823 | 99.704171 | 1000.0 | 1.045181 | 1.956625 | 0.624675 | 0.864143 | 2.053759 | 0.897406 | 4.995902 | 1.026707 | 0.0 |
1 | 0.208840 | 62.787584 | 1000.0 | 1.147821 | 1.231156 | 0.523501 | 0.457318 | 1.689699 | 2.109647 | 1.568618 | 0.603215 | 0.0 |
2 | 0.167807 | 75.470319 | 1000.0 | 2.248237 | 1.657892 | 0.084256 | 0.106429 | 1.648166 | 2.215986 | 0.030520 | 5.517168 | 0.0 |
3 | 0.885713 | 71.122514 | 1000.0 | 2.500974 | 3.249277 | 0.151447 | 0.338559 | 0.801100 | 1.155492 | 3.671084 | 3.230101 | 0.0 |
4 | 0.602687 | 78.836732 | 1000.0 | 2.421702 | 3.511744 | 0.920802 | 0.570441 | 2.646145 | 0.735851 | 3.358571 | 0.196351 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
95 | 0.710777 | 87.396051 | 1000.0 | 2.460913 | 2.277778 | 0.310158 | 0.221775 | 2.112918 | 0.499147 | 1.131839 | 3.146899 | 0.0 |
96 | 0.540317 | 50.685929 | 1000.0 | 1.736872 | 2.947742 | 0.105328 | 0.923134 | 1.103193 | 1.681197 | 2.727364 | 1.440938 | 0.0 |
97 | 0.295335 | 56.577308 | 1000.0 | 2.340868 | 1.890975 | 0.720159 | 0.634283 | 2.004960 | 1.563219 | 3.219694 | 2.781885 | 0.0 |
98 | 0.390020 | 94.408416 | 1000.0 | 1.719763 | 0.986305 | 0.968500 | 0.747774 | 1.329375 | 1.642715 | 3.340588 | 2.788269 | 0.0 |
99 | 0.286339 | 69.640219 | 1000.0 | 2.093173 | 3.837138 | 0.229537 | 0.666358 | 1.686112 | 0.907684 | 3.178193 | 5.533587 | 0.0 |
100 rows × 12 columns
polarizations
{'h_plus': array([[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
...,
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j]]),
'h_cross': array([[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
...,
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j]])}
We can then put these in a WaveformDataset
,
from dingo.gw.dataset import WaveformDataset
dataset_dict = {'parameters': parameters, 'polarizations':polarizations}
wfd = WaveformDataset(dictionary=dataset_dict)
Samples can then be easily indexed,
wfd[0]
{'parameters': {'mass_ratio': 0.3028225394903101,
'chirp_mass': 99.70417093459808,
'luminosity_distance': 1000.0,
'theta_jn': 1.0451812800940419,
'phase': 1.956625056491646,
'a_1': 0.6246749028641134,
'a_2': 0.8641430543194487,
'tilt_1': 2.053758607319084,
'tilt_2': 0.8974055366142486,
'phi_12': 4.995902277732458,
'phi_jl': 1.0267065217222517,
'geocent_time': 0.0},
'waveform': {'h_plus': array([0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j]),
'h_cross': array([0.+0.j, 0.+0.j, 0.+0.j, ..., 0.+0.j, 0.+0.j, 0.+0.j])}}
Note
The sample is represented as a nested dictionary. This is a standard format for Dingo.
Automated dataset construction
The simple dataset constructed above is useful for illustrative purposes, but it lacks several important features:
Waveforms are not compressed. A dataset with many samples would therefore take up enormous storage space.
Not reproducible. The dataset contains no metadata describing its construction (e.g., waveform approximant, domain, prior, …).
The generate_dataset
function addresses these shortcomings automatically:
- dingo.gw.dataset.generate_dataset.generate_dataset(settings: Dict, num_processes: int) → WaveformDataset
Generate a waveform dataset.
- Parameters:
settings (dict) – Dictionary of settings to configure the dataset
num_processes (int) –
- Return type:
A WaveformDataset based on the settings.
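For interactive use, a minimal sketch might look as follows (the file name is hypothetical; the settings dictionary follows the configuration format shown below):
import yaml
from dingo.gw.dataset.generate_dataset import generate_dataset

# Load a settings dictionary structured like the configuration example below.
with open('waveform_dataset_settings.yaml', 'r') as f:
    settings = yaml.safe_load(f)

dataset = generate_dataset(settings, num_processes=4)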
This function is in turn wrapped by the command-line functions dingo_generate_dataset
and dingo_generate_dataset_dag
. These take a .yaml
file with the same contents as the settings dictionary.
Configuration
A typical settings dictionary / .yaml
config file takes the following form, described in detail below:
domain:
type: FrequencyDomain
f_min: 20.0
f_max: 1024.0
delta_f: 0.125
waveform_generator:
approximant: IMRPhenomXPHM
f_ref: 20.0
# f_start: 15.0 # Optional setting useful for EOB waveforms. Overrides f_min when generating waveforms.
# new_interface: true # Optional setting for employing new waveform interface. This is needed for SEOBNRv5 approximants, and optional for standard LAL approximants.
spin_conversion_phase: 0.0
# Dataset only samples over intrinsic parameters. Extrinsic parameters are chosen at train time.
intrinsic_prior:
mass_1: bilby.core.prior.Constraint(minimum=10.0, maximum=80.0)
mass_2: bilby.core.prior.Constraint(minimum=10.0, maximum=80.0)
chirp_mass: bilby.gw.prior.UniformInComponentsChirpMass(minimum=25.0, maximum=100.0)
mass_ratio: bilby.gw.prior.UniformInComponentsMassRatio(minimum=0.125, maximum=1.0)
phase: default
a_1: bilby.core.prior.Uniform(minimum=0.0, maximum=0.99)
a_2: bilby.core.prior.Uniform(minimum=0.0, maximum=0.99)
tilt_1: default
tilt_2: default
phi_12: default
phi_jl: default
theta_jn: default
# Reference values for fixed (extrinsic) parameters. These are needed to generate a waveform.
luminosity_distance: 100.0 # Mpc
geocent_time: 0.0 # s
# Dataset size
num_samples: 5000000
# Save a compressed representation of the dataset
compression:
svd:
# Truncate the SVD basis at this size. No truncation if zero.
size: 200
num_training_samples: 50000
num_validation_samples: 10000
whitening: aLIGO_ZERO_DET_high_P_asd.txt
- domain
Specifies the data domain. Currently only FrequencyDomain is implemented.
- waveform_generator
Choose the approximant and reference frequency. For EOB models that require time integration, it is usually necessary to specify a lower starting frequency. In this case, f_ref is ignored.
- spin_conversion_phase (optional)
Value for phiRef when converting PE spins to Cartesian spins via bilby_to_lalsimulation_spins. When set to None (default), this uses the phase parameter. When set to 0.0, phase only refers to the azimuthal observation angle, allowing it to be treated as an extrinsic parameter.
Important
It is necessary to set this to 0.0 if planning to train a phase-marginalized network, and then reconstruct the phase synthetically.
- intrinsic_prior
Specify the prior over intrinsic parameters. Intrinsic parameters here refer to those parameters that are needed to generate waveform polarizations. Extrinsic parameters here refer to those parameters that can be sampled and applied rapidly during training. As shown in the example, it is also possible to specify default priors, which is convenient for certain parameters. These are listed in dingo.gw.prior.default_intrinsic_dict.
Intrinsic parameters obviously include masses and spins, but also inclination, reference phase, luminosity distance, and time of coalescence at geocenter. Although inclination and phase are often considered extrinsic parameters, they are needed to generate waveform polarizations and cannot be easily transformed.
Luminosity distance and time of coalescence are considered as both intrinsic and extrinsic. Indeed, they are needed to generate polarizations, but they can also be easily transformed during training to augment the dataset. We therefore fix them to fiducial values for generating polarizations.
- num_samples
The number of samples to include in the dataset. For a production model, we typically use \(5 \times 10^6\) samples.
- compression (optional)
How to compress the dataset.
- svd (optional)
Construct an SVD basis based on a specified number of additional samples. Save the main dataset in terms of its SVD basis coefficients. The number of elements in the basis is specified by the size setting. The performance of the basis is also evaluated in terms of the mismatch against a number of validation samples. All of the validation information, as well as the basis itself, is saved along with the waveform dataset.
- whitening (optional)
Whether to save whitened waveforms, and in particular, whether to construct the basis based on whitened waveforms. The basis will be more efficient if whitening is used to adapt it to the detector noise characteristics. To use whitening, simply specify the desired ASD to use, from the Bilby list of ASDs. Note that the whitening is used only for the internal storage of the dataset. When accessing samples from the dataset, they will be unwhitened.
Dataset compression is implemented internally by setting the WaveformGenerator.transform operator, so that elements are compressed immediately after generation (avoiding the need to store many uncompressed waveforms in memory). Likewise, decompression is implemented by setting the WaveformDataset.decompression_transform operator to apply the inverse transformation. This will act on samples to decompress them when accessed through WaveformDataset.__getitem__().
Important
The automated dataset constructors store the configuration settings in WaveformDataset.settings
. This is so that the settings can be accessed by downstream tasks, and for reference.
Command-line interface
In most cases the command-line interface will be used to generate a dataset. Given a settings file, one can call
dingo_generate_dataset --settings_file settings.yaml
--num_processes N
--out_file waveform_dataset.hdf5
This will generate a dataset following the configuration in settings.yaml
and save it as waveform_dataset.hdf5
, using N
processes.
To inspect the dataset (or any other Dingo-generated file) use
dingo_ls waveform_dataset.hdf5
This will print the configuration settings, as well as a summary of the SVD compression performance (if available).
For larger datasets, or those based on slower waveform models, Dingo includes a script that builds a condor DAG, dingo_generate_dataset_dag
. This splits the generation of waveforms across several nodes, and then reconstitutes the final dataset.
Data pre-processing
A sample from a WaveformDataset
consists of labeled waveform polarizations \((\theta_{\text{intrinsic}}, (h_+,h_\times))\), represented as a nested dictionary. This must be transformed into noisy detector data \(d_I\) (with additional noise context data) in a form suitable for input to a neural network. Dingo accomplishes this by applying a sequence of transforms to the sample.
A transform is simply a class with a __call__()
method, which takes a sample as input and returns a transformed sample. A sequence of transforms can then be composed to build a more complex transform in a modular way. Dingo's training transform sequence is stored as WaveformDataset.transform
, and is applied automatically when elements are accessed through indexing.
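For illustration, a minimal transform might look as follows. This is a hypothetical toy example, not one of Dingo's built-in transforms, but it shows the expected interface:
import copy

class RescaleDistance:
    """Toy transform: rescale the luminosity distance in a sample dictionary."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, sample):
        sample = copy.deepcopy(sample)  # avoid modifying the input in place
        sample['parameters']['luminosity_distance'] *= self.factor
        return sample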
GW transform sequence
For Dingo, the flowchart below indicates the sequence of transforms applied to a sample from a WaveformDataset
.
Flowchart for Dingo data-preprocessing pipeline for training, starting from a sample from a WaveformDataset
. Transforms with rounded corners include an element of randomness, whereas trapezoidal items are deterministic.
Important
Some pre-processing transforms include an element of randomness. This serves to augment the training data and reduce overfitting.
Extrinsic parameters
The starting point for this chain of transforms is a sample sample
with parameters
and polarizations
sub-dictionaries. The first transform samples the extrinsic parameters, and adds a new sub-dictionary extrinsic_parameters
to sample
. Extrinsic parameters include sky position (right ascension, declination), polarization, time of coalescence, and luminosity distance (the latter two of which are also considered intrinsic parameters).
- class dingo.gw.transforms.SampleExtrinsicParameters(extrinsic_prior_dict)
Sample extrinsic parameters and add them to sample in a separate dictionary.
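A sketch of applying this transform to the interactive sample from earlier (this assumes the transform accepts the same prior specification strings used in the extrinsic_prior training settings shown further below):
from dingo.gw.transforms import SampleExtrinsicParameters

# Hypothetical extrinsic prior specification, mirroring the training settings example.
extrinsic_prior_dict = {
    'dec': 'default',
    'ra': 'default',
    'geocent_time': 'bilby.core.prior.Uniform(minimum=-0.10, maximum=0.10)',
    'psi': 'default',
    'luminosity_distance': 'bilby.core.prior.Uniform(minimum=100.0, maximum=1000.0)',
}
sample = {'parameters': p, 'polarizations': h}
sample = SampleExtrinsicParameters(extrinsic_prior_dict)(sample)
sample['extrinsic_parameters']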
Detector waveforms
The next sequence of transforms applies the extrinsic parameters to sample["polarizations"]
to produce detector waveforms in sample["waveform"]
. First it calculates the arrival time \(t_I\) of the waveform in each detector, based on the time of coalescence at geocenter and the sky position, and stores this in sample["extrinsic_parameters"]
,
- class dingo.gw.transforms.GetDetectorTimes(ifo_list, ref_time)
Compute the time shifts in the individual detectors based on the sky position (ra, dec), the geocent_time and the ref_time.
Important
Dingo models are trained for a fixed set of detectors. This must be selected prior to training, and a new model must be trained if one wishes to analyze data in a different set of detectors. Thus, e.g., separate models must be trained for HL and HLV configurations.
Note
During training, Dingo fixes the orientation of the Earth (and corresponding interferometer positions and orientations) to that at a fixed reference time ref_time
. This is so that the model does not have to learn about the rotation of the Earth. This is corrected in post-processing by shifting the inferred right ascension by the difference between the true and reference sidereal times.
Optionally, the times \(t_I\) are perturbed to give new “proxy times” as part of the GNPE algorithm.
- class dingo.gw.transforms.GNPECoalescenceTimes(ifo_list, kernel, exact_global_equivariance=True, inference=False)
GNPE [1] Transformation for detector coalescence times.
For each of the detector coalescence times, a proxy is generated by adding a perturbation epsilon from the GNPE kernel to the true detector time. This proxy is subtracted from the detector time, such that the overall time shift only amounts to -epsilon in training. This standardizes the input data to the inference network, since the applied time shifts are always restricted to the range of the kernel.
To preserve information at inference time, conditioning of the inference network on the proxies is required. To that end, the proxies are stored in sample['gnpe_proxies'].
We can enforce an exact equivariance under global time translations, by subtracting one proxy (by convention: the first one, usually for H1 ifo) from all other proxies, and from the geocent time, see [1]. This is enabled with the flag exact_global_equivariance.
Note that this transform does not modify the data itself. It only determines the amount by which to time-shift the data.
[1]: arxiv.org/abs/2111.13139
- Parameters:
ifo_list (bilby.gw.detector.InterferometerList) – List of interferometers.
kernel (str) – Defines a Bilby prior, to be used for all interferometers.
exact_global_equivariance (bool = True) – Whether to impose the exact global time translation symmetry.
inference (bool = False) – Whether to use inference or training mode.
Finally, the detector waveforms \(h_I\) are calculated from the extrinsic parameters. (In the backend, these transforms use the Bilby interferometer libraries.) The contents of the extrinsic_parameters
sub-dictionary are then moved into sample["parameters"]
; this was essentially a holding place for parameters not yet applied to the waveform.
- class dingo.gw.transforms.ProjectOntoDetectors(ifo_list, domain, ref_time)
Project the GW polarizations onto the detectors in ifo_list. This does not sample any new parameters, but relies on the parameters provided in sample[‘extrinsic_parameters’]. Specifically, this transform applies the following operations:
Rescale polarizations to account for sampled luminosity distance
Project polarizations onto the antenna patterns using the ref_time and the extrinsic parameters (ra, dec, psi)
Time shift the strains in the individual detectors according to the times <ifo.name>_time provided in the extrinsic parameters.
Noise
Once the detector waveforms have been obtained, noise \(n_I\) must be added to simulate realistic data. First, noise ASDs are selected randomly for each detector from an ASDDataset
for the relevant observing run. This is stored in sample["asds"]
. For details see ASD dataset.
- class dingo.gw.transforms.SampleNoiseASD(asd_dataset)
Sample a random ASD for each detector and add them to sample['asds'].
The waveform is then whitened based on the PSD, and furthermore divided by the standard deviation of white noise. This is so that each input to the network will have unit variance, which is important for successful training.
- class dingo.gw.transforms.WhitenAndScaleStrain(scale_factor)
Whiten the strain data by dividing w.r.t. the corresponding asds, and scale it with 1/scale_factor.
In uniform frequency domain the scale factor should be np.sqrt(window_factor) / np.sqrt(4.0 * delta_f). It has two purposes:
(*) the denominator accounts for frequency binning
(*) dividing by window factor accounts for windowing of strain data
For whitened waveforms, noise is white, so finally this is randomly sampled and added to sample["waveform"]
.
- class dingo.gw.transforms.AddWhiteNoiseComplex
Adds white noise with a standard deviation determined by self.scale to the complex strain data.
Output
The final set of transforms prepares the sample for input to the neural network. First, the desired inference parameters are selected. By taking only a subset of parameters
, one can train a marginalized posterior model. These parameters are also standardized to have zero mean and unit variance to improve training. (Standardization will be undone in post-processing after inference.) The parameters will then be repackaged into a numpy.ndarray
, so that parameter labels are implicit based on ordering.
- class dingo.gw.transforms.SelectStandardizeRepackageParameters(parameters_dict, standardization_dict, inverse=False, as_type=None, device='cpu')
This transformation selects the parameters in standardization_dict, normalizes them by setting p = (p - mean) / std, and repackages the selected parameters to a numpy array.
- as_type: str = None
Only applies if self.inverse == True:
* if None, data type is kept
* if 'dict', use a dict
* if 'pandas', use pandas.DataFrame
The waveform
and asds
dictionaries are also repackaged into a single array of shape suitable for input to the network. In particular, the complex frequency domain strain data are decomposed into real and imaginary parts.
- class dingo.gw.transforms.RepackageStrainsAndASDS(ifos, first_index=0)
Repackage the strains and the asds into an [num_ifos, 3, num_bins] dimensional tensor. Order of ifos is provided by self.ifos. By convention, [:,i,:] is used for:
i = 0: strain.real
i = 1: strain.imag
i = 2: 1 / (asd * 1e23)
Finally, the samples
dictionary of arrays is unpacked to a tuple of arrays for parameters and data.
- class dingo.gw.transforms.UnpackDict(selected_keys)
Unpacks the dictionary to prepare it for final output of the dataloader. Only returns elements specified in selected_keys.
When used with a torch DataLoader
, the final numpy arrays are automatically transformed into torch tensors.
Building the transforms
The following function will set the transform
property of a WaveformDataset
to the above transform sequence:
- dingo.gw.training.set_train_transforms(wfd, data_settings, asd_dataset_path, omit_transforms=None)
Set the transform attribute of a waveform dataset based on a settings dictionary. The transform takes waveform polarizations, samples random extrinsic parameters, projects to detectors, adds noise, and formats the data for input to the neural network. It also implements optional GNPE transformations.
Note that the WaveformDataset is modified in-place, so this function returns nothing.
- Parameters:
wfd (WaveformDataset) –
data_settings (dict) –
asd_dataset_path (str) – Path corresponding to the ASD dataset used to generate noise.
omit_transforms – List of sub-transforms to omit from the full composition.
The various options are specified by passing an appropriate data_settings
dictionary. In practice, these settings will be specified along with other training settings.
data_settings
dictionary for configuring a sequence of training transforms. This dictionary includes several options not needed for set_train_transforms, but which are needed as part of other training settings.
waveform_dataset_path: /path/to/waveform_dataset.hdf5 # Contains intrinsic waveforms
train_fraction: 0.95
window: # Needed to calculate window factor for simulated data
type: tukey
f_s: 4096
T: 8.0
roll_off: 0.4
domain_update:
f_min: 20.0
f_max: 1024.0
svd_size_update: 200 # Optionally, reduce the SVD size when decompressing (for performance)
detectors:
- H1
- L1
extrinsic_prior: # Sampled at train time
dec: default
ra: default
geocent_time: bilby.core.prior.Uniform(minimum=-0.10, maximum=0.10)
psi: default
luminosity_distance: bilby.core.prior.Uniform(minimum=100.0, maximum=1000.0)
ref_time: 1126259462.391
gnpe_time_shifts:
kernel: bilby.core.prior.Uniform(minimum=-0.001, maximum=0.001)
exact_equiv: True
inference_parameters: default
- waveform_dataset_path
Points to the waveform dataset.
- train_fraction
Fraction of waveform dataset to be used for training. The remainder are used to compute the test loss.
- window
Specifies the window function to use when FFTing the time-domain data. It is used here to calculate a window factor for simulating data. See the discussion here.
- domain_update (optional)
Optionally specify new domain properties. These will update the domain associated with the WaveformDataset. They must describe a domain contained within the original.
- svd_size_update (optional)
If the WaveformDataset uses SVD compression, optionally use a smaller number of basis elements than stored in the dataset. Decompression of the waveforms is the slowest preprocessing operation, so using this option can improve training speed at the expense of accuracy.
- detectors
Set the desired GW interferometers for the Dingo model.
- extrinsic_prior
Specify the extrinsic prior. Default options are available.
- ref_time
Reference time for the interferometer locations and orientations. See the important note above.
- gnpe_time_shifts (optional)
GNPE kernel and additional options. See GNPE.
- inference_parameters
Parameters to infer with the model. At present they must be a subset of sample["parameters"]. By specifying a strict subset, this can be used to marginalize over parameters. The default setting points to dingo.gw.prior.default_inference_parameters:
from dingo.gw.prior import default_inference_parameters
default_inference_parameters
['chirp_mass',
'mass_ratio',
'phase',
'a_1',
'a_2',
'tilt_1',
'tilt_2',
'phi_12',
'phi_jl',
'theta_jn',
'luminosity_distance',
'geocent_time',
'ra',
'dec',
'psi']
Detector noise
During training, simulated noise \(n_I\) is added to waveforms \(h_I(\theta)\) measured in detectors to produce realistic simulated data,

\[d_I = h_I(\theta) + n_I.\]

Dingo assumes this noise to be stationary and Gaussian, so that it is independent in each frequency bin, with variance given by the power spectral density (PSD).
Important
Similar to extrinsic parameters, detector noise is repeatedly sampled during training and added to the simulated signal. This augments the training set with new noise realizations for each epoch, reducing overfitting.
Although noise is mostly stationary and Gaussian during an LVK observing run, the PSD in each detector does tend to drift from event to event. In a usual likelihood-based PE run, this is taken into account by estimating the PSD at the time of the event (either using Welch's method on signal-free data surrounding the event, or at the same time as the event using BayesWave), and using this in the likelihood integral.
Dingo also estimates the PSD just prior to an event and uses this at inference time in two ways:
It whitens the data with respect to this PSD.
It provides the PSD (or rather, the inverse ASD) as context to the neural network.
A suitably trained model can therefore make use of the PSD as needed to generate the posterior.
ASD dataset
To train a model to perform inference conditioned on the noise PSD, it is necessary to not just sample random noise realizations for a given PSD, but also sample the PSD from a distribution for a given observing run. Training in this way is necessary to perform fully amortized inference and account for the variation of PSDs from event to event.
The ASDDataset
class stores a set of ASD samples for several detectors, allowing for sampling during training.
As with the noise realizations, a random ASD is chosen from the dataset when preparing each sample during training. This augments the training set compared to fixing the noise ASD for each sample prior to training.
Similarly to the WaveformDataset
, the ASDDataset
is just a container. Dingo includes routines for building such a dataset from observational data.
Generating an ASDDataset
dingo_generate_asd_dataset
The basic approach is as follows:
Identify stretches of data within an observing run meeting certain criteria (sufficiently long, without events, and sufficiently high quality, …) or take in user-specified stretches.
Fetch data corresponding to these stretches, either from GWOSC or from channels optionally specified in the settings file.
Estimate ASDs using Welch’s method on these stretches.
Save the collection of ASDs.
usage: dingo_generate_asd_dataset [-h] --data_dir DATA_DIR [--settings_file SETTINGS_FILE] [--time_segments_file TIME_SEGMENTS_FILE] [--out_name OUT_NAME] [--verbose]
Generate an ASD dataset based on a settings file.
optional arguments:
-h, --help show this help message and exit
--data_dir DATA_DIR Path where the PSD data is to be stored. Must contain a 'settings.yaml' file.
--settings_file SETTINGS_FILE
Path to a settings file in case two different datasets are generated in the same directory
--time_segments_file TIME_SEGMENTS_FILE
Optional file containing a dictionary of a list of time segments that should be used for estimating PSDs. This has to be a pickle file.
--out_name OUT_NAME Path to resulting ASD dataset
--verbose
where the settings file is of the form
dataset_settings:
f_min: 0
f_max: 2048
f_s: 4096
time_psd: 1024
T: 8
time_gap: 0
window:
roll_off: 0.4
type: tukey
num_psds_max: 20
channels:
H1: H1:DCS-CALIB_STRAIN_C02
L1: L1:DCS-CALIB_STRAIN_C02
detectors:
- H1
- L1
observing_run: O2
condor:
env_path: path/to/environment
num_jobs: 2 # per detector
num_cpus: 16
memory_cpus: 16000
Options correspond to the following:
- f_min, f_max (optional)
Lower and upper frequency range of the ASDs. Defaults to 0 and f_s/2, respectively.
- Sampling rate f_s (Hz)
This should be at least twice the value of f_max expected to be used.
- Data length time_psd (s)
The entire length of data from which to estimate a PSD using Welch's method. Periodograms are calculated on segments of this, and then averaged using the median method.
- Segment length T (s)
The length of each segment on which to take the DFT and calculate a periodogram.
- Gap time_gap (s)
Gap between duration-T segments. E.g., if time_psd=1024, T=8, time_gap=8, then for each PSD, 64 periodograms are computed, each using data stretches 8 s long, with gaps of 8 s between segments. Segments would then be \([0~\text{s}, 8~\text{s}], [16~\text{s}, 24~\text{s}], \ldots\).
- Window function
Parameters of the window function used before taking the DFT of data segments.
- num_psds_max (optional)
If set, stop building the dataset after this number of PSDs have been estimated. This setting is useful for building a single-PSD dataset for pretraining a network.
- Channels (optional)
If set, data will be fetched from these channels, instead of using GWOSC.
- Detectors
Which detectors (H1, L1, V1, …) to include in the dataset.
- Observing run
Which observing run to use when estimating PSDs.
- Condor (optional)
Settings for HTCondor useful for parallelizing the ASD estimation across condor jobs.
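A typical invocation might then look as follows (directory and file names are hypothetical; the data directory must contain the settings.yaml shown above):
dingo_generate_asd_dataset --data_dir asd_dataset_O2 --out_name asd_dataset_O2.hdf5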
dingo_generate_synthetic_asd_dataset
This method generates a dataset of synthetic ASDs from a dataset of existing ASDs to enhance robustness against ASD distribution shifts. In particular, this makes it possible to generate a dataset of synthetic ASDs that are scaled by a fiducial ASD in order to adapt to a new observing run. This is particularly useful for training Dingo networks at the beginning of an observing run, when the number of training ASDs is limited. It also allows one to generate smoother synthetic ASDs that more closely resemble those from BayesWave. The implementation follows the steps explained in this paper.
usage: dingo_generate_synthetic_asd_dataset [-h] --asd_dataset ASD_DATASET --settings_file SETTINGS_FILE [--num_processes NUM_PROCESSES] [--out_file OUT_FILE] [--verbose]
Generate a synthetic noise ASD dataset from an existing dataset of real ASDs.
optional arguments:
-h, --help show this help message and exit
--asd_dataset ASD_DATASET
Path to existing ASD dataset to be parameterized and re-sampled
--settings_file SETTINGS_FILE
YAML file containing database settings
--num_processes NUM_PROCESSES
Number of processes to use in pool for parallel parameterization
--out_file OUT_FILE Name of file for storing dataset.
--verbose
with a settings file of the form
parameterization_settings:
num_spline_positions: 30
num_spectral_segments: 400
sigma: 0.14
delta_f: -1
smoothen: True
sampling_settings:
bandwidth_spectral: 0.5
bandwidth_spline: 0.25
num_samples: 500
split_frequencies:
- 30
- 100
rescaling_psd_paths:
H1: /path/to/rescaling_asd_H1.hdf5
L1: /path/to/rescaling_asd_L1.hdf5
Options correspond to the following:
num_spline_positions
Number of nodes to use for the cubic spline interpolating the broad-band noise PSD.
num_spectral_segments
Maximum number of spectral lines to model.
sigma
Standard deviation of the Normal distribution parameterizing \(p(\log S_n|z)\).
delta_f
If > 0, truncates each spectral line.
smoothen
Whether to save the smooth ASDs (True) or the noisy ASDs (False). The noisy synthetic ASDs resemble real ASDs estimated with Welch’s method more closely, while the smooth ASDs are more similar to ASDs generated with BayesWave. (Default: False)
bandwidth_spectral, bandwidth_spline
Bandwidths for the KDEs modeling the distribution over spectral lines and broad-band noise, respectively. These determine the width of the resulting distribution.
num_samples
Number of synthetic ASDs to generate.
split_frequencies
(Set of) frequencies dividing the broad-band noise into independent segments, e.g. due to different dominant noise sources (shot noise, seismic noise, etc.).
rescaling_psd_paths
Paths to ASD datasets for each detector to which the synthetic ASDs should be rescaled, e.g. the PSDs from the target observing run. If the dataset contains multiple ASDs, we use the first one. (Optional; if not provided, no rescaling will be done.)
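A typical invocation might look as follows (file names are hypothetical):
dingo_generate_synthetic_asd_dataset --asd_dataset asd_dataset_O2.hdf5 --settings_file synthetic_asd_settings.yaml --num_processes 8 --out_file synthetic_asd_dataset.hdf5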
Data conditioning
Importantly, the variance of white noise in each frequency bin is not 1, but rather

\[\sigma^2 = \frac{w}{4\, \delta f},\]

where \(\delta f\) is the frequency resolution and \(w\) is a "window factor".
The denominator in the noise variance is seen to arise most easily in the noise-weighted inner product,

\[(a | b) = 4\, \mathrm{Re} \sum_{k} \frac{a^*(f_k)\, b(f_k)}{S_n(f_k)}\, \delta f.\]
The window factor comes in because a window must be applied to time series data prior to taking the FFT. The windowing is assumed to reduce the power in the noise, but not affect the signal (which is localized away from the edge of the data segment). To simulate this, we add noise with variance scaled by the window factor.
The noise standard deviation is stored in the property FrequencyDomain.noise_std
. The window factor is calculated from the data conditioning settings specified in the train settings file.
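As a sketch of how these quantities follow from the example data conditioning settings (T = 8.0 s, f_s = 4096 Hz, roll_off = 0.4 s), assuming the window factor is defined as the mean of the squared window and the standard Tukey-alpha relation alpha = 2 * roll_off / T:
import numpy as np
from scipy.signal.windows import tukey

T, f_s, roll_off = 8.0, 4096, 0.4
delta_f = 1 / T

window = tukey(int(T * f_s), alpha=2 * roll_off / T)  # Tukey window with the given roll-off
window_factor = np.sum(window ** 2) / len(window)     # mean squared window (assumption)
noise_std = np.sqrt(window_factor) / np.sqrt(4.0 * delta_f)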
Neural network architecture
Dingo is based on a method called neural posterior estimation; see here for an introduction. A central object is the conditional neural density estimator, a deep neural network trained to represent the Bayesian posterior. This section describes the neural network architecture developed in [3], and subsequently used in [4], [5] and [6]. Note that Dingo can easily be extended to different architectures.
Neural spline flow with SVD compression
The architecture consists of two components: the embedding network, which compresses the high-dimensional data to a lower-dimensional feature vector, and the conditional normalizing flow, which estimates the Bayesian posterior based on this feature vector. Both components are trained jointly and end-to-end with the objective described here. The network can be built with
from dingo.core.nn.nsf import create_nsf_with_rb_projection_embedding_net
Embedding network
The embedding network compresses the high-dimensional conditioning information (consisting of frequency domain strain and PSD data). The first layer of this network is initialized with an SVD matrix from a reduced basis built with non-noisy waveforms. This projection filters out the noise that is orthogonal to the signal manifold, and significantly simplifies the task for the neural network.
The initial compression layer is followed by a sequence of residual blocks consisting of dense layers for further compression. Example kwargs:
embedding_net_kwargs = {
"input_dims": (2, 3, 8033),
"output_dim": 128,
"hidden_dims": [
1024, 1024, 1024, 1024, 1024, 1024, \
512, 512, 512, 512, 512, 512, \
256, 256, 256, 256, 256, 256, \
128, 128, 128, 128, 128, 128
],
"activation": "elu",
"dropout": 0.0,
"batch_norm": True,
"svd": {
"num_training_samples": 50000,
"num_validation_samples": 5000,
"size": 200,
}
}
Here, input_dims=(2, 3, 8033)
refers to the input dimension, for frequency domain data with 8033
frequency bins and 3
channels (real part, imaginary part, ASD) in 2
detectors. The embedding network compresses this to output_dim=128
components. The SVD initialization is controlled with the svd
argument, and the residual blocks are specified with hidden_dims
.
Note
Not all of these arguments have to be set in the configuration file when training dingo. For example, the input_dims
argument is automatically filled in based on the specified domain information and number of detectors. Similarly, the context_dim
of the flow (see below) is filled in based on the output_dim
of the embedding network and the number of GNPE proxies. See the Dingo examples for the corresponding configuration files and training commands.
Flow
We use the neural spline flow as a density estimator. This takes the output of the embedding network as context information and estimates the Bayesian posterior distribution. Example kwargs:
nsf_kwargs = {
"input_dim": 15,
"context_dim": 129,
"num_flow_steps": 30,
"base_transform_kwargs": {
"hidden_dim": 512,
"num_transform_blocks": 5,
"activation": "elu",
"dropout_probability": 0.0,
"batch_norm": True,
"num_bins": 8,
"base_transform_type": "rq-coupling",
},
}
This creates a neural spline flow with input_dim=15
parameters, conditioned on a 129
dimensional context vector, corresponding to the 128
dimensional output of the embedding network and one GNPE proxy variable. The neural spline flow consists of num_flow_steps=30
layers, for which the transformation is specified with base_transform_kwargs
.
nde = create_nsf_with_rb_projection_embedding_net(nsf_kwargs, embedding_net_kwargs)
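Since the network is assembled from standard PyTorch components, the returned object can be inspected with generic torch utilities. A sketch, assuming nde behaves as a torch.nn.Module:
# Count the trainable parameters of the density estimator.
num_parameters = sum(p.numel() for p in nde.parameters() if p.requires_grad)
print(f'Network has {num_parameters:,} trainable parameters')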
Training
Training a network can require a significant amount of time (for production models, typically a week with a fast GPU). We therefore expect that this will almost always be done non-interactively using a command-line script. Dingo offers two options, dingo_train
and dingo_train_condor
, depending on whether your GPU is local or cluster-based.
Both of these scripts take as main argument a settings file, which specifies options relating to Data pre-processing, training strategy, Neural network architecture, hardware, and checkpointing. They produce a trained model in PyTorch .pt
format, and they save checkpoints and the training history. The settings file is furthermore saved within the model files for reproducibility and to be able to resume training from a checkpoint. Finally, all precursor settings files (for the waveform or noise datasets) are also saved with the model.
Settings file
train_settings.yaml
file. This is also available in the examples/ folder. The specific settings listed will train a production-size network, taking about a week on an NVIDIA A100. Consider reducing some model hyperparameters for experimentation.
data:
waveform_dataset_path: /path/to/waveform_dataset.hdf5 # Contains intrinsic waveforms
train_fraction: 0.95
window:
type: tukey
f_s: 4096
T: 8.0
roll_off: 0.4
domain_update:
f_min: 20.0
f_max: 1024.0
svd_size_update: 200
detectors:
- H1
- L1
extrinsic_prior:
dec: default
ra: default
geocent_time: bilby.core.prior.Uniform(minimum=-0.10, maximum=0.10)
psi: default
luminosity_distance: bilby.core.prior.Uniform(minimum=100.0, maximum=1000.0)
ref_time: 1126259462.391
gnpe_time_shifts:
kernel: bilby.core.prior.Uniform(minimum=-0.001, maximum=0.001)
exact_equiv: True
inference_parameters: default
model:
type: nsf+embedding
nsf_kwargs:
num_flow_steps: 30
base_transform_kwargs:
hidden_dim: 512
num_transform_blocks: 5
activation: elu
dropout_probability: 0.0
batch_norm: True
num_bins: 8
base_transform_type: rq-coupling
embedding_net_kwargs:
output_dim: 128
hidden_dims: [1024, 1024, 1024, 1024, 1024, 1024,
512, 512, 512, 512, 512, 512,
256, 256, 256, 256, 256, 256,
128, 128, 128, 128, 128, 128]
activation: elu
dropout: 0.0
batch_norm: True
svd:
num_training_samples: 20000
num_validation_samples: 5000
size: 200
# Training is divided in stages. They each require all settings as indicated below.
training:
stage_0:
epochs: 300
asd_dataset_path: /path/to/asds_fiducial.hdf5
freeze_rb_layer: True
optimizer:
type: adam
lr: 0.0001
scheduler:
type: cosine
T_max: 300
batch_size: 64
stage_1:
epochs: 150
asd_dataset_path: /path/to/asds.hdf5
freeze_rb_layer: False
optimizer:
type: adam
lr: 0.00001
scheduler:
type: cosine
T_max: 150
batch_size: 64
# Local settings that have no impact on the final trained network.
local:
device: cpu # Change this to 'cuda' for training on a GPU.
num_workers: 6
# wandb:
# project: dingo
# group: O4
runtime_limits:
max_time_per_run: 36000
max_epochs_per_run: 500
checkpoint_epochs: 10
# condor:
# bid: 100
# num_cpus: 16
# memory_cpus: 128000
# num_gpus: 1
# memory_gpus: 8000
The train settings file is grouped into four sections:
data_settings
These settings point to a saved dataset of waveform polarizations and describe the transforms to obtain detector waveforms. A detailed description of these settings is available here.
model
This describes the model architecture, including network type and hyperparameters. All of these settings are described in the section on Neural network architecture.
training
This describes the training strategy. Training is divided into stages, each of which can differ to some extent. Stages are numbered (stage_0
, stage_1
, …) and executed in this order. Each stage is defined by the following settings:
- epochs
Total number of training epochs for the stage. The network sees the entire training set once per epoch.
- asd_dataset_path
Points to an
ASDDataset
file. Each stage can have its own ASD dataset, which is useful for implementing a pre-training stage with fixed ASD and a fine-tuning stage with variable ASD.- freeze_rb_layer
Whether to freeze the first layer of the embedding network in
nsf+embedding
models. This layer is seeded with reduced (SVD) basis vectors, so freezing this layer during pre-training simply projects data onto the basis coefficients. In the fine-tuning stage, when other weights are more stable, unfreezing this can be useful.- optimizer
Specify optimizer type and parameters such as initial learning rate.
- scheduler
Use a learning rate scheduler to reduce the learning rate over time. This can improve overall optimization.
- batch_size
Number of training samples per mini-batch. For a training dataset of size \(N\), each epoch will consist of \(N / \text{batch_size}\) batches. Generally training will be faster for a larger batch size, but will require additional iterations.
Important
The stage-training framework allows for separate pre-training and fine-tuning stages. We found that having a pre-training stage where we freeze certain network weights and fix the noise ASD improves overall training results.
local
The local
settings are the only group that have no impact on the final trained network. Indeed, they are not even saved in the .pt
files; rather they are split off and saved in a new file local_settings.yaml
.
- device
cpu
orcuda
. Training on a GPU with CUDA is highly recommended.- num_workers
Number of CPU worker processes to use for pre-processing training data before copying to the GPU. Data pre-processing (including decompression, projection to detectors, and noise generation) is quite expensive, so using 16 or 32 processes is recommended, otherwise this can become a bottleneck. We recommend monitoring the GPU utilization percentage as well as time spent on pre-processing (output during training) to fine-tune this number.
- wandb
Settings for Weights & Biases. If you have an account, you can use this to track your training progress and compare different runs.
- runtime_limits
Maximum time (in seconds) or maximum number of epochs per run. Using this could make sense in a cluster environment.
- checkpoint_epochs
Dingo saves a temporary checkpoint in
model_latest.pt
after every epoch, but this is later overwritten by the next checkpoint. This setting saves a permanent checkpoint after the specified number of epochs. Having these checkpoints can help in recovering from training failures that do not result in program termination.- condor
Settings for HTCondor. The condor script will (re)submit itself according to these options.
Command-line scripts
dingo_train
On a local machine, simply pass the settings file (or checkpoint) and an output directory to dingo_train
. It will train until complete, or until a runtime limit is reached.
usage: dingo_train [-h] [--settings_file SETTINGS_FILE] --train_dir TRAIN_DIR [--checkpoint CHECKPOINT]
Train a neural network for gravitational-wave single-event inference.
This program can be called in one of two ways:
a) with a settings file. This will create a new network based on the
contents of the settings file.
b) with a checkpoint file. This will resume training from the checkpoint.
optional arguments:
-h, --help show this help message and exit
--settings_file SETTINGS_FILE
YAML file containing training settings.
--train_dir TRAIN_DIR
Directory for Dingo training output.
--checkpoint CHECKPOINT
Checkpoint file from which to resume training.
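For example, a fresh training run might be started as follows (file and directory names are illustrative):
dingo_train --settings_file train_settings.yaml --train_dir training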
dingo_train_condor
On a cluster using HTCondor, use dingo_train_condor
. This calls itself recursively as follows:
1. The first time you call it, use the flag --start_submission. This creates a condor submission file submission_file.sub that again calls the executable dingo_train_condor (now without the flag) and submits it. This will run dingo_train_condor directly on the assigned cluster node.
2. On the cluster node, dingo_train_condor first trains the network until done or until a runtime limit is reached (be careful to set this shorter than the condor time limit). It then creates a new submission file that once again calls dingo_train_condor, and submits it. This resumes the run on a new node, and the cycle repeats.
usage: dingo_train_condor [-h] --train_dir TRAIN_DIR [--checkpoint CHECKPOINT] [--start_submission]
optional arguments:
-h, --help show this help message and exit
--train_dir TRAIN_DIR
Directory for Dingo training output.
--checkpoint CHECKPOINT
--start_submission
Output
Output from training is stored in the TRAIN_DIR folder passed to the training scripts. It consists of the following files:
- model_latest.pt: checkpoint saved every epoch (overwritten);
- model_XXX.pt: checkpoints, where XXX is the epoch number, saved every checkpoint_epochs epochs;
- model_stage_X.pt: saved at the end of training stage X;
- history.txt: with columns (epoch number, train loss, test loss, learning rate);
- svd_L1.hdf5, …: storing SVD basis information used for seeding the embedding network;
- local_settings.yaml: local settings for the run (not stored with checkpoints).
The .pt
and .hdf5
files may all be inspected using dingo_ls
. This prints all the settings, as well as diagnostic information for SVD bases. The saved settings include all the settings provided in the settings file, as well as several derived quantities, such as parameter standardizations, additional context parameters (for GNPE), etc.
Modifying a checkpoint
Occasionally it may be necessary to change a setting of a partially trained model. For example, a model may have been successfully pre-trained, but the fine-tuning failed, and one may wish to change the fine-tuning settings without starting from scratch. Since the model settings are all stored with the checkpoint, they just need to be changed.
The script dingo_append_training_stage
allows for appending a model stage or replacing an existing planned stage. It will fail if the stage has already begun training, so be sure to use it on a sufficiently early checkpoint.
usage: dingo_append_training_stage [-h] --checkpoint CHECKPOINT --stage_settings_file STAGE_SETTINGS_FILE --out_file OUT_FILE [--replace REPLACE]
optional arguments:
-h, --help show this help message and exit
--checkpoint CHECKPOINT
--stage_settings_file STAGE_SETTINGS_FILE
--out_file OUT_FILE
--replace REPLACE
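For example, to append a new stage to a sufficiently early checkpoint (file names are illustrative):
dingo_append_training_stage --checkpoint model_030.pt --stage_settings_file new_stage.yaml --out_file model_030_updated.pt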
For more detailed adjustments to the training settings, one can use the script compatibility/update_model_metadata.py.
usage: update_model_metadata.py [-h] --checkpoint CHECKPOINT --key KEY [KEY ...] --value VALUE
optional arguments:
-h, --help show this help message and exit
--checkpoint CHECKPOINT
--key KEY [KEY ...]
--value VALUE
Warning
Modifications to model metadata can easily break things. Do not use this unless you are completely sure what you are doing!
Inference
With a trained network, inference can be performed on real data by executing the following on the command line:
dingo_analyze_event
--model model.pt
--gps_time_event gps_time_event
--num_samples num_samples
--batch_size batch_size
This will download data from GWOSC at the specified time, apply data conditioning consistent with the trained Dingo model, transform to the frequency domain, and generate the requested number of posterior samples. It will save them in a file dingo_samples-gps_time_event.hdf5
, along with all settings used in upstream components of Dingo (the waveform dataset, noise dataset, and model training) and the data analyzed.
The dingo_analyze_event
script can also be used to analyze an injection.
The Sampler
class
Under the hood, the inference script uses the Sampler
class, or more specifically, the GWSampler
class, which inherits from it.
- class dingo.gw.inference.gw_samplers.GWSampler(**kwargs)
Bases:
GWSamplerMixin
,Sampler
Sampler for gravitational-wave inference using neural posterior estimation. Augments the base class by defining transform_pre and transform_post to prepare data for the inference network.
- transform_pre :
Whitens strain.
Repackages strain data and the inverse ASDs (suitably scaled) into a torch tensor.
- transform_post :
Extracts the desired inference parameters from the network output (array-like), de-standardizes them, and repackages them as a dict.
Also mixes in GW functionality for building the domain and correcting the reference time.
Allows for conditional and unconditional models, and draws samples from the model based on (optional) context data.
This is intended for use either as a standalone sampler, or as a sampler producing initial sample points for a GNPE sampler.
- Parameters:
kwargs – Keyword arguments that are forwarded to the superclass.
- property context
Data on which to condition the sampler. For injections, there should be a ‘parameters’ key with truth values.
- property event_metadata
Metadata for data analyzed. Can in principle influence any post-sampling parameter transformations (e.g., sky position correction), as well as the likelihood detector positions.
- log_prob(samples: DataFrame) ndarray
Calculate the model log probability at specific sample points.
- Parameters:
samples (pd.DataFrame) – Sample points at which to calculate the log probability.
- Return type:
np.array of log probabilities.
- run_sampler(num_samples: int, batch_size: int | None = None)
Generates samples and stores them in self.samples. Conditions the model on self.context if appropriate (i.e., if the model is not unconditional).
If possible, it also calculates the log_prob and saves it as a column in self.samples. When using GNPE it is not possible to obtain the log_prob due to the many Gibbs iterations. However, in the case of just one iteration, and when starting from a sampler for the proxy, the GNPESampler does calculate the log_prob.
Allows for batched sampling, e.g., if limited by GPU memory. Actual sampling for each batch is performed by _run_sampler(), which will differ for Sampler and GNPESampler.
- Parameters:
num_samples (int) – Number of samples requested.
batch_size (int, optional) – Batch size for sampler.
This is instantiated based on a PosteriorModel
. To draw samples, the context
property must first be set to the data to be analyzed. For gravitational waves this should be a dictionary with the following keys:
- waveform
(unwhitened) strain data in each detector
- asds
noise ASDs estimated in each detector at the time of the event
- parameters (optional)
for injections, the true parameters of the signal (for saving; ignored for sampling)
Once this is set, the run_sampler()
method draws the requested samples from the posterior conditioned on the context. It applies some post-processing (to de-standardize the data, and to correct for the rotation of the Earth between the network reference time and the event time), and then stores the result as a DataFrame in GWSampler.samples
. The DataFrame contains columns for each inference parameter, as well as the log probability of the sample under the posterior model.
The GWSampler.metadata
attribute contains all settings that went into producing the samples, including training datasets, network training settings, event metadata (for real events) and possible injection parameters. Finally, the to_samples_dataset()
method returns a SamplesDataset
containing all results, including the samples, settings, and context. This can be saved easily as HDF5.
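The following is a minimal sketch of this workflow in Python. It is not a verbatim recipe: the keyword used to pass the posterior model to the sampler is an assumption, and the strain and ASD dictionaries are taken to be prepared elsewhere.
from dingo.core.models.posterior_model import PosteriorModel
from dingo.gw.inference.gw_samplers import GWSampler

# Load a trained network for inference only (no optimizer state needed).
model = PosteriorModel(model_filename="model.pt", device="cuda", load_training_info=False)
sampler = GWSampler(model=model)  # assumption: model passed via the `model` keyword

# strain and asds are dicts keyed by detector (e.g., "H1", "L1"), prepared elsewhere.
sampler.context = {"waveform": strain, "asds": asds}

sampler.run_sampler(num_samples=50_000, batch_size=10_000)
samples = sampler.samples      # pd.DataFrame with inference parameters and log_prob
result = sampler.to_result()   # Result object for importance sampling, plotting, saving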
Injections
Injections (i.e., simulated data) are produced using the Injection
class. It includes options for fixed or random parameters (drawn from a prior), and it returns injections in a format that can be directly set as GWSampler.context
.
- class dingo.gw.injection.Injection(prior, **gwsignal_kwargs)
Bases:
GWSignal
Produces injections of signals (with random or specified parameters) into stationary Gaussian noise. Output is not whitened.
- Parameters:
prior (PriorDict) – Prior used for sampling random parameters.
gwsignal_kwargs – Arguments to be passed to GWSignal base class.
- classmethod from_posterior_model_metadata(metadata)
Instantiate an Injection based on a posterior model. The prior, waveform settings, etc., will all be consistent with what the model was trained with.
- Parameters:
metadata (dict) – Dict which you can get via PosteriorModel.metadata
- injection(theta)
Generate an injection based on specified parameters.
This is a signal + noise consistent with the amplitude spectral density in self.asd. If self.asd is an ASDDataset, then it uses a random ASD from this dataset.
Data are not whitened.
- Parameters:
theta (dict) – Parameters used for injection.
- Returns:
- keys:
waveform: data (signal + noise) in each detector
extrinsic_parameters: {}
parameters: waveform parameters
asd (if set): amplitude spectral density for each detector
- Return type:
dict
- random_injection()
Generate a random injection.
This is a signal + noise consistent with the amplitude spectral density in self.asd. If self.asd is an ASDDataset, then it uses a random ASD from this dataset.
Data are not whitened.
- Returns:
- keys:
waveform: data (signal + noise) in each detector
extrinsic_parameters: {}
parameters: waveform parameters
asd (if set): amplitude spectral density for each detector
- Return type:
dict
Hint
The convenience class method from_posterior_model_metadata()
instantiates an Injection
with all of the settings that went into the posterior model. To use it, pass the PosteriorModel.metadata dictionary. It should produce injections that perfectly match the characteristics of the training data (waveform approximant, data conditioning, noise characteristics, etc.). This can be very useful for testing a trained model.
Important
Repeated calls to Injection.injection()
, even with the same parameters, will produce injections with different noise realizations (which therefore lead to different posteriors). For repeated analyses of the exact same injection (e.g., with different models or codes) it is necessary to either save the injection for re-use or fix a random seed.
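As a hedged example, an injection consistent with a trained model could be produced roughly as follows (the model file name is illustrative, and the seeding shown is only one possible way to make the noise realization reproducible):
import numpy as np
from dingo.core.models.posterior_model import PosteriorModel
from dingo.gw.injection import Injection

np.random.seed(42)  # assumption: fixing the global numpy seed fixes the noise realization

model = PosteriorModel(model_filename="model.pt", device="cpu", load_training_info=False)
injection_generator = Injection.from_posterior_model_metadata(model.metadata)

data = injection_generator.random_injection()  # dict with waveform, parameters, asd, ...
# `data` can be set directly as GWSampler.context for analysis with the trained model.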
GNPE
GNPE (Gibbs- or Group-Equivariant Neural Posterior Estimation) is an algorithm that can generate significantly improved results by incorporating known physical symmetries into NPE.[1] The aim is to simplify the data seen by the network by using the symmetries to transform certain parameters to “standardized” values. This simplifies the learning task of the network. At inference time, the standardizing transform is initially unknown, so we use Gibbs sampling to simultaneously learn the transform (along with the rest of the parameters) and apply it to simplify the data.
For gravitational waves, we use GNPE to standardize the times of arrival of the signal in the individual interferometers. (This corresponds to translations of the time of arrival at geocenter, and approximate sky rotations.) In frequency domain, time translations correspond to multiplication of the data by \(e^{-2\pi i f \Delta t}\), and a standard NPE network would have to learn to interpret such transformations consistent with the prior from the data. We found this to be a challenging learning task, which limited inference performance on the other parameters. Instead, GNPE leverages our knowledge of the time translations to build a network that is only required to interpret a much narrower window of arrival times.
We now provide a brief description of the GNPE method. Readers more interested in getting started with GNPE may skip to Usage below.
Description of method
GNPE allows us to incorporate knowledge of joint symmetries of data and parameters. That is, if a parameter (e.g., coalescence time) is transformed by a certain amount (\(\Delta t\)), then there is a corresponding transformation of the data (multiplication by \(e^{-2\pi i f \Delta t}\)) such that the transformed data is equally likely to occur under the transformed parameter: \(p(T_{\Delta t}\, d \,|\, \theta + \Delta t) = p(d \,|\, \theta)\), where \(T_{\Delta t}\) denotes the data transformation.
It is based on two ideas:
Gibbs + NPE
Gibbs sampling is an algorithm for obtaining samples from a joint distribution \(p(x, y)\) if we are able to sample directly from each of the conditionals, \(p(x|y)\) and \(p(y|x)\). Starting from some point \(y_0\), we construct a Markov chain \(\{(x_i, y_i)\}\) by sampling
\(x_i \sim p(x_i | y_{i-1})\),
\(y_i \sim p(y_i | x_i)\),
and repeating until the chain is converged. The stationary distribution of the Markov chain is then \(p(x, y)\).

Illustration of Gibbs sampling for a distribution \(p(x, y)\).
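As a toy illustration of these conditional updates (independent of Dingo), the following sketch samples a correlated bivariate Gaussian using its two one-dimensional conditionals:
import numpy as np

# Gibbs sampling for a standard bivariate Gaussian with correlation rho:
# p(x|y) = N(rho * y, 1 - rho^2) and p(y|x) = N(rho * x, 1 - rho^2).
rho, n_steps = 0.8, 10_000
rng = np.random.default_rng(0)

x, y = 0.0, 0.0
chain = np.empty((n_steps, 2))
for i in range(n_steps):
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))  # x_i ~ p(x | y_{i-1})
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))  # y_i ~ p(y | x_i)
    chain[i] = x, y

print(chain.mean(axis=0))          # close to (0, 0)
print(np.corrcoef(chain.T)[0, 1])  # close to rho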
Gibbs sampling can be combined with NPE by first introducing blurred “proxy” versions of a subset of parameters, which we denote \(\hat\theta\), i.e., \(\hat\theta \sim p(\hat\theta | \theta)\), where \(p(\hat\theta | \theta)\) is defined by a blurring kernel. For example, for GWs we take \(\hat t_I = t_I + \epsilon_I\), where \(\epsilon_I \sim \text{Unif}(-1~\text{ms}, 1~\text{ms})\). We then train a network to model the posterior, but now conditioned also on \(\hat \theta\), i.e., \(p(\theta | d, \hat\theta)\). We can then apply Gibbs sampling to obtain samples from the joint distribution \(p(\theta, \hat \theta | d)\), since we are able to sample individually from the conditional distributions:
We can sample from \(p(\hat\theta | \theta)\) since we defined the blurring kernel.
We can sample from \(p(\theta | d, \hat\theta)\) since we are modeling it using NPE.
Finally, we can drop \(\hat \theta\) from the samples to obtain the desired posterior samples.
The trick now is that since \(p(\theta | d, \hat\theta)\) is conditional on \(\hat \theta\), we can apply any \(\hat\theta\)-dependent transformation to \(d\). Returning to the time translations, \(\hat t_I\) is a good approximation to \(t_I\), so we apply the inverse time shift \(d_I \to d_I\cdot e^{2 \pi i f \hat t_I}\), which brings \(d_I\) into a close approximation to having coalescence time \(0\) in each detector. This means that the network never sees any data with merger time further than \(1~\text{ms}\) from \(0\), greatly simplifying the learning task.
In practice, rather than generating one long chain, we generate many Monte Carlo chains in parallel (one for each desired sample, each with a different starting point) and keep only the final sample from each chain. Each individual chain in this ensemble is unlikely to converge, but if the individual chains are initialized from a distribution sufficiently close to \(p(\hat \theta | d)\), then the collection of final samples from each chain should be a good approximation to samples from \(p(\theta, \hat\theta|d)\).
Group-equivariant NPE
So far we have described how Gibbs sampling together with NPE can simplify data by allowing any \(\hat\theta\)-dependent transformation of \(d\), simplifying the data distribution. If we know the data and parameters to be equivariant under a particular transformation, however, we can go a step further and enforce this exactly. To do so, we simply drop the dependence of the neural density estimator on \(\hat\theta\).
For gravitational waves, the overall time translation of the coalescence time at geocenter (applied to each detector) is an exact symmetry, so we enforce it fully. The sky rotation, however, corresponds to an approximate symmetry: it shifts the time of coalescence in each detector, but a subleading effect is to change the angle of incidence on a detector and hence the combination of polarizations observed. For this latter symmetry, we simply do not drop the proxy dependence.
Tip
GNPE is a generic method to incorporate symmetries into NPE:
Any symmetry (exact or approximate) connecting data and parameters
Any architecture, as it just requires (at most) conditioning on the proxy variables
As far as we are aware, GNPE is the only way to incorporate symmetries connecting data and parameters into architectures such as normalizing flows.
Usage
Training
To use GNPE for GW inference one must train two Dingo models:
An initialization network modeling \(p(t_I | d)\). This gives the initial guess of the proxy variables for the starting point of the Gibbs sampler. Since this is only modeling two or three parameters and it does not need to give perfect results, this network can also be much smaller than typical Dingo networks.
For an HL detector network, to infer just the detector coalescence times, set the following in the training configuration:
data:
  inference_parameters: [H1_time, L1_time]
A main “GNPE” network, conditional on the proxy variables, \(p(\theta | d, \hat t_I)\). Implicitly in this expression, the data are transformed by the proxies, and the exact time-translation symmetry is enforced.
To condition this network on the correct proxies, we configure it to use GNPE in the settings file.
data:
  gnpe_time_shifts:
    kernel: bilby.core.prior.Uniform(minimum=-0.001, maximum=0.001)
    exact_equiv: True
This sets the blurring kernel to be \(\text{Unif}(-1~\text{ms}, 1~\text{ms})\) for all \(\hat t_I\), and it specifies that the overall time-of-coalescence symmetry should be enforced exactly. Dingo will automatically determine from the
detectors
setting which proxy variables to condition on.
Complete example config files for both networks are provided in the /examples folder.
Inference
The inference script must be pointed to both trained networks in order to sample using GNPE.
dingo_analyze_event
--model model
--model_init model_init
--gps_time_event gps_time_event
--num_samples num_samples
--num_gnpe_iterations num_gnpe_iterations
--batch_size batch_size
The number of Gibbs iterations is also specified here (typically 30 is appropriate). This script will save the final samples from each Gibbs chain.
The GNPESampler
class
The inference script above uses the GWSamplerGNPE
class, which is based on GNPESampler
:
- class dingo.core.samplers.GNPESampler(model: PosteriorModel, init_sampler: Sampler, num_iterations: int = 1)
Bases:
Sampler
Base class for GNPE sampler. It wraps a PosteriorModel and a standard Sampler for initialization. The former is used to generate initial samples for Gibbs sampling.
A GNPE network is conditioned on additional “proxy” context theta^, i.e.,
p(theta | theta^, d)
The theta^ depend on theta via a fixed kernel p(theta^ | theta). Combining these known distributions, this class uses Gibbs sampling to draw samples from the joint distribution,
p(theta, theta^ | d)
The advantage of this approach is that we are allowed to perform any transformation of d that depends on theta^. In particular, we can use this freedom to simplify the data, e.g., by aligning data to have merger times = 0 in each detector. The merger times are unknown quantities that must be inferred jointly with all other parameters, and GNPE provides a means to do this iteratively. See https://arxiv.org/abs/2111.13139 for additional details.
Gibbs sampling breaks access to the probability density, so this must be recovered through other means. One way is to train an unconditional flow to represent p(theta^ | d) for fixed d based on the samples produced through the GNPE Gibbs sampling. Starting from these, a single Gibbs iteration gives theta from the GNPE network, along with the probability density in the joint space. This is implemented in GNPESampler provided the init_sampler provides proxies directly and num_iterations = 1.
Attributes (beyond those of Sampler)
- init_samplerSampler
Used for providing initial samples for Gibbs sampling.
- num_iterationsint
Number of Gibbs iterations to perform.
iteration_tracker : IterationTracker (not set up)
remove_init_outliers : float (not set up)
- param model:
- type model:
PosteriorModel
- param init_sampler:
Used for generating initial samples
- type init_sampler:
Sampler
- param num_iterations:
Number of GNPE iterations to be performed by sampler.
- type num_iterations:
int
- property context
Data on which to condition the sampler. For injections, there should be a ‘parameters’ key with truth values.
- property event_metadata
Metadata for data analyzed. Can in principle influence any post-sampling parameter transformations (e.g., sky position correction), as well as the likelihood detector positions.
- log_prob(samples: DataFrame) ndarray
Calculate the model log probability at specific sample points.
- Parameters:
samples (pd.DataFrame) – Sample points at which to calculate the log probability.
- Return type:
np.array of log probabilities.
- property num_iterations
The number of GNPE iterations to perform when sampling.
- run_sampler(num_samples: int, batch_size: int | None = None)
Generates samples and stores them in self.samples. Conditions the model on self.context if appropriate (i.e., if the model is not unconditional).
If possible, it also calculates the log_prob and saves it as a column in self.samples. When using GNPE it is not possible to obtain the log_prob due to the many Gibbs iterations. However, in the case of just one iteration, and when starting from a sampler for the proxy, the GNPESampler does calculate the log_prob.
Allows for batched sampling, e.g., if limited by GPU memory. Actual sampling for each batch is performed by _run_sampler(), which will differ for Sampler and GNPESampler.
- Parameters:
num_samples (int) – Number of samples requested.
batch_size (int, optional) – Batch size for sampler.
In addition to storing a PosteriorModel
, a GNPESampler
also stores a second Sampler
instance, which is based on the initialization network. When run_sampler()
is called, it first generates samples from the initialization network, perturbs them with the kernel to obtain proxy samples, and then performs num_iterations
Gibbs steps to obtain the final samples.
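In Python, GNPE sampling might look roughly as follows. This is a sketch: it assumes that GWSamplerGNPE lives alongside GWSampler and accepts the same keyword arguments as GNPESampler above, and the file names and strain/ASD dictionaries are illustrative.
from dingo.core.models.posterior_model import PosteriorModel
from dingo.gw.inference.gw_samplers import GWSampler, GWSamplerGNPE  # import path assumed

init_model = PosteriorModel(model_filename="model_init.pt", device="cuda", load_training_info=False)
main_model = PosteriorModel(model_filename="model.pt", device="cuda", load_training_info=False)

init_sampler = GWSampler(model=init_model)  # models p(t_I | d)
sampler = GWSamplerGNPE(model=main_model, init_sampler=init_sampler, num_iterations=30)

sampler.context = {"waveform": strain, "asds": asds}  # as for GWSampler
sampler.run_sampler(num_samples=50_000, batch_size=10_000)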
The Result class
The Result
class stores the output of a Sampler
run, namely a collection of samples. It contains several methods for operating on the samples, including for importance sampling, plotting, and density recovery:
- class dingo.gw.result.Result(**kwargs)
Bases:
Result
A dataset class to hold a collection of gravitational-wave parameter samples and perform various operations on them.
Compared to the base class, this class implements the domain, prior, and likelihood. It also includes a method for sampling the binary reference phase parameter based on the other parameters and the likelihood.
- Attributes:
- samplespd.Dataframe
Contains parameter samples, as well as (possibly) log_prob, log_likelihood, weights, log_prior, delta_log_prob_target.
- domainDomain
The domain of the data (e.g., FrequencyDomain), needed for calculating likelihoods.
- priorPriorDict
The prior distribution, used for importance sampling.
- likelihoodLikelihood
The Likelihood object, needed for importance sampling.
- contextdict
Context data from which the samples were produced (e.g., strain data, ASDs).
- metadatadict
Metadata inherited from the Sampler object. This describes the neural networks and sampling settings used.
- event_metadatadict
Metadata for the event analyzed, including time, data conditioning, channel, and detector information.
- log_evidencefloat
Calculated log(evidence) after importance sampling.
- log_evidence_stdfloat (property)
Standard deviation of the log(evidence)
- effective_sample_size, n_efffloat (property)
Number of effective samples, (sum_i w_i)^2 / sum_i w_i^2
- sample_efficiencyfloat (property)
Number of effective samples / Number of samples
- synthetic_phase_kwargsdict
kwargs describing the synthetic phase sampling.
When constructing, provide either file_name, or a dictionary containing data and settings entries, or neither.
- Parameters:
file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The data keys should be the same as save_keys
data_keys (list) – Variables that should be saved / loaded. This allows for class to store additional variables beyond those that are saved. Typically, this list would be provided by any subclass.
- get_samples_bilby_phase()
Convert the spin angles phi_jl and theta_jn to account for a difference in phase definition compared to Bilby.
- Returns:
Samples
- Return type:
pd.DataFrame
- importance_sample(num_processes: int = 1, **likelihood_kwargs)
Calculate importance weights for samples.
Importance sampling starts with samples that have been generated from a proposal distribution q(theta), in this case a neural network model. Certain networks (i.e., non-GNPE) also provide the log probability of each sample, which is required for importance sampling.
Given the proposal, we re-weight samples according to the (un-normalized) target distribution, which we take to be the likelihood L(theta) times the prior pi(theta). This gives sample weights
w(theta) ~ pi(theta) L(theta) / q(theta),
where the overall normalization does not matter (and which we take to have mean 1). Since q(theta) enters this expression, importance sampling is only possible when we know the log probability of each sample.
As byproducts, this method also estimates the evidence and effective sample size of the importance sampled points.
This method modifies the samples pd.DataFrame in-place, adding new columns for log_likelihood, log_prior, and weights. It also stores the log_evidence as an attribute.
- Parameters:
num_processes (int) – Number of parallel processes to use when calculating likelihoods. (This is the most expensive task.)
likelihood_kwargs (dict) – kwargs that are forwarded to the likelihood constructor. E.g., options for marginalization.
- classmethod merge(parts)
Merge several Result instances into one. Check that they are compatible, in the sense of having the same metadata. Finally, calculate a new log evidence for the combined result.
This is useful when recombining separate importance sampling jobs.
- Parameters:
parts (list[Result]) – List of sub-Results to be combined.
- Return type:
Combined Result.
- parameter_subset(parameters)
Return a new object of the same type, with only a subset of parameters. Drops all other columns in samples DataFrame as well (e.g., log_prob, weights).
- Parameters:
parameters (list) – List of parameters to keep.
- Return type:
- property pesummary_prior
The prior in a form suitable for PESummary.
By convention, Dingo stores all times relative to a reference time, typically the trigger time for an event. The prior returned here corrects for that offset to be consistent with other codes.
- property pesummary_samples
Samples in a form suitable for PESummary.
These samples are adjusted to undo certain conventions used internally by Dingo:
- Times are corrected by the reference time t_ref.
- Samples are unweighted, using a fixed random seed for sampling importance resampling.
- The spin angles phi_jl and theta_jn are transformed to account for a difference in phase definition.
- Some columns are dropped: delta_log_prob_target, log_prob.
- plot_corner(parameters=None, filename='corner.pdf')
Generate a corner plot of the samples.
- Parameters:
parameters (list[str]) – List of parameters to include. If None, include all parameters. (Default: None)
filename (str) – Where to save samples.
- plot_log_probs(filename='log_probs.png')
Make a scatter plot of the target versus proposal log probabilities. For the target, subtract off the log evidence.
- plot_weights(filename='weights.png')
Make a scatter plot of sample weights vs. log proposal.
- print_summary()
Display the number of samples, and (if importance sampling is complete) the log evidence and number of effective samples.
- reset_event(event_dataset)
Set the Result context and event_metadata based on an EventDataset.
If these attributes already exist, perform a comparison to check for changes. Update relevant objects appropriately. Note that setting context and event_metadata attributes directly would not perform these additional checks and updates.
- Parameters:
event_dataset (EventDataset) – New event to be used for importance sampling.
- sample_synthetic_phase(synthetic_phase_kwargs, inverse: bool = False)
Sample a synthetic phase for the waveform. This is a post-processing step applicable to samples theta in the full parameter space, except for the phase parameter (i.e., 14D samples). This step adds a phase parameter to the samples based on likelihood evaluations.
A synthetic phase is sampled as follows.
Compute and cache the modes for the waveform mu(theta, phase=0) for phase 0, organize them such that each contribution m transforms as exp(-i * m * phase).
Compute the likelihood on a phase grid, by computing mu(theta, phase) from the cached modes. In principle this likelihood is exact, however, it can deviate slightly from the likelihood computed without cached modes for various technical reasons (e.g., slightly different windowing of cached modes compared to full waveform when transforming TD waveform to FD). These small deviations can be fully accounted for by importance sampling. Note: when approximation_22_mode=True, the entire waveform is assumed to transform as exp(2i*phase), in which case the likelihood is only exact if the waveform is fully dominated by the (2, 2) mode.
Build a synthetic conditional phase distribution based on this grid. We use an interpolated prior distribution bilby.core.prior.Interped, such that we can sample and also evaluate the log_prob. We add a constant background with weight self.synthetic_phase_kwargs to the kde to make sure that we keep a mass-covering property. With this, the importance sampling will yield exact results even when the synthetic phase conditional is just an approximation.
Besides adding phase samples to self.samples[‘phase’], this method also modifies self.samples[‘log_prob’] by adding the log_prob of the synthetic phase conditional.
This method modifies self.samples in place.
- Parameters:
synthetic_phase_kwargs (dict) –
- This should consist of the kwargs: approximation_22_mode (optional), num_processes (optional), n_grid, uniform_weight (optional).
inverse (bool, default False) – Whether to apply instead the inverse transformation. This is used prior to calculating the log_prob. In inverse mode, the posterior probability over phase is calculated for the given samples and stored in self.samples['log_prob'].
- sampling_importance_resampling(num_samples=None, random_state=None)
Generate unweighted posterior samples from weighted ones. New samples are sampled with probability proportional to the sample weight. Resampling is done with replacement, until the desired number of unweighted samples is obtained.
- Parameters:
num_samples (int) – Number of samples to resample.
random_state (int or None) – Sampling seed.
- Returns:
Unweighted samples
- Return type:
pd.Dataframe
- split(num_parts)
Split the Result into a set of smaller results. The samples are evenly divided among the sub-results. Additional information (metadata, context, etc.) is copied into each.
This is useful for splitting expensive tasks such as importance sampling across multiple jobs.
- Parameters:
num_parts (int) – The number of parts to split the Result across.
- Return type:
list of sub-Results.
- train_unconditional_flow(parameters, nde_settings: dict, train_dir: str | None = None, threshold_std: float | None = inf)
Train an unconditional flow to represent the distribution of self.samples.
- Parameters:
parameters (list) – List of parameters over which to train the flow. Can be a subset of the existing parameters.
nde_settings (dict) – Configuration settings for the neural density estimator.
train_dir (Optional[str]) – Where to save the output of network training, e.g., logs, checkpoints. If not provided, a temporary directory is used.
threshold_std (Optional[float]) – Drop samples more than threshold_std standard deviations away from the mean (in any parameter) before training the flow. This is meant to remove outlier samples.
- Return type:
- update_prior(prior_update)
Update the prior based on a new dict of priors. Use the existing prior for parameters not included in the new dict.
If class samples have not been importance sampled, then save new sample weights based on the new prior. If class samples have been importance sampled, then update the weights.
- Parameters:
prior_update (dict) – Priors to update. This should be of the form {key : prior_str}, where str is a string that can instantiate a prior via PriorDict(prior_update). The prior_update is provided in this form so that it can be properly saved with the Result and later instantiated.
Following a sampler run, a Result
can be obtained using Sampler.to_result()
. Since Result
inherits from DingoDataset
it also possesses to_file()
and to_dictionary()
methods for saving samples and associated metadata (including context data, namely event data and ASDs).
Density recovery
When sampling with GNPE, there is no direct access to the probability density \(q(\theta|d)\). This is because of the Gibbs iterations: one only has access to the probability density of the entire chain, not just the final samples. The probability density is, however, needed for importance sampling, since this is the proposal distribution.
The Result
class contains methods to enable recovery of the probability density for a collection of samples. The approach is as follows:
1. Start from the samples \(\{(\theta_i, \hat\theta_i)\}_{i=1}^N\) from the final Gibbs iteration, including parameters \(\theta\) and proxy parameters \(\hat\theta\). By default these are included in the samples attribute generated by the Sampler.
2. Train an unconditional density estimator \(q(\hat\theta)\) to model the proxy parameters. This is done by (1) using parameter_subset() to produce a new Result containing just the proxies, and (2) using train_unconditional_flow() on this subset.
3. Generate new samples \((\theta, \hat\theta) \sim q(\theta, \hat\theta | d) = q(\theta | d, \hat\theta) q(\hat\theta)\). This can be accomplished using GNPESampler.sample() with num_iterations = 1 and setting the initial sampler to be the unconditional flow trained in the previous step. Since this does not involve multiple iterations, the density is obtained as well, so importance sampling can be performed.
Note
Density recovery can also be achieved using an unconditional density estimator for \(\theta\) (trained on samples \(\{\theta_i\}_{i=1}^N\) from GNPE). Since \(\theta\) typically comprises 14 parameters (versus 2 or 3 for \(\hat\theta\)) it is usually more straightforward to learn the proxies.
Synthetic phase
It is often challenging for Dingo to learn to model the phase
parameter \(\phi_c\). For this reason, we usually marginalize over it in training by excluding it from the list of inference_parameters
. The phase is, however, required for importance sampling unless one also uses a phase-marginalized likelihood (which is approximate except under special circumstances).
The Dingo gw.Result
class includes a method sample_synthetic_phase()
which produces a \(\phi_c\) sample from a \(\phi_c\)-marginalized sample. It does so by evaluating the likelihood on a \(\phi_c\)-grid and then sampling from the associated 1D distribution. The log_prob
value for the sample is also corrected to reflect the sampled \(\phi_c\). Speed is ensured by caching waveform modes and evaluating the polarizations for different \(\phi_c\). For further details, see the Supplemental Material of [5].
This method should be run after recovering the density, since in particular it applies a correction to the density.
Configuration
The method sample_synthetic_phase()
takes a kwargs argument. An example configuration is
approximation_22_mode: false
n_grid: 5001
uniform_weight: 0.01
num_processes: 100
- approximation_22_mode
Whether to make the approximation that only the \((l, m) = (2, 2)\) mode is present, i.e., that waveforms transform as \(\exp(2 i \phi_c)\). This simplifies computations since it does not require caching of waveform modes.
- n_grid
Number of points in the phase grid on which the likelihoods are evaluated.
- uniform_weight
Base probability level to add to ensure mass coverage.
- num_processes
For parallelization of synthetic phase sampling. This is usually the most expensive part of importance sampling, so it is advantageous to perform calculations in parallel.
Importance sampling
Once samples are in the right form—including all relevant parameters and the log probability—importance sampling is carried out using the importance_sample()
method. It allows one to specify options for using a marginalized likelihood. (Time and phase marginalization are separately supported; see the documentation of dingo.gw.likelihood.StationaryGaussianGWLikelihood
.)
As with the synthetic phase, importance sampling allows for parallelization.
Plotting
The plotting methods included here are intended for quick plots for evaluating results. They include
corner plots comparing importance sampled and proposal results;
weights plots to evaluate performance of importance sampling; and
log probability plots comparing target and proposal log probability.
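Taken together, a typical post-processing sequence on a Result with log_prob available might look roughly as follows; the argument values are illustrative and the to_file file-name argument is an assumption.
# Synthetic phase: settings mirror the example configuration above.
result.sample_synthetic_phase({
    "approximation_22_mode": False,
    "n_grid": 5001,
    "uniform_weight": 0.01,
    "num_processes": 16,
})

# Importance sampling: adds log_likelihood, log_prior, and weights columns,
# and stores the log evidence on the Result.
result.importance_sample(num_processes=16)
result.print_summary()

# Quick diagnostic plots.
result.plot_corner()
result.plot_weights()
result.plot_log_probs()

# Save everything (method inherited from DingoDataset; file name argument assumed).
result.to_file("result.hdf5")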
dingo_pipe
Dingo includes a command-line tool dingo_pipe for automating inference tasks. This is based very closely on the bilby_pipe package, with suitable modifications. The basic usage is to pass a .ini
file containing event information and run configuration settings, e.g.,
dingo_pipe GW150914.ini
dingo_pipe then executes various commands for preparing data, sampling from networks, importance sampling, and plotting. It can execute commands locally or on a cluster using a DAG. This documentation will only describe the relevant differences compared to bilby_pipe, and we refer the reader to the bilby_pipe documentation for additional information.
Example GW150914.ini file (also available in the examples/ directory):
################################################################################
## Job submission arguments
################################################################################
local = True
accounting = dingo
request-cpus-importance-sampling = 16
n-parallel = 4
sampling-requirements = [TARGET.CUDAGlobalMemoryMb>40000]
extra-lines = [+WantsGPUNode = True]
simple-submission = false
################################################################################
## Sampler arguments
################################################################################
model-init = /data/sgreen/dingo-experiments/XPHM/O1_init/model_stage_1.pt
model = /data/sgreen/dingo-experiments/XPHM/testing_inference/model.pt
device = 'cuda'
num-gnpe-iterations = 30
num-samples = 50000
batch-size = 50000
recover-log-prob = true
importance-sample = true
prior-dict = {
luminosity_distance = bilby.gw.prior.UniformComovingVolume(minimum=100, maximum=2000, name='luminosity_distance'),
}
################################################################################
## Data generation arguments
################################################################################
trigger-time = GW150914
label = GW150914
outdir = outdir_GW150914
channel-dict = {H1:GWOSC, L1:GWOSC}
psd-length = 128
sampling-frequency = 2048.0
importance-sampling-updates = {'duration': 4.0}
################################################################################
## Calibration marginalization arguments
################################################################################
calibration-model = CubicSpline
spline-calibration-envelope-dict = {H1: GWTC1_GW150914_H_CalEnv.txt, L1: GWTC1_GW150914_L_CalEnv.txt}
spline-calibration-nodes = 10
spline-calibration-curves = 1000
################################################################################
## Plotting arguments
################################################################################
plot-corner = true
plot-weights = true
plot-log-probs = true
The main difference compared to a bilby_pipe .ini
file is that one specifies trained Dingo models rather than data conditioning and prior settings. The reason is that such settings have already been incorporated into the training of the model, so it is not possible to change them when sampling from the Dingo model. Understandably, this could cause inconvenience if one is interested in different prior or data-conditioning settings. As a solution, Dingo enables changing such settings during importance sampling, which applies the new settings for the likelihood evaluations.
Important
For dingo_pipe it is necessary to specify a trained Dingo model instead of sampler settings such as prior and data conditioning.
Data generation
The first step is to download and prepare gravitational-wave data. In the example, dingo_pipe (using bilby_pipe routines) downloads the event and PSD data at the time of GW150914. It then prepares the data based on conditioning settings in the specified Dingo model. If other conflicting conditioning settings are provided (e.g., sampling_frequency = 2048.0
), dingo_pipe stores these in the dictionary importance_sampling_updates
(which can also be specified explicitly). These settings are ignored for now, and only applied later for calculating the likelihood in importance sampling.
The prepared event data and ASD are stored in a dingo.gw.data.event_dataset.EventDataset
, which is then saved to disk in HDF5 format.
Note
Dingo models are typically trained using Welch PSDs. For this reason we do not recommend using a BayesWave PSD for initial sampling. Rather, a BayesWave PSD should be specified within the importance_sampling_updates
dictionary, so that it will be used during importance sampling.
Sampling
The next step is sampling from the Dingo model. The model is loaded into a GWSampler or GWSamplerGNPE object. (If using GNPE it is necessary to specify a model-init
.) The Sampler context
is then set from the EventDataset prepared in the previous step. num-samples
samples are then generated in batches of size batch-size
. The samples (and context) are stored in a Result object and saved in HDF5 format.
If using GNPE, one can optionally specify num-gnpe-iterations
(it defaults to 30). Importantly, obtaining the log probability when using GNPE requires an extra step of training an unconditional flow. This is done using the recover-log-prob
flag, which defaults to True
. The default density recovery settings can be overwritten by providing a density-recovery-settings
dictionary in the .ini
file.
Since sampling uses GPU hardware, there is an additional key sampling-requirements
for HTCondor requirements during the sampling stage. This is intended for specifying GPU requirements such as memory or CUDA version.
Importance sampling
For importance sampling, the Result saved in the previous step is loaded. Since this contains the strain data and ASDs, as well as all settings used for training the network, the likelihood and prior can be evaluated for each sample point. If it is necessary to change data conditioning or PSD for importance sampling (i.e., if the importance-sampling-updates
dictionary is non-empty), then a second data generation step is first carried out using the new settings, and its output is used as the importance sampling context. The importance-sampled result is finally saved as HDF5, including the estimated Bayesian evidence.
If a prior-dict
is specified in the .ini
file, then this will be used for the importance sampling prior. One example where this is useful is for the luminosity distance prior. Indeed, Dingo tends to train better using a uniform prior over luminosity distance, but physically one would prefer a uniform in volume prior. By specifying a prior-dict
this change can be made in importance sampling.
Caution
If extending the prior support during importance sampling, be sure that the posterior does not rail up against the prior boundary being extended.
By default, dingo_pipe assumes that it is necessary to sample the phase synthetically, so it will do so before importance sampling. This can be turned off by passing an empty dictionary to importance-sampling-settings
. Note that importance sampling itself can be switched off by setting the importance-sample
flag to False (it defaults to True).
Importance sampling (including synthetic phase sampling) is an expensive step, so dingo_pipe allows for parallelization: this step is split over n-parallel
jobs, each of which uses request-cpus-importance-sampling
processes. In the backend, this makes use of the Result split() and merge() methods.
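For reference, the same split/merge logic can also be used manually from Python (a sketch; `result` is an existing Result with log_prob available):
from dingo.gw.result import Result

parts = result.split(4)                      # four sub-Results, samples evenly divided
for part in parts:                           # in practice, run these on separate jobs
    part.importance_sample(num_processes=16)
combined = Result.merge(parts)               # checks compatibility, recomputes log evidence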
Calibration marginalization
Settings related to calibration are used to marginalize over calibration uncertainty during importance sampling.
- calibration-model
None or “CubicSpline”. If “CubicSpline”, perform calibration marginalization using a cubic spline calibration model. If None do not perform calibration marginalization. (Default: None)
- spline-calibration-envelope-dict
Dictionary pointing to the spline calibration envelope files. This is required if calibration-model is “CubicSpline”.
- spline-calibration-nodes
Number of calibration nodes. (Default: 10)
- spline-calibration-curves
Number of calibration curves to use for marginalization. (Default: 1000)
Plotting
The standard Result plots are turned on using the plot-corner
, plot-weights
, and plot-log-probs
flags.
Additional options
- extra-lines
Additional lines for all submission scripts. This could be useful for particular cluster configurations.
- simple-submission
Strip the keys
accounting_tag
,getenv
,priority
, anduniverse
from submission scripts. Again useful for particular cluster configurations.
dingo
dingo package
Subpackages
dingo.asimov package
Submodules
dingo.asimov.asimov module
Module contents
dingo.core package
Subpackages
dingo.core.density package
Submodules
dingo.core.density.interpolation module
- dingo.core.density.interpolation.interpolated_log_prob(sample_points, values, evaluation_point)
Given a distribution discretized on a grid, return the log prob at an evaluation point from an interpolated distribution. Wraps the bilby.core.prior.Interped class.
- Parameters:
sample_points (np.ndarray) – x values for samples
values (np.ndarray) – y values for samples. The distribution does not have to be initially normalized, although the final log_prob will be.
evaluation_point (float) – x value at which to evaluate log_prob.
- Returns:
float
- Return type:
log_prob
- dingo.core.density.interpolation.interpolated_log_prob_multi(sample_points, values, evaluation_points, num_processes: int = 1)
Given a distribution discretized on a grid, return the log prob at specific points using an interpolated distribution. Wraps the bilby.core.prior.Interped class. Works with multiprocessing.
- Parameters:
sample_points (np.ndarray, shape (N)) – x values for samples
values (np.ndarray, shape (B, N)) – y values for samples. The distributions do not have to be initially normalized, although the final log_probs will be. B = batch dimension.
evaluation_points (np.ndarray, shape (B)) – x values at which to evaluate log_prob.
num_processes (int) – Number of parallel processes to use.
- Returns:
(np.ndarray, np.ndarray)
- Return type:
sample and log_prob arrays, each of length B
- dingo.core.density.interpolation.interpolated_sample_and_log_prob(sample_points, values)
Given a distribution discretized on a grid, return a sample and the log prob from an interpolated distribution. Wraps the bilby.core.prior.Interped class.
- Parameters:
sample_points (np.ndarray) – x values for samples
values (np.ndarray) – y values for samples. The distribution does not have to be initially normalized, although the final log_prob will be.
- Returns:
(float, float)
- Return type:
sample and log_prob
- dingo.core.density.interpolation.interpolated_sample_and_log_prob_multi(sample_points, values, num_processes: int = 1)
Given a distribution discretized on a grid, return a sample and the log prob from an interpolated distribution. Wraps the bilby.core.prior.Interped class. Works with multiprocessing.
- Parameters:
sample_points (np.ndarray, shape (N)) – x values for samples
values (np.ndarray, shape (B, N)) – y values for samples. The distributions do not have to be initially normalized, although the final log_probs will be. B = batch dimension.
num_processes (int) – Number of parallel processes to use.
- Returns:
(np.ndarray, np.ndarray)
- Return type:
sample and log_prob arrays, each of length B
dingo.core.density.nde_settings module
Default settings for unconditional density estimation
- dingo.core.density.nde_settings.get_default_nde_settings_3d(device='cpu', num_workers=0, inference_parameters=None)
dingo.core.density.unconditional_density_estimation module
- class dingo.core.density.unconditional_density_estimation.SampleDataset(data)
Bases:
Dataset
Dataset class for unconditional density estimation. This is required, since the training method of dingo.core.models.PosteriorModel expects a tuple of (theta, *context) as output of the DataLoader, but here we have no context, so len(context) = 0. This SampleDataset therefore returns a tuple (theta, ) instead of just theta.
- dingo.core.density.unconditional_density_estimation.parse_args()
- dingo.core.density.unconditional_density_estimation.train_unconditional_density_estimator(result, settings: dict, train_dir: str)
Train unconditional density estimator for a given set of samples.
- Parameters:
samples (pd.DataFrame) – DataFrame containing the samples to train the density estimator on.
settings (dict) – Dictionary containing the settings for the density estimator.
train_dir (str) – Path to the directory where the trained model should be saved.
- Returns:
model – trained density estimator
- Return type:
Module contents
This submodule contains tools for density estimation from samples. This is required for instance to recover the posterior density from GNPE samples, since the density is intractable with GNPE.
dingo.core.models package
Submodules
dingo.core.models.posterior_model module
TODO: Docstring
- class dingo.core.models.posterior_model.PosteriorModel(model_filename: str | None = None, metadata: dict | None = None, initial_weights: dict | None = None, device: str = 'cuda', load_training_info: bool = True)
Bases:
object
TODO: Docstring
- initialize_model:
initialize the NDE (including embedding net) as posterior model
- initialize_training:
initialize for training, that includes storing the epoch, building an optimizer and a learning rate scheduler
- save_model:
save the model, including all information required to rebuild it, except for the builder function
- load_model:
load and build a model from a file
- train_model:
train the model
- inference:
perform inference
- Parameters:
model_builder (Callable) – builder function for the model, self.model = model_builder(**model_kwargs)
model_kwargs (dict = None) – kwargs for the model, self.model = model_builder(**model_kwargs)
model_filename (str = None) – path to filename of loaded model
optimizer_kwargs (dict = None) – kwargs for optimizer
scheduler_kwargs (dict = None) – kwargs for scheduler
init_for_training (bool = False) – flag whether initialization for training (e.g., optimizer) required
metadata (dict = None) – dict with metadata, used to save dataset_settings and train_settings
- initialize_model()
Initialize a model for the posterior by calling the self.model_builder with self.model_kwargs.
- initialize_optimizer_and_scheduler()
Initializes the optimizer and scheduler with self.optimizer_kwargs and self.scheduler_kwargs, respectively.
- load_model(model_filename: str, load_training_info: bool = True, device: str = 'cuda')
Load a posterior model from the disk.
- Parameters:
model_filename (str) – path to saved model
load_training_info (bool #TODO: load information for training) – specifies whether information required to proceed with training is loaded, e.g. optimizer state dict
- model_to_device(device)
Put model to device, and set self.device accordingly.
- sample(*x, batch_size=None, get_log_prob=False)
Sample from posterior model, conditioned on context x. x is expected to have a batch dimension, i.e., to obtain N samples with additional context requires x = x_.expand(N, *x_.shape).
This method takes care of the batching, makes sure that self.model is in evaluation mode and disables gradient computation.
- Parameters:
*x – input context to the neural network; has potentially multiple elements for, e.g., gnpe proxies
batch_size (int = None) – batch size for sampling
get_log_prob (bool = False) – if True, also return log probability along with the samples
- Returns:
samples – samples from posterior model
- Return type:
torch.Tensor
- save_model(model_filename: str, save_training_info: bool = True)
Save the posterior model to the disk.
- Parameters:
model_filename (str) – filename for saving the model
save_training_info (bool) – specifies whether information required to proceed with training is saved, e.g. optimizer state dict
- train(train_loader: DataLoader, test_loader: DataLoader, train_dir: str, runtime_limits: object | None = None, checkpoint_epochs: int | None = None, use_wandb=False, test_only=False)
- Parameters:
train_loader –
test_loader –
train_dir –
runtime_limits –
checkpoint_epochs –
use_wandb –
test_only (bool = False) – if True, training is skipped
- dingo.core.models.posterior_model.get_model_callable(model_type: str)
- dingo.core.models.posterior_model.test_epoch(pm, dataloader)
- dingo.core.models.posterior_model.train_epoch(pm, dataloader)
Module contents
dingo.core.nn package
Submodules
dingo.core.nn.enets module
Implementation of embedding networks.
- class dingo.core.nn.enets.DenseResidualNet(input_dim: int, output_dim: int, hidden_dims: ~typing.Tuple, activation: ~typing.Callable = <function elu>, dropout: float = 0.0, batch_norm: bool = True)
Bases:
Module
A nn.Module consisting of a sequence of dense residual blocks. This is used to embed high dimensional input to a compressed output. Linear resizing layers are used for resizing the input and output to match the first and last hidden dimension, respectively.
Module specs
input dimension: (batch_size, input_dim) output dimension: (batch_size, output_dim)
- param input_dim:
dimension of the input to this module
- type input_dim:
int
- param output_dim:
output dimension of this module
- type output_dim:
int
- param hidden_dims:
tuple with dimensions of hidden layers of this module
- type hidden_dims:
tuple
- param activation:
activation function used in residual blocks
- type activation:
callable
- param dropout:
dropout probability for residual blocks, used for regularization
- type dropout:
float
- param batch_norm:
flag that specifies whether to use batch normalization
- type batch_norm:
bool
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
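For instance, such a network might be constructed and applied as follows (the dimensions are illustrative):
import torch
from dingo.core.nn.enets import DenseResidualNet

# Compress a 400-dimensional feature vector to 128 dimensions.
net = DenseResidualNet(input_dim=400, output_dim=128, hidden_dims=(1024, 512, 256))
x = torch.randn(64, 400)   # (batch_size, input_dim)
y = net(x)                 # shape (64, 128) == (batch_size, output_dim)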
- class dingo.core.nn.enets.LinearProjectionRB(input_dims: List[int], n_rb: int, V_rb_list: Tuple | None)
Bases:
Module
A compression layer that reduces the input dimensionality via projection onto a reduced basis. The input data is of shape (batch_size, num_blocks, num_channels, num_bins). Each of the num_blocks blocks (for GW use case: block=detector) is treated independently.
A single block consists of 1D data with num_bins bins (e.g., GW use case: num_bins = number of frequency bins). It has num_channels >= 2 different channels; channels 0 and 1 store the real and imaginary parts of the signal. Channels with index >= 2 are used for auxiliary signals (such as the PSD in the GW use case).
This layer compresses the complex signal in channels 0 and 1 to n_rb reduced-basis (rb) components. This is achieved by initializing the weights of this layer with the rb matrix V, such that the (2*n_rb) dimensional output of each block is the concatenation of the real and imaginary part of the reduced basis projection of the complex signal in channel 0 and 1. The projection of the auxiliary channels with index >=2 onto these components is initialized with 0.
Module specs
input dimension: (batch_size, num_blocks, num_channels, num_bins) output dimension: (batch_size, 2 * n_rb * num_blocks)
- param input_dims:
dimensions of input batch, omitting batch dimension input_dims = [num_blocks, num_channels, num_bins]
- type input_dims:
list
- param n_rb:
number of reduced basis elements used for projection the output dimension of the layer is 2 * n_rb * num_blocks
- type n_rb:
int
- param V_rb_list:
tuple with V matrices of the reduced basis SVD projection, convention for SVD matrix decomposition: U @ s @ V^h; if None, layer is not initialized with reduced basis projection, this is useful when loading a saved model
- type V_rb_list:
tuple of np.arrays, or None
- forward(x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- init_layers(V_rb_list)
Loop through layers and initialize them individually with the corresponding rb projection. V_rb_list is a list that contains the rb matrix V for each block. Each matrix V in V_rb_list is represented with a numpy array of shape (self.num_bins, num_el), where num_el >= self.n_rb.
- property input_dim
- property output_dim
- test_dimensions(V_rb_list)
Test if input dimensions to this layer are consistent with each other, and the reduced basis matrices V.
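A sketch of constructing LinearProjectionRB from precomputed V matrices; here random matrices stand in for a real reduced basis, and the dimensions are illustrative:
import numpy as np
import torch
from dingo.core.nn.enets import LinearProjectionRB

num_blocks, num_channels, num_bins, n_rb = 2, 3, 1000, 200

# Stand-in V matrices of shape (num_bins, n_rb); in practice these come from
# an SVD of waveform data.
V_rb_list = tuple(np.random.randn(num_bins, n_rb) for _ in range(num_blocks))

layer = LinearProjectionRB(
    input_dims=[num_blocks, num_channels, num_bins],
    n_rb=n_rb,
    V_rb_list=V_rb_list,   # pass None instead when loading a saved model
)
x = torch.randn(8, num_blocks, num_channels, num_bins)
y = layer(x)               # shape (8, 2 * n_rb * num_blocks) == (8, 800)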
- class dingo.core.nn.enets.ModuleMerger(module_list: Tuple)
Bases:
Module
This is a wrapper used to process multiple different kinds of context information collected in x = (x_0, x_1, …). For each kind of context information x_i, an individual embedding network is provided in enets = (enet_0, enet_1, …). The embedded output of the forward method is the concatenation of the individual embeddings enet_i(x_i).
In the GW use case, this wrapper can be used to embed the high-dimensional signal input into a lower dimensional feature vector with a large embedding network, while applying an identity embedding to the time shifts.
Module specs
input dimension: (batch_size, …), (batch_size, …), … output dimension: (batch_size, ?)
- param module_list:
nn.Modules for embedding networks, use torch.nn.Identity for identity mappings
- type module_list:
tuple
- forward(*x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- dingo.core.nn.enets.create_enet_with_projection_layer_and_dense_resnet(input_dims: List[int], V_rb_list: Tuple | None, output_dim: int, hidden_dims: Tuple, svd: dict, activation: str = 'elu', dropout: float = 0.0, batch_norm: bool = True, added_context: bool = False)
Builder function for a 2-stage embedding network for 1D data with multiple blocks and channels. Module 1 is a linear layer initialized as the projection of the complex signal onto reduced basis components via LinearProjectionRB, where the blocks are kept separate. See the docstring of LinearProjectionRB for details. Module 2 is a sequence of dense residual layers that further reduces the dimensionality.
The projection requires the complex signal to be represented via the real part in channel 0 and the imaginary part in channel 1. Auxiliary signals may be contained in channels with indices >= 2. In the GW use case, a block corresponds to a detector and channel 2 is used for ASD information.
If added_context = True, the 2-stage embedding network described above is merged with an identity mapping via ModuleMerger. Then, the expected input is not x with x.shape = (batch_size, num_blocks, num_channels, num_bins), but rather the tuple (x, z), where z is additional context information. The output of the full module is then the concatenation of enet(x) and z. In the GW use case, this is used to concatenate the applied time shifts z to the embedded feature vector of the strain data enet(x).
Module specs
- For added_context == False:
input dimension: (batch_size, num_blocks, num_channels, num_bins) output dimension: (batch_size, output_dim)
- For added_context == True:
- input dimension: (batch_size, num_blocks, num_channels, num_bins),
(batch_size, N)
output dimension: (batch_size, output_dim + N)
- param input_dims:
list dimensions of input batch, omitting batch dimension input_dims = (num_blocks, num_channels, num_bins)
- param n_rb:
int number of reduced basis elements used for projection the output dimension of the layer is 2 * n_rb * num_blocks
- param V_rb_list:
tuple of np.arrays, or None tuple with V matrices of the reduced basis SVD projection, convention for SVD matrix decomposition: U @ s @ V^h; if None, layer is not initialized with reduced basis projection, this is useful when loading a saved model
- param output_dim:
int output dimension of the full module
- param hidden_dims:
tuple tuple with dimensions of hidden layers of module 2
- param activation:
str str that specifies activation function used in residual blocks
- param dropout:
float dropout probability for residual blocks, used for regularization
- param batch_norm:
bool flag that specifies whether to use batch normalization
- param added_context:
bool if set to True, additional context z is concatenated to the embedded feature vector enet(x); note that in this case, the expected input is a tuple with 2 elements, input = (x, z) rather than just the tensor x.
- return:
nn.Module
dingo.core.nn.nsf module
Implementation of the neural spline flow (NSF). Most of this code is adapted from the uci.py example from https://github.com/bayesiains/nsf.
- class dingo.core.nn.nsf.FlowWrapper(flow: Flow, embedding_net: Module | None = None)
Bases:
Module
This class wraps the neural spline flow. It is required for multiple reasons. (i) Some embedding networks take tuples as input, which is not supported by the nflows package. (ii) Parallelization across multiple GPUs requires a forward method, but the relevant flow method for training is log_prob.
- Parameters:
flow – flows.base.Flow
embedding_net – nn.Module
- forward(y, *x)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- log_prob(y, *x)
- sample(*x, num_samples=1)
- sample_and_log_prob(*x, num_samples=1)
- dingo.core.nn.nsf.autocomplete_model_kwargs_nsf(model_kwargs, data_sample)
Autocomplete the model kwargs from train_settings and data_sample from the dataloader:
set the input dimension of the embedding net to the shape of data_sample[1]
set the dimension of the nsf parameter space to len(data_sample[0])
set the added_context flag of the embedding net if required for gnpe proxies
set the context dim of the nsf to the output dim of the embedding net + gnpe proxy dim
- Parameters:
train_settings – dict train settings as loaded from .yaml file
data_sample – list Sample from dataloader (e.g., wfd[0]) used for autocompletion. Should be of format [parameters, GW data, gnpe_proxies], where the last element is only present if gnpe proxies are required.
- Returns:
model_kwargs: dict updated, autocompleted model_kwargs
- dingo.core.nn.nsf.create_base_transform(i: int, param_dim: int, context_dim: int | None = None, hidden_dim: int = 512, num_transform_blocks: int = 2, activation: str = 'relu', dropout_probability: float = 0.0, batch_norm: bool = False, num_bins: int = 8, tail_bound: float = 1.0, apply_unconditional_transform: bool = False, base_transform_type: str = 'rq-coupling')
Build a base NSF transform of y, conditioned on x.
This uses the PiecewiseRationalQuadraticCoupling transform or the MaskedPiecewiseRationalQuadraticAutoregressiveTransform, as described in the Neural Spline Flow paper (https://arxiv.org/abs/1906.04032).
Code is adapted from the uci.py example from https://github.com/bayesiains/nsf.
A coupling flow fixes half the components of y, and applies a transform to the remaining components, conditioned on the fixed components. This is a restricted form of an autoregressive transform, with a single split into fixed/transformed components.
The transform here is a neural spline flow, where the flow is parametrized by a residual neural network that depends on y_fixed and x. The residual network consists of a sequence of two-layer fully-connected blocks.
- Parameters:
i – int index of transform in sequence
param_dim – int dimensionality of y
context_dim – int = None dimensionality of x
hidden_dim – int = 512 number of hidden units per layer
num_transform_blocks – int = 2 number of transform blocks comprising the transform
activation – str = ‘relu’ activation function
dropout_probability – float = 0.0 dropout probability for regularization
batch_norm – bool = False whether to use batch normalization
num_bins – int = 8 number of bins for the spline
tail_bound – float = 1.0 tail bound of the spline; the spline transform is applied on the interval [-tail_bound, tail_bound]
apply_unconditional_transform – bool = False whether to apply an unconditional transform to fixed components
base_transform_type – str = ‘rq-coupling’ type of base transform, one of {rq-coupling, rq-autoregressive}
- Returns:
Transform the NSF transform
- dingo.core.nn.nsf.create_linear_transform(param_dim: int)
Create the composite linear transform PLU.
- Parameters:
param_dim – int dimension of the parameter space
- Returns:
nde.Transform the linear transform PLU
- dingo.core.nn.nsf.create_nsf_model(input_dim: int, context_dim: int, num_flow_steps: int, base_transform_kwargs: dict, embedding_net_builder: Callable | str | None = None, embedding_net_kwargs: dict | None = None)
Build NSF model. This models the posterior distribution p(y|x).
- The model consists of
a base distribution (StandardNormal, dim(y))
a sequence of transforms, each conditioned on x
- Parameters:
input_dim – int, dimensionality of y
context_dim – int, dimensionality of the (embedded) context
num_flow_steps – int, number of sequential transforms
base_transform_kwargs – dict, hyperparameters for transform steps
embedding_net_builder – Callable=None, build function for embedding network TODO
embedding_net_kwargs – dict=None, hyperparameters for embedding network
- Returns:
Flow the NSF (posterior model)
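A hedged sketch of building an NSF posterior model with create_nsf_model; it assumes that the keys of base_transform_kwargs correspond to the keyword arguments of create_base_transform above, and the concrete values are illustrative:
from dingo.core.nn.nsf import create_nsf_model

model = create_nsf_model(
    input_dim=15,        # dimensionality of the parameter space y
    context_dim=128,     # dimensionality of the (embedded) context x
    num_flow_steps=10,
    base_transform_kwargs={
        "hidden_dim": 512,
        "num_transform_blocks": 5,
        "activation": "elu",
        "dropout_probability": 0.0,
        "batch_norm": True,
        "num_bins": 8,
        "base_transform_type": "rq-coupling",
    },
)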
- dingo.core.nn.nsf.create_nsf_with_rb_projection_embedding_net(nsf_kwargs: dict, embedding_net_kwargs: dict, initial_weights: dict | None = None)
Builds a neural spline flow with an embedding network that consists of a reduced basis projection followed by a residual network. Optionally initializes the embedding network weights.
- Parameters:
nsf_kwargs (dict) – kwargs for neural spline flow
embedding_net_kwargs (dict) – kwargs for embedding network
initial_weights (dict) – Dictionary containing the initial weights for the SVD projection. This should have one key ‘V_rb_list’, with value a list of SVD V matrices (one for each detector).
- Returns:
Neural spline flow model
- Return type:
nn.Module
- dingo.core.nn.nsf.create_nsf_wrapped(**kwargs)
Wraps the NSF model in a FlowWrapper. This is required for parallel training, and wraps the log_prob method as a forward method.
- dingo.core.nn.nsf.create_transform(num_flow_steps: int, param_dim: int, context_dim: int, base_transform_kwargs: dict)
Build a sequence of NSF transforms, which maps parameters y into the base distribution u (noise). Transforms are conditioned on context data x.
Note that the forward map is f^{-1}(y, x).
- Each step in the sequence consists of
A linear transform of y, which in particular permutes components
A NSF transform of y, conditioned on x.
There is one final linear transform at the end.
- Parameters:
num_flow_steps – int, number of transforms in sequence
param_dim – int, dimensionality of parameter space (y)
context_dim – int, dimensionality of context (x)
base_transform_kwargs – dict hyperparameters for the NSF steps
- Returns:
Transform the NSF transform sequence
Module contents
dingo.core.utils package
Submodules
dingo.core.utils.condor_utils module
- dingo.core.utils.condor_utils.copy_logfiles(log_dir, epoch, name='info', suffixes=('.err', '.log', '.out'))
- dingo.core.utils.condor_utils.copyfile(src, dst)
- dingo.core.utils.condor_utils.create_submission_file(train_dir, filename='submission_file.sub')
TODO: documentation (parameters: train_dir, filename).
- dingo.core.utils.condor_utils.create_submission_file_and_submit_job(train_dir, filename='submission_file.sub')
TODO: documentation (parameters: train_dir, filename).
- dingo.core.utils.condor_utils.resubmit_condor_job(train_dir, train_settings, epoch)
TODO: documentation (parameters: train_dir, train_settings, epoch).
dingo.core.utils.gnpeutils module
dingo.core.utils.logging_utils module
- dingo.core.utils.logging_utils.check_directory_exists_and_if_not_mkdir(directory, logger)
Checks whether the given directory exists and creates it if it does not. (Borrowed from bilby-pipe.)
- Parameters:
directory (str) – Name of the directory
- dingo.core.utils.logging_utils.setup_logger(outdir=None, label=None, log_level='INFO')
Set up logging output; call this at the start of the script. (Borrowed from bilby-pipe.)
- Parameters:
outdir (str) – If supplied, write the logging output to outdir/label.log
label (str) – If supplied, write the logging output to outdir/label.log
log_level (str, optional) – [‘debug’, ‘info’, ‘warning’] Either a string from the list above, or an integer as specified in https://docs.python.org/2/library/logging.html#logging-levels
dingo.core.utils.misc module
- dingo.core.utils.misc.get_version()
- dingo.core.utils.misc.recursive_check_dicts_are_equal(dict_a, dict_b)
dingo.core.utils.plotting module
- dingo.core.utils.plotting.plot_corner_multi(samples, weights=None, labels=None, filename='corner.pdf', **kwargs)
Generate a corner plot for multiple posteriors.
- Parameters:
samples (list[pd.DataFrame]) – List of sample sets. The DataFrame column names are used as parameter labels.
weights (list[np.ndarray or None] or None) – List of weights sets. The length of each array should be the same as the length of the corresponding samples.
labels (list[str or None] or None) – Labels for the posteriors.
filename (str) – Where to save samples.
**kwargs – Forwarded to corner.corner.
dingo.core.utils.pt_to_hdf5 module
- dingo.core.utils.pt_to_hdf5.main()
- dingo.core.utils.pt_to_hdf5.parse_args()
dingo.core.utils.torchutils module
- dingo.core.utils.torchutils.build_train_and_test_loaders(dataset: Dataset, train_fraction: float, batch_size: int, num_workers: int)
Split the dataset into train and test sets, and build corresponding DataLoaders. The random split uses a fixed seed for reproducibility.
- Parameters:
dataset (torch.utils.data.Dataset) –
train_fraction (float) – Fraction of dataset to use for training. The remainder is used for testing. Should lie between 0 and 1.
batch_size (int) –
num_workers (int) –
- Return type:
(train_loader, test_loader)
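For example, assuming dataset is a torch-compatible dataset such as a WaveformDataset:
from dingo.core.utils.torchutils import build_train_and_test_loaders

train_loader, test_loader = build_train_and_test_loaders(
    dataset,
    train_fraction=0.95,   # 95% of the samples for training, the rest for testing
    batch_size=4096,
    num_workers=8,
)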
- dingo.core.utils.torchutils.fix_random_seeds(_)
Utility function to set random seeds when using multiple workers for DataLoader.
- dingo.core.utils.torchutils.forward_pass_with_unpacked_tuple(model: Module, x: Tuple | Tensor)
Performs a forward pass of model with input x. If x is a tuple, it returns y = model(*x); otherwise it returns y = model(x).
- Parameters:
model – nn.Module model for forward pass
x – Union[Tuple, torch.Tensor] input for forward pass
- Returns:
torch.Tensor output of the forward pass, either model(*x) or model(x)
- dingo.core.utils.torchutils.get_activation_function_from_string(activation_name: str)
Returns an activation function, based on the name provided.
- Parameters:
activation_name – str name of the activation function, one of {‘elu’, ‘relu’, ‘leaky_relu’}
- Returns:
function corresponding activation function
- dingo.core.utils.torchutils.get_lr(optimizer)
Returns a list with the learning rates of the optimizer.
- dingo.core.utils.torchutils.get_number_of_model_parameters(model: Module, requires_grad_flags: tuple = (True, False))
Counts the parameters of the module. The tuple requires_grad_flags can be used to specify whether all parameters should be counted, or only those with requires_grad = True or False.
- Parameters:
model – nn.Module model
requires_grad_flags – tuple of bools, for requested requires_grad flags
- Returns:
number of parameters of the model with the requested requires_grad flags
- dingo.core.utils.torchutils.get_optimizer_from_kwargs(model_parameters: Iterable, **optimizer_kwargs)
Builds and returns an optimizer for model_parameters. The type of the optimizer is determined by kwarg type, the remaining kwargs are passed to the optimizer.
- Parameters:
model_parameters (Iterable) – iterable of parameters to optimize or dicts defining parameter groups
optimizer_kwargs – kwargs for optimizer; type needs to be one of [adagrad, adam, adamw, lbfgs, RMSprop, sgd], the remaining kwargs are used for specific optimizer kwargs, such as learning rate and momentum
- Return type:
optimizer
- dingo.core.utils.torchutils.get_scheduler_from_kwargs(optimizer: Optimizer, **scheduler_kwargs)
Builds and returns a scheduler for optimizer. The type of the scheduler is determined by the kwarg type; the remaining kwargs are passed to the scheduler.
- Parameters:
optimizer (torch.optim.optimizer.Optimizer) – optimizer for which the scheduler is used
scheduler_kwargs – kwargs for scheduler; type needs to be one of [step, cosine, reduce_on_plateau], the remaining kwargs are used for specific scheduler kwargs, such as learning rate and momentum
- Return type:
scheduler
- dingo.core.utils.torchutils.perform_scheduler_step(scheduler, loss=None)
Wrapper for scheduler.step(). If scheduler is ReduceLROnPlateau, then scheduler.step(loss) is called, if not, scheduler.step().
- Parameters:
scheduler – scheduler for learning rate
loss – validation loss
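A short sketch combining the optimizer and scheduler utilities above; model and validation_loss are assumed to exist, and the specific kwargs are illustrative (they are forwarded to the corresponding torch optimizer/scheduler):
from dingo.core.utils.torchutils import (
    get_optimizer_from_kwargs,
    get_scheduler_from_kwargs,
    perform_scheduler_step,
)

optimizer = get_optimizer_from_kwargs(model.parameters(), type="adam", lr=1e-4)
scheduler = get_scheduler_from_kwargs(
    optimizer, type="reduce_on_plateau", factor=0.5, patience=10
)

# At the end of each epoch; the loss is only used for ReduceLROnPlateau.
perform_scheduler_step(scheduler, loss=validation_loss)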
- dingo.core.utils.torchutils.set_requires_grad_flag(model, name_startswith=None, name_contains=None, requires_grad=True)
Set param.requires_grad of all model parameters with a name starting with name_startswith, or name containing name_contains, to requires_grad.
- dingo.core.utils.torchutils.split_dataset_into_train_and_test(dataset, train_fraction)
Splits dataset into a trainset of size int(train_fraction * len(dataset)), and a testset with the remainder. Uses fixed random seed for reproducibility.
- Parameters:
dataset (torch.utils.data.Dataset) – dataset to be split
train_fraction (float) – fraction of the dataset to be used for trainset
- Return type:
trainset, testset
- dingo.core.utils.torchutils.torch_detach_to_cpu(x)
dingo.core.utils.trainutils module
- class dingo.core.utils.trainutils.LossInfo(epoch, len_dataset, batch_size, mode='Train', print_freq=1)
Bases:
object
- get_avg()
- print_info(batch_idx)
- update(loss, n)
- update_timer(timer_mode='Dataloader')
- class dingo.core.utils.trainutils.RuntimeLimits(max_time_per_run: float | None = None, max_epochs_per_run: int | None = None, max_epochs_total: int | None = None, epoch_start: int | None = None)
Bases:
object
Keeps track of the runtime limits (time limit, epoch limit, max. number of epochs for model).
- Parameters:
max_time_per_run (float = None) – maximum time for run, in seconds [soft limit, break only after full epoch]
max_epochs_per_run (int = None) – maximum number of epochs for run
max_epochs_total (int = None) – maximum total number of epochs for model
epoch_start (int = None) – start epoch of run
- limits_exceeded(epoch: int | None = None)
Check whether any of the runtime limits are exceeded.
- Parameters:
epoch (int = None) –
- Returns:
limits_exceeded – flag whether runtime limits are exceeded and run should be stopped; if limits_exceeded = True, this prints a message for the reason
- Return type:
bool
- local_limits_exceeded(epoch: int | None = None)
Check whether any of the local runtime limits are exceeded. Local runtime limits include max_epochs_per_run and max_time_per_run, but not max_epochs_total.
- Parameters:
epoch (int = None) –
- Returns:
limits_exceeded – flag whether local runtime limits are exceeded
- Return type:
bool
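A minimal sketch of using RuntimeLimits in a training loop (the epoch bookkeeping is illustrative):
from dingo.core.utils.trainutils import RuntimeLimits

limits = RuntimeLimits(max_time_per_run=6 * 3600, max_epochs_per_run=100, epoch_start=0)

epoch = 0
while not limits.limits_exceeded(epoch):
    epoch += 1
    # ... train and test one epoch, write history, save checkpoints ...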
- dingo.core.utils.trainutils.copyfile(src, dst)
Copy src to dst.
- dingo.core.utils.trainutils.save_model(pm, log_dir, model_prefix='model', checkpoint_epochs=None)
Save the model to <model_prefix>_latest.pt in log_dir. Additionally, every checkpoint_epochs epochs a permanent checkpoint is saved.
- Parameters:
pm – model to be saved
log_dir (str) – log directory, where model is saved
model_prefix (str = 'model') – prefix for the name of the saved model
checkpoint_epochs (int = None) – number of steps between two consecutive model checkpoints
- dingo.core.utils.trainutils.write_history(log_dir, epoch, train_loss, test_loss, learning_rates, aux=None, filename='history.txt')
Writes losses and learning rate history to csv file.
- Parameters:
log_dir (str) – directory containing the history file
epoch (int) – epoch
train_loss (float) – train_loss of epoch
test_loss (float) – test_loss of epoch
learning_rates (list) – list of learning rates in epoch
aux (list = []) – list of auxiliary information to be logged
filename (str = 'history.txt') – name of history file
Module contents
Submodules
dingo.core.dataset module
- class dingo.core.dataset.DingoDataset(file_name=None, dictionary=None, data_keys=None)
Bases:
object
This is a generic dataset class with save / load methods.
A common use case is to inherit from both DingoDataset and torch.utils.data.Dataset, in which case the subclass picks up these I/O methods and DingoDataset acts as a mixin class.
Alternatively, if the torch Dataset is not needed, then DingoDataset can be subclassed directly.
For constructing, provide either file_name, or dictionary containing data and settings entries, or neither.
- Parameters:
file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The data keys should be the same as save_keys
data_keys (list) – Variables that should be saved / loaded. This allows for class to store additional variables beyond those that are saved. Typically, this list would be provided by any subclass.
- dataset_type = 'dingo_dataset'
- from_dictionary(dictionary: dict)
- from_file(file_name)
- to_dictionary()
- to_file(file_name, mode='w')
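A hypothetical subclass sketch illustrating the save / load pattern; the class name, the "samples" data key, the samples variable, and the file name are made up for illustration:
from dingo.core.dataset import DingoDataset

class MyDataset(DingoDataset):
    dataset_type = "my_dataset"   # hypothetical

    def __init__(self, file_name=None, dictionary=None):
        super().__init__(
            file_name=file_name,
            dictionary=dictionary,
            data_keys=["samples"],   # variables handled by save / load
        )

dataset = MyDataset(dictionary={"samples": samples, "settings": {"note": "example"}})
dataset.to_file("my_dataset.hdf5")
reloaded = MyDataset(file_name="my_dataset.hdf5")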
- dingo.core.dataset.recursive_hdf5_load(group, keys=None)
- dingo.core.dataset.recursive_hdf5_save(group, d)
dingo.core.likelihood module
- class dingo.core.likelihood.Likelihood
Bases:
object
- log_likelihood(theta)
- log_likelihood_multi(theta: DataFrame, num_processes: int = 1) ndarray
Calculate the log likelihood at multiple points in parameter space. Works with multiprocessing.
This wraps the log_likelihood() method.
- Parameters:
theta (pd.DataFrame) – Parameter values at which to evaluate the likelihood.
num_processes (int) – Number of processes to use.
- Return type:
np.array of log likelihoods
dingo.core.multiprocessing module
- dingo.core.multiprocessing.apply_func_with_multiprocessing(func: callable, theta: DataFrame, num_processes: int = 1) ndarray
Call func(theta.iloc[idx].to_dict()) with multiprocessing.
- Parameters:
func (callable) –
theta (pd.DataFrame) – Parameters with multiple rows, evaluate func for each row.
num_processes (int) – Number of parallel processes to use.
- Returns:
result – Output array, where result[idx] = func(theta.iloc[idx].to_dict())
- Return type:
np.ndarray
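For illustration, a sketch with a toy function (the typical use case is evaluating an expensive likelihood for each row of a parameter DataFrame):
import pandas as pd
from dingo.core.multiprocessing import apply_func_with_multiprocessing

def toy_func(p: dict) -> float:
    # Toy stand-in for an expensive per-sample computation.
    return p["mass_1"] + p["mass_2"]

theta = pd.DataFrame({"mass_1": [30.0, 35.0], "mass_2": [25.0, 29.0]})
result = apply_func_with_multiprocessing(toy_func, theta, num_processes=2)
# result[idx] == toy_func(theta.iloc[idx].to_dict())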
dingo.core.result module
- class dingo.core.result.Result(file_name=None, dictionary=None)
Bases:
DingoDataset
A dataset class to hold a collection of samples, implementing I/O, importance sampling, and unconditional flow training.
- Attributes:
- samples : pd.DataFrame
Contains parameter samples, as well as (possibly) log_prob, log_likelihood, weights, log_prior, delta_log_prob_target.
- domain : Domain
Should be implemented in a subclass.
- prior : PriorDict
Should be implemented in a subclass.
- likelihood : Likelihood
Should be implemented in a subclass.
- context : dict
Context data from which the samples were produced (e.g., strain data, ASDs).
- metadata : dict
- event_metadata : dict
- log_evidence : float
- log_evidence_std : float (property)
- effective_sample_size, n_eff : float (property)
- sample_efficiency : float (property)
For constructing, provide either file_name, or dictionary containing data and settings entries, or neither.
- Parameters:
file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The data keys should be the same as save_keys
data_keys (list) – Variables that should be saved / loaded. This allows for class to store additional variables beyond those that are saved. Typically, this list would be provided by any subclass.
- property base_metadata
- property constraint_parameter_keys
- dataset_type = 'core_result'
- property effective_sample_size
- property fixed_parameter_keys
- importance_sample(num_processes: int = 1, **likelihood_kwargs)
Calculate importance weights for samples.
Importance sampling starts with samples that have been generated from a proposal distribution q(theta), in this case a neural network model. Certain networks (i.e., non-GNPE) also provide the log probability of each sample, which is required for importance sampling.
Given the proposal, we re-weight samples according to the (un-normalized) target distribution, which we take to be the likelihood L(theta) times the prior pi(theta). This gives sample weights
w(theta) ~ pi(theta) L(theta) / q(theta),
where the overall normalization does not matter (we take the weights to have mean 1). Since q(theta) enters this expression, importance sampling is only possible when we know the log probability of each sample.
As byproducts, this method also estimates the evidence and effective sample size of the importance sampled points.
This method modifies the samples pd.DataFrame in-place, adding new columns for log_likelihood, log_prior, and weights. It also stores the log_evidence as an attribute.
- Parameters:
num_processes (int) – Number of parallel processes to use when calculating likelihoods. (This is the most expensive task.)
likelihood_kwargs (dict) – kwargs that are forwarded to the likelihood constructor. E.g., options for marginalization.
- property injection_parameters
- property log_bayes_factor
- property log_evidence_std
- classmethod merge(parts)
Merge several Result instances into one. Check that they are compatible, in the sense of having the same metadata. Finally, calculate a new log evidence for the combined result.
This is useful when recombining separate importance sampling jobs.
- Parameters:
parts (list[Result]) – List of sub-Results to be combined.
- Return type:
Combined Result.
- property metadata
- property n_eff
- property num_samples
- parameter_subset(parameters)
Return a new object of the same type, with only a subset of parameters. Drops all other columns in samples DataFrame as well (e.g., log_prob, weights).
- Parameters:
parameters (list) – List of parameters to keep.
- Return type:
- plot_corner(parameters=None, filename='corner.pdf')
Generate a corner plot of the samples.
- Parameters:
parameters (list[str]) – List of parameters to include. If None, include all parameters. (Default: None)
filename (str) – Where to save samples.
- plot_log_probs(filename='log_probs.png')
Make a scatter plot of the target versus proposal log probabilities. For the target, subtract off the log evidence.
- plot_weights(filename='weights.png')
Make a scatter plot of the sample weights vs. the proposal log probability.
- print_summary()
Display the number of samples, and (if importance sampling is complete) the log evidence and number of effective samples.
- reset_event(event_dataset)
Set the Result context and event_metadata based on an EventDataset.
If these attributes already exist, perform a comparison to check for changes. Update relevant objects appropriately. Note that setting context and event_metadata attributes directly would not perform these additional checks and updates.
- Parameters:
event_dataset (EventDataset) – New event to be used for importance sampling.
- property sample_efficiency
- sampling_importance_resampling(num_samples=None, random_state=None)
Generate unweighted posterior samples from weighted ones. New samples are sampled with probability proportional to the sample weight. Resampling is done with replacement, until the desired number of unweighted samples is obtained.
- Parameters:
num_samples (int) – Number of samples to resample.
random_state (int or None) – Sampling seed.
- Returns:
Unweighted samples
- Return type:
pd.Dataframe
- property search_parameter_keys
- split(num_parts)
Split the Result into a set of smaller results. The samples are evenly divided among the sub-results. Additional information (metadata, context, etc.) are copied into each.
This is useful for splitting expensive tasks such as importance sampling across multiple jobs.
- Parameters:
num_parts (int) – The number of parts to split the Result across.
- Return type:
list of sub-Results.
- train_unconditional_flow(parameters, nde_settings: dict, train_dir: str | None = None, threshold_std: float | None = inf)
Train an unconditional flow to represent the distribution of self.samples.
- Parameters:
parameters (list) – List of parameters over which to train the flow. Can be a subset of the existing parameters.
nde_settings (dict) – Configuration settings for the neural density estimator.
train_dir (Optional[str]) – Where to save the output of network training, e.g., logs, checkpoints. If not provided, a temporary directory is used.
threshold_std (Optional[float]) – Drop samples more than threshold_std standard deviations away from the mean (in any parameter) before training the flow. This is meant to remove outlier samples.
- Return type:
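The methods above can be combined as in the following sketch; it assumes result is a Result whose samples include log_prob and whose metadata and context allow the likelihood to be constructed (e.g., a result loaded from file or exported from a sampler):
from dingo.core.result import Result

# Split the result, importance sample each part (normally in separate jobs), and merge.
parts = result.split(num_parts=4)
for part in parts:
    part.importance_sample(num_processes=8)
combined = Result.merge(parts)   # recomputes the log evidence of the combined samples

combined.print_summary()         # log evidence and number of effective samples
unweighted = combined.sampling_importance_resampling(num_samples=10_000, random_state=42)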
- dingo.core.result.check_equal_dict_of_arrays(a, b)
- dingo.core.result.freeze(d)
dingo.core.samplers module
- class dingo.core.samplers.GNPESampler(model: PosteriorModel, init_sampler: Sampler, num_iterations: int = 1)
Bases:
Sampler
Base class for GNPE sampler. It wraps a PosteriorModel and a standard Sampler for initialization. The former is used to generate initial samples for Gibbs sampling.
A GNPE network is conditioned on additional “proxy” context theta^, i.e.,
p(theta | theta^, d)
The theta^ depend on theta via a fixed kernel p(theta^ | theta). Combining these known distributions, this class uses Gibbs sampling to draw samples from the joint distribution,
p(theta, theta^ | d)
The advantage of this approach is that we are allowed to perform any transformation of d that depends on theta^. In particular, we can use this freedom to simplify the data, e.g., by aligning data to have merger times = 0 in each detector. The merger times are unknown quantities that must be inferred jointly with all other parameters, and GNPE provides a means to do this iteratively. See https://arxiv.org/abs/2111.13139 for additional details.
Gibbs sampling breaks access to the probability density, so this must be recovered through other means. One way is to train an unconditional flow to represent p(theta^ | d) for fixed d based on the samples produced through the GNPE Gibbs sampling. Starting from these, a single Gibbs iteration gives theta from the GNPE network, along with the probability density in the joint space. This is implemented in GNPESampler provided the init_sampler provides proxies directly and num_iterations = 1.
Attributes (beyond those of Sampler)
- init_sampler : Sampler
Used for providing initial samples for Gibbs sampling.
- num_iterations : int
Number of Gibbs iterations to perform.
- iteration_tracker : IterationTracker
not set up
- remove_init_outliers : float
not set up
- param model:
- type model:
PosteriorModel
- param init_sampler:
Used for generating initial samples
- type init_sampler:
Sampler
- param num_iterations:
Number of GNPE iterations to be performed by sampler.
- type num_iterations:
int
- property gnpe_proxy_parameters
- property init_sampler
- property num_iterations
The number of GNPE iterations to perform when sampling.
- class dingo.core.samplers.Sampler(model: PosteriorModel)
Bases:
object
Sampler class that wraps a PosteriorModel. Allows for conditional and unconditional models.
Draws samples from the model based on (optional) context data.
This is intended for use either as a standalone sampler, or as a sampler producing initial sample points for a GNPE sampler.
- run_sampler()
- log_prob()
- to_result()
- to_hdf5()
- model
- Type:
- inference_parameters
- Type:
list
- samples
Samples produced from the model by run_sampler().
- Type:
DataFrame
- context
- Type:
dict
- metadata
- Type:
dict
- event_metadata
- Type:
dict
- unconditional_model
Whether the model is unconditional, in which case it is not provided context information.
- Type:
bool
- transform_pre, transform_post
Transforms to be applied to data and parameters during inference. These are typically implemented in a subclass.
- Type:
Transform
- Parameters:
model (PosteriorModel) –
- property context
Data on which to condition the sampler. For injections, there should be a ‘parameters’ key with truth values.
- property event_metadata
Metadata for the data analyzed. Can in principle influence any post-sampling parameter transformations (e.g., sky position correction), as well as the detector positions used in the likelihood.
- log_prob(samples: DataFrame) ndarray
Calculate the model log probability at specific sample points.
- Parameters:
samples (pd.DataFrame) – Sample points at which to calculate the log probability.
- Return type:
np.array of log probabilities.
- run_sampler(num_samples: int, batch_size: int | None = None)
Generates samples and stores them in self.samples. Conditions the model on self.context if appropriate (i.e., if the model is not unconditional).
If possible, it also calculates the log_prob and saves it as a column in self.samples. When using GNPE it is not possible to obtain the log_prob due to the many Gibbs iterations. However, in the case of just one iteration, and when starting from a sampler for the proxy, the GNPESampler does calculate the log_prob.
Allows for batched sampling, e.g., if limited by GPU memory. Actual sampling for each batch is performed by _run_sampler(), which will differ for Sampler and GNPESampler.
- Parameters:
num_samples (int) – Number of samples requested.
batch_size (int, optional) – Batch size for sampler.
- to_hdf5(label='result', outdir='.')
- to_result() Result
Export samples, metadata, and context information to a Result instance, which can be used for saving or, e.g., importance sampling, training an unconditional flow, etc.
- Return type:
- write_pesummary(filename)
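A hedged end-to-end sketch of the Sampler interface; in practice one of the GW subclasses is used (e.g., GWSampler, documented under dingo.gw.inference.gw_samplers below), and model and event_data are assumed to be a trained PosteriorModel and a context dict in the format expected by the transforms:
from dingo.gw.inference.gw_samplers import GWSampler

sampler = GWSampler(model=model)   # model: trained PosteriorModel (assumed)
sampler.context = event_data       # conditioning data, e.g. strain and ASDs (assumed)
sampler.run_sampler(num_samples=50_000, batch_size=10_000)

result = sampler.to_result()
result.to_file("result.hdf5")      # or sampler.to_hdf5(label="result", outdir=".")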
dingo.core.transforms module
- class dingo.core.transforms.GetItem(key)
Bases:
object
- class dingo.core.transforms.RenameKey(old, new)
Bases:
object
Module contents
dingo.gw package
Subpackages
dingo.gw.conversion package
Submodules
dingo.gw.conversion.spin_conversion module
- dingo.gw.conversion.spin_conversion.cartesian_spins(p, f_ref)
Transform PE spins to cartesian spins.
- Parameters:
p (dict) – contains parameters, including PE spins
f_ref (float) – reference frequency for definition of spins
- Returns:
result – parameters, including cartesian spins
- Return type:
dict
- dingo.gw.conversion.spin_conversion.change_spin_conversion_phase(samples, f_ref, sc_phase_old, sc_phase_new)
Change the phase used to convert cartesian spins to PE spins. The cartesian spins are independent of the spin conversion phase. When converting from cartesian spins to PE spins, the phase value has an impact on theta_jn and phi_jl.
The usual convention for the PE spins is to use the phase parameter for the conversion (cart. spins <-> PE spins), but for dingo-IS with the synthetic phase extension we need to use another convention, where the PE spins are defined with spin conversion phase 0. This function transforms between the different conventions.
- Parameters:
samples (pd.Dataframe) – Parameters.
f_ref (float) – Reference frequency for definition of spins.
sc_phase_old (float or None) – Spin conversion phase used for input parameters. If None, use the phase parameter.
sc_phase_new (float or None) – Spin conversion phase used for output parameters. If None, use the phase parameter.
- Returns:
parameters with changed spin conversion phase
- Return type:
p_new
- dingo.gw.conversion.spin_conversion.component_masses(p)
- dingo.gw.conversion.spin_conversion.pe_spins(p, f_ref)
Transform cartesian spins to PE spins.
- Parameters:
p (dict) – contains parameters, including cartesian spins
f_ref (float) – reference frequency for definition of spins
- Returns:
result – parameters, including PE spins
- Return type:
dict
Module contents
dingo.gw.data package
Submodules
dingo.gw.data.data_download module
- dingo.gw.data.data_download.download_psd(det, time_start, time_psd, window, f_s)
Download strain data and generate a PSD based on these. Use num_segments of length time_segment, starting at GPS time time_start.
- Parameters:
det (str) – detector
time_start (float) – start GPS time for PSD estimation
time_psd (float = 1024) – time in seconds for strain used for PSD generation
window (Union(np.ndarray, dict)) – Window used for PSD generation, needs to be the same as used for the Fourier transform of the event strain data. If provided as a dict, the window is generated by window = dingo.gw.gwutils.get_window(**window).
f_s (float) – sampling rate of strain data
- Returns:
psd – array of psd
- Return type:
np.array
- dingo.gw.data.data_download.download_raw_data(time_event, time_segment, time_psd, time_buffer, detectors, window, f_s)
dingo.gw.data.data_preparation module
- dingo.gw.data.data_preparation.data_to_domain(raw_data, settings_raw_data, domain, **kwargs)
- Parameters:
raw_data –
settings_raw_data –
model_metadata –
- Returns:
data – dict with domain_data
- Return type:
dict
- dingo.gw.data.data_preparation.get_event_data_and_domain(model_metadata, time_event, time_psd, time_buffer, event_dataset=None)
- dingo.gw.data.data_preparation.load_raw_data(time_event, settings, event_dataset=None)
Load raw event data.
If event_dataset is provided and the event data is saved in it, the data is loaded and returned.
Otherwise, the event data is downloaded; if event_dataset is provided, the downloaded data is additionally saved to that file.
- Parameters:
time_event (float) – GPS time of the event
settings (dict) – dict with the settings
event_dataset (str) – name of the event dataset file
- dingo.gw.data.data_preparation.parse_settings_for_raw_data(model_metadata, time_psd, time_buffer)
dingo.gw.data.event_dataset module
- class dingo.gw.data.event_dataset.EventDataset(file_name=None, dictionary=None)
Bases:
DingoDataset
Dataset class for storing single event.
For constructing, provide either file_name, or dictionary containing data and settings entries, or neither.
- Parameters:
file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The data keys should be the same as save_keys
data_keys (list) – Variables that should be saved / loaded. This allows for class to store additional variables beyond those that are saved. Typically, this list would be provided by any subclass.
- dataset_type = 'event_dataset'
Module contents
dingo.gw.dataset package
Submodules
dingo.gw.dataset.generate_dataset module
- dingo.gw.dataset.generate_dataset.generate_dataset(settings: Dict, num_processes: int) WaveformDataset
Generate a waveform dataset.
- Parameters:
settings (dict) – Dictionary of settings to configure the dataset
num_processes (int) –
- Return type:
A WaveformDataset based on the settings.
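Schematically, assuming settings is the dictionary loaded from a waveform-dataset settings file (the file names are illustrative):
import yaml
from dingo.gw.dataset.generate_dataset import generate_dataset

with open("waveform_dataset_settings.yaml", "r") as f:
    settings = yaml.safe_load(f)

wfd = generate_dataset(settings, num_processes=16)
wfd.to_file("waveform_dataset.hdf5")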
- dingo.gw.dataset.generate_dataset.generate_parameters_and_polarizations(waveform_generator: WaveformGenerator, prior: BBHPriorDict, num_samples: int, num_processes: int) Tuple[DataFrame, Dict[str, ndarray]]
Generate a dataset of waveforms based on parameters drawn from the prior.
- Parameters:
waveform_generator (WaveformGenerator) –
prior (Prior) –
num_samples (int) –
num_processes (int) –
- Returns:
pandas DataFrame of parameters
dictionary of numpy arrays corresponding to waveform polarizations
- dingo.gw.dataset.generate_dataset.main()
- dingo.gw.dataset.generate_dataset.parse_args()
- dingo.gw.dataset.generate_dataset.train_svd_basis(dataset: WaveformDataset, size: int, n_train: int)
Train (and optionally validate) an SVD basis.
- Parameters:
dataset (WaveformDataset) – Contains waveforms to be used for building SVD.
size (int) – Number of elements to keep for the SVD basis.
n_train (int) – Number of training waveforms to use. The remainder are used for validation. Note that the actual number of training waveforms is n_train * len(polarizations), since there is one waveform used for each polarization.
- Returns:
Since EOB waveforms can fail to generate, the numbers actually used for training and validation are also returned.
- Return type:
SVDBasis, n_train, n_test
dingo.gw.dataset.generate_dataset_dag module
- dingo.gw.dataset.generate_dataset_dag.configure_runs(settings, num_jobs, temp_dir)
Prepare and save settings .yaml files for generating subsets of the dataset. Generally this will produce two .yaml files, one for generating the main dataset, one for the SVD training.
- Parameters:
settings (dict) – Settings for full dataset configuration.
num_jobs (int) – Number of jobs over which to split the run.
temp_dir (str) – Name of (temporary) directory in which to place temporary output files.
- dingo.gw.dataset.generate_dataset_dag.create_args_string(args_dict: Dict)
Generate argument string from dictionary of argument names and arguments.
- dingo.gw.dataset.generate_dataset_dag.create_dag(args, settings)
Create a Condor DAG from command line arguments to carry out the five steps in the workflow.
- dingo.gw.dataset.generate_dataset_dag.main()
- dingo.gw.dataset.generate_dataset_dag.modulus_check(a: int, b: int, a_label: str, b_label: str)
Raise error if a % b != 0.
- dingo.gw.dataset.generate_dataset_dag.parse_args()
dingo.gw.dataset.utils module
- dingo.gw.dataset.utils.build_svd_cli()
Command-line function to build an SVD based on an uncompressed dataset file.
- dingo.gw.dataset.utils.merge_datasets(dataset_list: List[WaveformDataset]) WaveformDataset
Merge a collection of datasets into one.
- Parameters:
dataset_list (list[WaveformDataset]) – A list of WaveformDatasets. Each item should be a dictionary containing parameters and polarizations.
- Return type:
WaveformDataset containing the merged data.
- dingo.gw.dataset.utils.merge_datasets_cli()
Command-line function to combine a collection of datasets into one. Used for parallelized waveform generation.
dingo.gw.dataset.waveform_dataset module
- class dingo.gw.dataset.waveform_dataset.WaveformDataset(file_name=None, dictionary=None, transform=None, precision=None, domain_update=None, svd_size_update=None)
Bases:
DingoDataset
,Dataset
This class stores a dataset of waveforms (polarizations) and corresponding parameters.
It can load the dataset either from an HDF5 file or suitable dictionary.
Once a waveform data set is in memory, the waveform data are consumed through a __getitem__() call, optionally applying a chain of transformations, which are classes that implement a __call__() method.
For constructing, provide either file_name, or dictionary containing data and settings entries, or neither.
- Parameters:
file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The dictionary keys should be ‘settings’, ‘parameters’, and ‘polarizations’.
transform (Transform) – Transform to be applied to dataset samples when accessed through __getitem__
precision (str ('single', 'double')) – If provided, changes precision of loaded dataset.
domain_update (dict) – If provided, update domain from existing domain using new settings.
svd_size_update (int) – If provided, reduces the SVD size when decompressing (for speed).
- dataset_type = 'waveform_dataset'
- initialize_decompression(svd_size_update: int | None = None)
Sets up decompression transforms. These are applied to the raw dataset before self.transform. E.g., SVD decompression.
- Parameters:
svd_size_update (int) – If provided, reduces the SVD size when decompressing (for speed).
- load_supplemental(domain_update=None, svd_size_update=None)
Method called immediately after loading a dataset.
Creates (and possibly updates) domain, updates dtypes, and initializes any decompression transform. Also zeros data below f_min, and truncates above f_max.
- Parameters:
domain_update (dict) – If provided, update domain from existing domain using new settings.
svd_size_update (int) – If provided, reduces the SVD size when decompressing (for speed).
- parameter_mean_std()
- update_domain(domain_update: dict | None = None)
Update the domain based on new configuration.
The waveform dataset provides waveform polarizations in a particular domain. In Frequency domain, this is [0, domain._f_max]. Furthermore, data is set to 0 below domain._f_min. In practice one may want to train a network based on slightly different domain settings, which corresponds to truncating the likelihood integral.
This method provides functionality for that. It truncates and/or zeroes the dataset to the range specified by the domain, by calling domain.update_data.
- Parameters:
domain_update (dict) – Settings dictionary. Must contain a subset of the keys contained in domain_dict.
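A sketch of loading a saved dataset with updated domain settings; the file name and the f_max value are illustrative, and the domain_update keys must be a subset of domain_dict as stated above:
from dingo.gw.dataset.waveform_dataset import WaveformDataset

wfd = WaveformDataset(
    file_name="waveform_dataset.hdf5",
    precision="single",
    domain_update={"f_max": 512.0},   # illustrative domain truncation
)
sample = wfd[0]   # applies the (optional) transform chain, if one was provided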
Module contents
dingo.gw.importance_sampling package
Submodules
dingo.gw.importance_sampling.diagnostics module
- dingo.gw.importance_sampling.diagnostics.plot_diagnostics(result: Result, outdir, num_processes=1, num_slice_plots=0, n_grid_slice1d=200, n_grid_slice2d=100, params_slice2d=None)
- dingo.gw.importance_sampling.diagnostics.plot_posterior_slice(sampler, theta, theta_range, outname=None, num_processes=1, n_grid=200, parameters=None, normalize_conditionals=False)
- dingo.gw.importance_sampling.diagnostics.plot_posterior_slice2d(sampler, theta, theta_range, n_grid=100, num_processes=1, outname=None)
dingo.gw.importance_sampling.importance_weights module
Step 1: Train unconditional nde
Step 2: Set up likelihood and prior
- dingo.gw.importance_sampling.importance_weights.main()
- dingo.gw.importance_sampling.importance_weights.parse_args()
Module contents
Implements sampling-importance-resampling (sir) for GW posteriors.
dingo.gw.inference package
Submodules
dingo.gw.inference.gw_samplers module
- class dingo.gw.inference.gw_samplers.GWSampler(**kwargs)
Bases:
GWSamplerMixin
,Sampler
Sampler for gravitational-wave inference using neural posterior estimation. Augments the base class by defining transform_pre and transform_post to prepare data for the inference network.
- transform_pre :
Whitens strain.
Repackages strain data and the inverse ASDs (suitably scaled) into a torch tensor.
- transform_post :
Extracts the desired inference parameters from the network output (array-like), de-standardizes them, and repackages them as a dict.
Also mixes in GW functionality for building the domain and correcting the reference time.
Allows for conditional and unconditional models, and draws samples from the model based on (optional) context data.
This is intended for use either as a standalone sampler, or as a sampler producing initial sample points for a GNPE sampler.
- Parameters:
kwargs – Keyword arguments that are forwarded to the superclass.
- class dingo.gw.inference.gw_samplers.GWSamplerGNPE(**kwargs)
Bases:
GWSamplerMixin
,GNPESampler
Gravitational-wave GNPE sampler. It wraps a PosteriorModel and a standard Sampler for initialization. The former is used to generate initial samples for Gibbs sampling.
Compared to the base class, this class implements the required transforms for preparing data and parameters for the network. This includes GNPE transforms, data processing transforms, and standardization/de-standardization of parameters.
A GNPE network is conditioned on additional “proxy” context theta^, i.e.,
p(theta | theta^, d)
The theta^ depend on theta via a fixed kernel p(theta^ | theta). Combining these known distributions, this class uses Gibbs sampling to draw samples from the joint distribution,
p(theta, theta^ | d)
The advantage of this approach is that we are allowed to perform any transformation of d that depends on theta^. In particular, we can use this freedom to simplify the data, e.g., by aligning data to have merger times = 0 in each detector. The merger times are unknown quantities that must be inferred jointly with all other parameters, and GNPE provides a means to do this iteratively. See https://arxiv.org/abs/2111.13139 for additional details.
Gibbs sampling breaks access to the probability density, so this must be recovered through other means. One way is to train an unconditional flow to represent p(theta^ | d) for fixed d based on the samples produced through the GNPE Gibbs sampling. Starting from these, a single Gibbs iteration gives theta from the GNPE network, along with the probability density in the joint space. This is implemented in GNPESampler provided the init_sampler provides proxies directly and num_iterations = 1.
Attributes (beyond those of Sampler)
- init_sampler : Sampler
Used for providing initial samples for Gibbs sampling.
- num_iterations : int
Number of Gibbs iterations to perform.
- iteration_tracker : IterationTracker
not set up
- remove_init_outliers : float
not set up
- param kwargs:
Keyword arguments that are forwarded to the superclass.
- class dingo.gw.inference.gw_samplers.GWSamplerMixin(**kwargs)
Bases:
object
- Mixin class designed to add gravitational wave functionality to Sampler classes:
builder for data domain
correction for fixed detector locations during training (t_ref)
- Parameters:
kwargs – Keyword arguments that are forwarded to the superclass.
dingo.gw.inference.inference_pipeline module
- dingo.gw.inference.inference_pipeline.analyze_event()
- dingo.gw.inference.inference_pipeline.get_event_data(event, args, model, ref=None)
- dingo.gw.inference.inference_pipeline.parse_args()
- dingo.gw.inference.inference_pipeline.prepare_log_prob(sampler, num_samples: int, nde_settings: dict, batch_size: int | None = None, threshold_std: float | None = inf, remove_init_outliers: float | None = 0.0, low_latency_label: str | None = None, outdir: str | None = None)
Prepare gnpe sampling with log_prob. This is required, since in its vanilla form gnpe does not provide the density for its samples.
Specifically, we train an unconditional neural density estimator (nde) for the gnpe proxies. This requires running the gnpe sampler until convergence, and extracting the gnpe proxies after the final gnpe iteration. The nde is trained to match the distribution over gnpe proxies, which provides a way of rapidly sampling (converged!) gnpe proxies and evaluating the log_prob.
After this preparation step, self.run_sampler can leverage self.gnpe_proxy_sampler (which is based on the aforementioned trained nde) to sample gnpe proxies, such that one gnpe iteration is sufficient. The log_prob of the samples in the joint space (inference parameters + gnpe proxies) is then simply given by the sum of the corresponding log_probs (from self.model and self.gnpe_proxy_sampler.model).
- Parameters:
num_samples (int) – number of samples for training of nde
batch_size (int = None) – batch size for sampler
threshold_std (float = np.inf) – gnpe proxies deviating by more than threshold_std standard deviations from the proxy mean (along any axis) are discarded.
low_latency_label (str = None) – File label for low latency samples (= samples used for training nde). If None, these samples are not saved.
outdir (str = None) – Directory in which low latency samples are saved. Needs to be set if low_latency_label is not None.
dingo.gw.inference.visualization module
- dingo.gw.inference.visualization.generate_cornerplot(*sample_sets, filename=None)
- dingo.gw.inference.visualization.load_ref_samples(ref_samples_file, drop_geocent_time=True)
Module contents
dingo.gw.noise package
Subpackages
dingo.gw.noise.synthetic package
Submodules
dingo.gw.noise.synthetic.asd_parameterization module
- dingo.gw.noise.synthetic.asd_parameterization.curve_fit(data, std, delta_f=None)
Fit a Lorentzian to the PSD.
- Parameters:
data (dict) – Dictionary containing the PSD, broadband noise, and frequency grid.
std (float) – Standard deviation of the Gaussian noise.
delta_f (float) – Truncation parameter for Lorentzians. Set to None if non-positive value is passed.
- dingo.gw.noise.synthetic.asd_parameterization.fit_broadband_noise(domain, psd, num_spline_positions, sigma, f_min=20)
Fit a spline to the broadband noise of a PSD.
- Parameters:
domain (Domain) – Domain object containing the frequency grid.
psd (array_like) – PSD to be parameterized.
num_spline_positions (int) – Number of spline positions.
sigma (float) – Standard deviation of the Gaussian noise used for the spline fit.
f_min (float, optional) – position of the first node for the spline fit
- dingo.gw.noise.synthetic.asd_parameterization.fit_spectral(frequencies, psd, broadband_noise, num_spectral_segments, sigma, delta_f)
Fit Lorentzians to the spectral features of a PSD.
- Parameters:
frequencies (array_like) – Frequency grid.
psd (array_like) – PSD to be parameterized.
broadband_noise (array_like) – Broadband noise of the PSD.
num_spectral_segments (int) – Number of spectral segments.
sigma (float) – Standard deviation of the Gaussian noise used for the spline fit.
delta_f (float) – Truncation parameter for Lorentzians. Set to None if non-positive value is passed.
- dingo.gw.noise.synthetic.asd_parameterization.parameterize_asd_dataset(real_dataset, parameterization_settings, num_processes, verbose)
Parameterize a dataset of ASDs using a spline fit to the broadband noise and Lorentzians for the spectral features.
- Parameters:
real_dataset (ASDDataset) – Dataset containing the ASDs to be parameterized.
parameterization_settings (dict) – Dictionary containing the settings for the parameterization.
num_processes (int) – Number of processes to use for parallelization.
verbose (bool) – If True, print progress bars.
- dingo.gw.noise.synthetic.asd_parameterization.parameterize_asds_parallel(asds, domain, parameterization_settings, pool=None, verbose=False)
Helper function to be called for parallel ASD parameterization.
- Parameters:
asds (array_like) – Array containing the ASDs to be parameterized.
domain (Domain) – Domain object containing the frequency grid.
parameterization_settings (dict) – Dictionary containing the settings for the parameterization.
pool (Pool, optional) – Pool object for parallelization. If None, the function is not parallelized.
verbose (bool) – If True, print progress bars.
- dingo.gw.noise.synthetic.asd_parameterization.parameterize_single_psd(real_psd, domain, parameterization_settings)
Parameterize a single ASD using a spline fit to the broadband noise and Lorentzians for the spectral features.
- Parameters:
real_psd (array_like) – PSD to be parameterized.
domain (Domain) – Domain object containing the frequency grid.
parameterization_settings (dict) – Dictionary containing the settings for the parameterization.
dingo.gw.noise.synthetic.asd_sampling module
- class dingo.gw.noise.synthetic.asd_sampling.KDE(parameter_dict, sampling_settings)
Bases:
object
Kernel Density Estimation (KDE) class for sampling ASDs.
- Parameters:
parameter_dict (dict) – Dictionary containing the parameters of the ASDs used for fitting the synthetic distribution.
sampling_settings (dict) – Dictionary containing the settings for the sampling.
- fit(weights=None)
Fit the KDEs to the parameters saved in ‘self.parameter_dict’.
- Parameters:
weights (array_like, optional) – Weights for the KDEs. If None, all weights are set to 1.
- dingo.gw.noise.synthetic.asd_sampling.get_rescaling_params(filenames, parameterization_settings)
Get the parameters of the ASDs that are used for rescaling.
- Parameters:
filenames (dict) – Dictionary containing the paths to the ASD files.
parameterization_settings (dict) – Dictionary containing the settings for the parameterization.
dingo.gw.noise.synthetic.generate_dataset module
- dingo.gw.noise.synthetic.generate_dataset.generate_dataset(real_dataset, settings: Dict, num_samples, num_processes: int, verbose: bool)
Generate a synthetic ASD dataset from an existing dataset of real ASDs.
- Parameters:
real_dataset (ASDDataset) – Existing dataset of real ASDs.
settings (dict) – Dictionary containing the settings for the parameterization and sampling.
num_processes (int) – Number of processes to use in pool for parallel parameterization.
verbose (bool) – Whether to print progress information.
- dingo.gw.noise.synthetic.generate_dataset.main()
- dingo.gw.noise.synthetic.generate_dataset.parse_args()
dingo.gw.noise.synthetic.utils module
- dingo.gw.noise.synthetic.utils.get_index_for_elem(arr, elem)
- dingo.gw.noise.synthetic.utils.lorentzian_eval(x, f0, A, Q, delta_f=None)
Evaluates a Lorentzian function at the given frequencies.
- Parameters:
x (array_like) – Frequencies at which the Lorentzian is evaluated.
f0 (float) – Center frequency of the Lorentzian.
A (float) – Amplitude of the Lorentzian.
Q (float) – Parameter determining the width of the Lorentzian.
delta_f (float, optional) – If given, the Lorentzian is truncated.
- Return type:
array_like
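As a quick illustration, the following sketch evaluates a single Lorentzian spectral line on a uniform frequency grid; the values for f0, A, and Q are purely illustrative.
import numpy as np
from dingo.gw.noise.synthetic.utils import lorentzian_eval

# Illustrative values: a narrow spectral line at 60 Hz on a uniform frequency grid.
frequencies = np.linspace(20.0, 1024.0, 4096)
line = lorentzian_eval(frequencies, f0=60.0, A=1e-44, Q=100.0)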
- dingo.gw.noise.synthetic.utils.reconstruct_psds_from_parameters(parameters_dict, domain, parameterization_settings)
Reconstructs the PSDs from the parameters.
- Parameters:
parameters_dict (dict) – Dictionary containing the parameters of the PSDs.
domain (dingo.gw.noise.domain.Domain) – Domain object containing the frequencies at which the PSDs are evaluated.
parameterization_settings (dict) – Dictionary containing the settings for the parameterization.
- Return type:
array_like
Module contents
Submodules
dingo.gw.noise.asd_dataset module
- class dingo.gw.noise.asd_dataset.ASDDataset(file_name=None, dictionary=None, ifos=None, precision=None, domain_update=None)
Bases:
DingoDataset
Dataset of amplitude spectral densities (ASDs). The ASDs are typically used for whitening strain data, and additionally passed as context to the neural density estimator.
- Parameters:
file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The dictionary keys should be ‘settings’, ‘asds’, and ‘gps_times’.
ifos (List[str]) – List of detectors used for dataset, e.g. [‘H1’, ‘L1’]. If not set, all available ones in the dataset are used.
precision (str ('single', 'double')) – If provided, changes precision of loaded dataset.
domain_update (dict) – If provided, update domain from existing domain using new settings.
- dataset_type = 'asd_dataset'
- property gps_info
Min/Max GPS time for each detector.
- property length_info
The number of asd samples per detector.
- sample_random_asds()
Sample a random ASD for each detector.
- Return type:
Dict with a random ASD from the dataset for each detector.
- update_domain(domain_update)
Update the domain based on new configuration. Also adjust data arrays to match the new domain.
The ASD dataset provides ASDs in a particular domain. In the frequency domain, this is [0, domain._f_max]. In practice one may want to train a network based on slightly different domain settings, which corresponds to truncating the likelihood integral.
This method provides functionality for that. It truncates the data below a new f_max, and sets the ASD below f_min to a large but finite value.
- Parameters:
domain_update (dict) – Settings dictionary. Must contain a subset of the keys contained in domain_dict.
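A minimal usage sketch, assuming an existing ASD dataset file asd_dataset.hdf5 (the file name is hypothetical):
from dingo.gw.noise.asd_dataset import ASDDataset

asd_dataset = ASDDataset(file_name="asd_dataset.hdf5", ifos=["H1", "L1"])
asds = asd_dataset.sample_random_asds()  # dict with one random ASD array per detector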
dingo.gw.noise.asd_estimation module
- dingo.gw.noise.asd_estimation.download_and_estimate_cli()
Command-line function to download strain data and estimate PSDs based on the data. Used for parallelized ASD dataset generation.
- dingo.gw.noise.asd_estimation.download_and_estimate_psds(data_dir: str, settings: dict, time_segments: dict, verbose=False)
Downloads strain data for the specified time segments and estimates PSDs based on these segments.
- Parameters:
data_dir (str) – Path to the directory where the PSD dataset will be stored
settings (dict) – Settings that determine the segments
time_segments (dict) – Dictionary specifying the time segments used for downloading the data
verbose (bool) – Optional parameter determining whether progress should be printed
- Return type:
A dictionary containing the paths to the dataset files
dingo.gw.noise.generate_dataset module
- dingo.gw.noise.generate_dataset.generate_dataset()
Creates and saves an ASD dataset
- dingo.gw.noise.generate_dataset.parse_args()
dingo.gw.noise.generate_dataset_dag module
- dingo.gw.noise.generate_dataset_dag.create_args_string(args_dict: Dict)
Generate argument string from dictionary of argument names and arguments.
- dingo.gw.noise.generate_dataset_dag.create_dag(data_dir, settings_file, time_segments, out_name)
Create a Condor DAG to (a) download data and estimate individual PSDs, and (b) merge them into one dataset
- Parameters:
data_dir (str) – Path to the directory where the PSD dataset will be stored
settings_file (str) – Path to the settings file relevant for PSD generation
time_segments (dict) – Contains all time segments used for estimating PSDs
out_name (str) – Path where the resulting ASD dataset should be stored
- Return type:
Condor DAG
- dingo.gw.noise.generate_dataset_dag.split_time_segments(time_segments, condor_dir, num_jobs)
Split up all time segments used for estimating PSDs into num_jobs-many segments and save them into a condor directory
- Parameters:
time_segments (dict) – Contains all time segments used for estimating PSDs
condor_dir (str) – Path to a directory where Condor-related files are stored
num_jobs (int) – Number of jobs to use per detector to parallelize the PSD estimation
- Return type:
List of paths where the files including the subsets of all time segments are stored
dingo.gw.noise.utils module
- dingo.gw.noise.utils.CATALOGS = ['GWTC-1-confident', 'GWTC-2.1-confident', 'GWTC-3-confident']
Contains links for PSD segment lists with quality label BURST_CAT2 from the Gravitational Wave Open Science Center. Some events are split up into multiple chunks such that there are multiple URLs for one observing run
- dingo.gw.noise.utils.get_event_gps_times()
- dingo.gw.noise.utils.get_time_segments(settings)
Creates a dictionary storing time segments used for estimating PSDs.
- Parameters:
settings (dict) – Settings that determine the segments.
- Return type:
Dictionary containing the time segments for each detector
- dingo.gw.noise.utils.merge_datasets(asd_dataset_list)
Merges a list of ASD datasets into one.
- Parameters:
asd_dataset_list (list) – List of ASDDatasets to be merged.
- Return type:
A single combined ASDDataset object
- dingo.gw.noise.utils.merge_datasets_cli()
Command-line function to combine a collection of datasets into one. Used for parallelized ASD dataset generation.
- dingo.gw.noise.utils.psd_data_path(data_dir, run, detector)
Return the directory where the PSD data is to be stored.
- Parameters:
data_dir (str) – Path to the directory where the PSD dataset will be stored.
run (str) – Observing run that is used for the PSD dataset generation.
detector (str) – Detector that is used for the PSD dataset generation.
- Return type:
the path where the data is stored
Module contents
dingo.gw.training package
Submodules
dingo.gw.training.train_builders module
- dingo.gw.training.train_builders.build_dataset(data_settings)
Build a dataset based on a settings dictionary. This should contain the path of a saved waveform dataset.
This function also truncates the dataset as necessary.
- Parameters:
data_settings (dict) –
- Return type:
- dingo.gw.training.train_builders.build_svd_for_embedding_network(wfd: WaveformDataset, data_settings: dict, asd_dataset_path: str, size: int, num_training_samples: int, num_validation_samples: int, num_workers: int = 0, batch_size: int = 1000, out_dir=None)
Construct SVD matrices V based on clean waveforms in each interferometer. These will be used to seed the weights of the initial projection part of the embedding network.
It first generates a number of training waveforms, and then produces the SVD.
- Parameters:
wfd (WaveformDataset) –
data_settings (dict) –
asd_dataset_path (str) – Training waveforms will be whitened with respect to these ASDs.
size (int) – Number of basis elements to include in the SVD projection.
num_training_samples (int) –
num_validation_samples (int) –
num_workers (int) –
batch_size (int) –
out_dir (str) – SVD performance diagnostics are saved here.
- Returns:
The V matrices for each interferometer. They are ordered as in data_settings['detectors'].
- Return type:
list of numpy arrays
- dingo.gw.training.train_builders.set_train_transforms(wfd, data_settings, asd_dataset_path, omit_transforms=None)
Set the transform attribute of a waveform dataset based on a settings dictionary. The transform takes waveform polarizations, samples random extrinsic parameters, projects to detectors, adds noise, and formats the data for input to the neural network. It also implements optional GNPE transformations.
Note that the WaveformDataset is modified in-place, so this function returns nothing.
- Parameters:
wfd (WaveformDataset) –
data_settings (dict) –
asd_dataset_path (str) – Path corresponding to the ASD dataset used to generate noise.
omit_transforms – List of sub-transforms to omit from the full composition.
dingo.gw.training.train_pipeline module
- dingo.gw.training.train_pipeline.initialize_stage(pm, wfd, stage, num_workers, resume=False)
- Initializes training based on PosteriorModel metadata and current stage:
Builds transforms (based on noise settings for current stage);
Builds DataLoaders;
At the beginning of a stage (i.e., if not resuming mid-stage), initializes a new optimizer and scheduler;
Freezes / unfreezes the SVD layer of the embedding network.
- Parameters:
pm (PosteriorModel) –
wfd (WaveformDataset) –
stage (dict) – Settings specific to current stage of training
num_workers (int) –
resume (bool) – Whether training is resuming mid-stage. This controls whether the optimizer and scheduler should be re-initialized based on contents of stage dict.
- Return type:
(train_loader, test_loader)
- dingo.gw.training.train_pipeline.parse_args()
- dingo.gw.training.train_pipeline.prepare_training_new(train_settings: dict, train_dir: str, local_settings: dict)
Based on a settings dictionary, initialize a WaveformDataset and PosteriorModel.
For model type ‘nsf+embedding’ (the only acceptable type at this point) this also initializes the embedding network projection stage with SVD V matrices based on clean detector waveforms.
- Parameters:
train_settings (dict) – Settings which ultimately come from train_settings.yaml file.
train_dir (str) – This is only used to save diagnostics from the SVD.
local_settings (dict) – Local settings (e.g., num_workers, device)
- Return type:
- dingo.gw.training.train_pipeline.prepare_training_resume(checkpoint_name, local_settings, train_dir)
Loads a PosteriorModel from a checkpoint, as well as the corresponding WaveformDataset, in order to continue training. It initializes the saved optimizer and scheduler from the checkpoint.
- Parameters:
checkpoint_name (str) – File name containing the checkpoint (.pt format).
local_settings (dict) – Local settings (e.g., num_workers, device).
train_dir (str) – Directory containing the checkpoint and train history.
- Return type:
- dingo.gw.training.train_pipeline.train_local()
- dingo.gw.training.train_pipeline.train_stages(pm, wfd, train_dir, local_settings)
Train the network, iterating through the sequence of stages. Stages can change certain settings such as the noise characteristics, optimizer, and scheduler settings.
- Parameters:
pm (PosteriorModel) –
wfd (WaveformDataset) –
train_dir (str) – Directory for saving checkpoints and train history.
local_settings (dict) –
- Returns:
True if all stages are complete, False otherwise
- Return type:
bool
dingo.gw.training.train_pipeline_condor module
- dingo.gw.training.train_pipeline_condor.copy_logfiles(log_dir, epoch, name='info', suffixes=('.err', '.log', '.out'))
- dingo.gw.training.train_pipeline_condor.copyfile(src, dst)
- dingo.gw.training.train_pipeline_condor.create_submission_file(train_dir, condor_settings, filename='submission_file.sub')
TODO: documentation
- dingo.gw.training.train_pipeline_condor.train_condor()
dingo.gw.training.utils module
- dingo.gw.training.utils.append_stage()
Module contents
dingo.gw.transforms package
Submodules
dingo.gw.transforms.detector_transforms module
- class dingo.gw.transforms.detector_transforms.ApplyCalibrationUncertainty(ifo_list, data_domain, calibration_envelope, num_calibration_curves, num_calibration_nodes)
Bases:
object
Expand out a waveform using several detector calibration draws. These multiple draws are intended to be used for marginalizing over calibration uncertainty.
Detector calibration uncertainty is modeled as described in https://dcc.ligo.org/LIGO-T1400682/public
Gravitational wave data \(d\) is assumed to be of the form
\[d(f) = h_{obs}(f) + n(f),\]where \(h_{obs}\) is the observed waveform and \(n\) is the noise. Since the detector is not perfectly calibrated, the observed waveform is not identical to the true waveform \(h(f)\). Rather, it is assumed to have corrections of the form
\[h_{obs}(f) = h(f) * (1 + \delta A(f)) * \exp(i \delta \phi(f)),\]where \(\delta A(f)\) and \(\delta \phi(f)\) are frequency-dependent amplitude and phase errors. Under the calibration model, these are parametrized with cubic splines, defined in terms of calibration parameters \(A_i\) and \(\phi_i\), defined at log-spaced frequency nodes,
\[\begin{split} \delta A(f) &= \mathrm{spline}(f; {f_i, \delta A_i}), \\ \delta \phi(f) &= \mathrm{spline}(f; {f_i, \delta \phi_i}). \end{split}\]The calibration parameters are not known precisely, rather they are assumed to be normally distributed, with mean 0 and standard deviation determined by the “calibration envelope”, which varies from event to event.
For each detector waveform, this transform draws a collection of \(N\) calibration curves \(\{(\delta A^n(f), \delta \phi^n(f))\}_{n=1}^N\) according to a calibration envelope, and applies them to generate \(N\) observed waveforms \(\{h^n_{ obs}(f)\}\). This is intended to be used for marginalizing over the calibration uncertainty when evaluating the likelihood for importance sampling.
- Parameters:
ifo_list (InterferometerList) – List of Interferometers present in the analysis.
data_domain (Domain) – Domain on which data is defined.
calibration_envelope (dict) – Dictionary of the form {"H1": filepath, "L1": filepath}, where the filepaths are strings pointing to “.txt” files containing calibration envelopes. The calibration envelope depends on the event analyzed, and therefore remains fixed for all applications of the transform. The calibration envelope is used to define the variances \((\sigma_{\delta A_i}, \sigma_{\delta \phi_i})\) of the calibration parameters.
num_calibration_curves (int) – Number of calibration curves \(N\) to produce and apply to the waveform. Ultimately, this will translate to the number of samples in the Monte Carlo estimate of the marginalized likelihood integral.
num_calibration_nodes (int) – Number of log-spaced frequency nodes \(f_i\) to use in defining the spline.
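A sketch of how this transform might be instantiated, assuming bilby interferometers and hypothetical calibration-envelope files; the numerical values are illustrative:
from bilby.gw.detector import InterferometerList
from dingo.gw.domains import FrequencyDomain
from dingo.gw.transforms.detector_transforms import ApplyCalibrationUncertainty

ifo_list = InterferometerList(["H1", "L1"])
data_domain = FrequencyDomain(f_min=20.0, f_max=1024.0, delta_f=0.125)
transform = ApplyCalibrationUncertainty(
    ifo_list=ifo_list,
    data_domain=data_domain,
    calibration_envelope={"H1": "H1_envelope.txt", "L1": "L1_envelope.txt"},  # hypothetical paths
    num_calibration_curves=100,
    num_calibration_nodes=10,
)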
- class dingo.gw.transforms.detector_transforms.GetDetectorTimes(ifo_list, ref_time)
Bases:
object
Compute the time shifts in the individual detectors based on the sky position (ra, dec), the geocent_time and the ref_time.
- class dingo.gw.transforms.detector_transforms.ProjectOntoDetectors(ifo_list, domain, ref_time)
Bases:
object
Project the GW polarizations onto the detectors in ifo_list. This does not sample any new parameters, but relies on the parameters provided in sample[‘extrinsic_parameters’]. Specifically, this transform applies the following operations:
Rescale polarizations to account for sampled luminosity distance
Project polarizations onto the antenna patterns using the ref_time and the extrinsic parameters (ra, dec, psi)
Time shift the strains in the individual detectors according to the times <ifo.name>_time provided in the extrinsic parameters.
- class dingo.gw.transforms.detector_transforms.TimeShiftStrain(ifo_list, domain)
Bases:
object
Time shift the strains in the individual detectors according to the times <ifo.name>_time provided in the extrinsic parameters.
- dingo.gw.transforms.detector_transforms.time_delay_from_geocenter(ifo: Interferometer, ra: float | ndarray | Tensor, dec: float | ndarray | Tensor, time: float)
Calculate time delay between ifo and geocenter. Identical to method ifo.time_delay_from_geocenter(ra, dec, time), but the present implementation allows for batched computation, i.e., it also accepts arrays and tensors for ra and dec.
Implementation analogous to the bilby-cython implementation https://git.ligo.org/colm.talbot/bilby-cython/-/blob/main/bilby_cython/geometry.pyx, which is in turn based on XLALArrivalTimeDiff in TimeDelay.c.
- Parameters:
ifo (bilby.gw.detector.interferometer.Interferometer) – bilby interferometer object.
ra (Union[float, np.array, torch.Tensor]) – Right ascension of the source in radians. Either float, or float array/tensor.
dec (Union[float, np.array, torch.Tensor]) – Declination of the source in radians. Either float, or float array/tensor.
time (float) – GPS time in the geocentric frame.
- Returns:
Time delay between the detector and geocenter in seconds.
- Return type:
float
dingo.gw.transforms.general_transforms module
- class dingo.gw.transforms.general_transforms.UnpackDict(selected_keys)
Bases:
object
Unpacks the dictionary to prepare it for final output of the dataloader. Only returns elements specified in selected_keys.
dingo.gw.transforms.gnpe_transforms module
- class dingo.gw.transforms.gnpe_transforms.GNPEBase(kernel_dict, operators)
Bases:
ABC
A base class for Group Equivariant Neural Posterior Estimation [1].
This implements GNPE for approximate equivariances. For exact equivariances, additional processing should be implemented within a subclass.
[1]: https://arxiv.org/abs/2111.13139
- inverse(a, k)
- multiply(a, b, k)
- perturb(g, k)
Generate proxy variables based on initial parameter values.
- Parameters:
g (Union[np.float64, float, torch.Tensor]) – Initial parameter values
k (str) – Parameter name. This is used to identify the group binary operator.
- Return type:
Proxy variables in the same format as g.
- sample_proxies(input_parameters)
Given input parameters, perturbs based on the kernel to produce “proxy” (“hatted”) parameters, i.e., samples
\[\hat{g} \sim p(\hat{g} \mid g).\]Typically the GNPE NDE will be conditioned on \(\hat{g}\). Furthermore, these proxy parameters will be used to transform the data to simplify it.
- Parameters:
input_parameters (dict) – Initial parameter values to be perturbed. The dict values can be either floats (for training) or torch Tensors (for inference).
- Return type:
A dict of proxy parameters.
- class dingo.gw.transforms.gnpe_transforms.GNPECoalescenceTimes(ifo_list, kernel, exact_global_equivariance=True, inference=False)
Bases:
GNPEBase
GNPE [1] Transformation for detector coalescence times.
For each of the detector coalescence times, a proxy is generated by adding a perturbation epsilon from the GNPE kernel to the true detector time. This proxy is subtracted from the detector time, such that the overall time shift only amounts to -epsilon in training. This standardizes the input data to the inference network, since the applied time shifts are always restricted to the range of the kernel.
To preserve information at inference time, conditioning of the inference network on the proxies is required. To that end, the proxies are stored in sample[‘gnpe_proxies’].
We can enforce an exact equivariance under global time translations, by subtracting one proxy (by convention: the first one, usually for H1 ifo) from all other proxies, and from the geocent time, see [1]. This is enabled with the flag exact_global_equivariance.
Note that this transform does not modify the data itself. It only determines the amount by which to time-shift the data.
[1]: arxiv.org/abs/2111.13139
- Parameters:
ifo_list (bilby.gw.detector.InterferometerList) – List of interferometers.
kernel (str) – Defines a Bilby prior, to be used for all interferometers.
exact_global_equivariance (bool = True) – Whether to impose the exact global time translation symmetry.
inference (bool = False) – Whether to use inference or training mode.
dingo.gw.transforms.inference_transforms module
- class dingo.gw.transforms.inference_transforms.CopyToExtrinsicParameters(*parameter_list)
Bases:
object
Copy parameters specified in self.parameter_list from sample[“parameters”] to sample[“extrinsic_parameters”].
- class dingo.gw.transforms.inference_transforms.ExpandStrain(num_samples)
Bases:
object
Expand the waveform of sample by adding a batch axis and copying the waveform num_samples times along this new axis. This is useful for generating num_samples samples at inference time.
- class dingo.gw.transforms.inference_transforms.PostCorrectGeocentTime(inverse=False)
Bases:
object
Post correction for geocent time: add GNPE proxy (only necessary if exact equivariance is enforced)
- class dingo.gw.transforms.inference_transforms.ResetSample(extrinsic_parameters_keys=None)
Bases:
object
- Resets sample:
waveform was potentially modified by gnpe transforms, so reset to waveform_
optionally remove all non-required extrinsic parameters
- class dingo.gw.transforms.inference_transforms.ToTorch(device='cpu')
Bases:
object
Convert all numpy arrays in sample to torch tensors and push them to the specified device. All items of sample that are not numpy arrays (e.g., dicts of arrays) remain unchanged.
dingo.gw.transforms.noise_transforms module
- class dingo.gw.transforms.noise_transforms.AddWhiteNoiseComplex
Bases:
object
Adds white noise with a standard deviation determined by self.scale to the complex strain data.
- class dingo.gw.transforms.noise_transforms.RepackageStrainsAndASDS(ifos, first_index=0)
Bases:
object
Repackage the strains and the ASDs into a [num_ifos, 3, num_bins]-dimensional tensor. The order of ifos is provided by self.ifos. By convention, [:,i,:] is used for:
i = 0: strain.real
i = 1: strain.imag
i = 2: 1 / (asd * 1e23)
- class dingo.gw.transforms.noise_transforms.SampleNoiseASD(asd_dataset)
Bases:
object
Sample a random ASD for each detector and add it to sample[‘asds’].
- class dingo.gw.transforms.noise_transforms.WhitenAndScaleStrain(scale_factor)
Bases:
object
Whiten the strain data by dividing by the corresponding ASDs, and scale it with 1/scale_factor.
In the uniform frequency domain the scale factor should be np.sqrt(window_factor) / np.sqrt(4.0 * delta_f). It has two purposes:
(*) the denominator accounts for frequency binning
(*) dividing by the window factor accounts for windowing of the strain data
- class dingo.gw.transforms.noise_transforms.WhitenFixedASD(domain: FrequencyDomain, asd_file: str | None = None, inverse: bool = False, precision=None)
Bases:
object
Whiten frequency-series data according to an ASD specified in a file. This uses the ASD files contained in Bilby.
- Parameters:
domain (FrequencyDomain) – ASD is interpolated to the associated frequency grid.
asd_file (str) – Name of the ASD file. If None, use the aligo ASD. [Default: None]
inverse (bool) – Whether to apply the inverse whitening transform, to un-whiten data. [Default: False]
precision (str ("single", "double")) – If not None, sets precision of ASD to specified precision.
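A minimal sketch instantiating the forward and inverse transforms on a frequency domain (asd_file=None selects the default aLIGO ASD); the domain settings are illustrative:
from dingo.gw.domains import FrequencyDomain
from dingo.gw.transforms.noise_transforms import WhitenFixedASD

domain = FrequencyDomain(f_min=20.0, f_max=1024.0, delta_f=0.25)
whiten = WhitenFixedASD(domain, asd_file=None)                  # whitening transform
unwhiten = WhitenFixedASD(domain, asd_file=None, inverse=True)  # undoes the whitening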
- class dingo.gw.transforms.noise_transforms.WhitenStrain
Bases:
object
Whiten the strain data by dividing by the corresponding ASDs.
dingo.gw.transforms.parameter_transforms module
- class dingo.gw.transforms.parameter_transforms.SampleExtrinsicParameters(extrinsic_prior_dict)
Bases:
object
Sample extrinsic parameters and add them to sample in a separate dictionary.
- property reproduction_dict
- class dingo.gw.transforms.parameter_transforms.SelectStandardizeRepackageParameters(parameters_dict, standardization_dict, inverse=False, as_type=None, device='cpu')
Bases:
object
This transformation selects the parameters in standardization_dict, normalizes them by setting p = (p - mean) / std, and repackages the selected parameters to a numpy array.
- as_type: str = None
Only applies if self.inverse == True:
* if None, the data type is kept
* if ‘dict’, use a dict
* if ‘pandas’, use pandas.DataFrame
- class dingo.gw.transforms.parameter_transforms.StandardizeParameters(mu, std)
Bases:
object
Standardize parameters according to the transform (x - mu) / std.
Initialize the standardization transform with means and standard deviations for each parameter
- Parameters:
mu (Dict[str, float]) – The (estimated) means
std (Dict[str, float]) – The (estimated) standard deviations
- inverse(samples)
De-standardize the parameter array according to the specified means and standard deviations.
- Parameters:
samples (Dict[Dict, Dict]) – A nested dictionary with keys ‘parameters’, ‘waveform’, ‘noise_summary’. Only parameters included in mu, std get transformed.
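A toy sketch of the standardization, using illustrative means and standard deviations for a single parameter:
from dingo.gw.transforms.parameter_transforms import StandardizeParameters

standardize = StandardizeParameters(mu={"chirp_mass": 35.0}, std={"chirp_mass": 10.0})
# Under this transform, a chirp_mass of 45.0 maps to (45.0 - 35.0) / 10.0 = 1.0.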
Module contents
dingo.gw.waveform_generator package
Submodules
dingo.gw.waveform_generator.frame_utils module
These functions are used for transforming between J and L0 frames.
- dingo.gw.waveform_generator.frame_utils.convert_J_to_L0_frame(hlm_J, p, wfg, spin_conversion_phase=None)
- dingo.gw.waveform_generator.frame_utils.get_JL0_euler_angles(p, wfg, spin_conversion_phase=None)
- dingo.gw.waveform_generator.frame_utils.rotate_y(angle, vx, vy, vz)
- dingo.gw.waveform_generator.frame_utils.rotate_z(angle, vx, vy, vz)
dingo.gw.waveform_generator.waveform_generator module
- class dingo.gw.waveform_generator.waveform_generator.NewInterfaceWaveformGenerator(**kwargs)
Bases:
WaveformGenerator
Generate polarizations using GWSignal routines in the specified domain for a single GW coalescence given a set of waveform parameters.
- Parameters:
approximant (str) – Waveform “approximant” string understood by lalsimulation. This defines which waveform model is used.
domain (Domain) – Domain object that specifies on which physical domain the waveform polarizations will be generated, e.g. Fourier domain, time domain.
f_ref (float) – Reference frequency for the waveforms
f_start (float) – Starting frequency for waveform generation. This is optional, and if not included, the starting frequency will be set to f_min. This exists so that EOB waveforms can be generated starting from a lower frequency than f_min.
mode_list (List[Tuple]) – A list of waveform (ell, m) modes to include when generating the polarizations.
spin_conversion_phase (float = None) – Value for phiRef when computing cartesian spins from bilby spins via bilby_to_lalsimulation_spins. The common convention is to use the value of the phase parameter here, which is also used in the spherical harmonics when combining the different modes. If spin_conversion_phase = None, this default behavior is adopted. For dingo, this convention for the phase parameter makes it impossible to treat the phase as an extrinsic parameter, since we can only account for the change of phase in the spherical harmonics when changing the phase (in order to also change the cartesian spins – specifically, to rotate the spins by phase in the sx-sy plane – one would need to recompute the modes, which is expensive). By setting spin_conversion_phase != None, we impose the convention to always use phase = spin_conversion_phase when computing the cartesian spins.
- generate_FD_modes_LO(parameters)
Generate FD modes in the L0 frame.
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
hlm_fd (dict) – Dictionary with (l,m) as keys and the corresponding FD modes in lal format as values.
iota (float)
- generate_FD_waveform(parameters_gwsignal: Dict) Dict[str, ndarray]
Generate Fourier domain GW polarizations (h_plus, h_cross).
- Parameters:
parameters_gwsignal – A dict of parameters for the gwsignal waveform generator
- Returns:
A dictionary of generated waveform polarizations
- Return type:
pol_dict
- generate_TD_modes_L0(parameters)
Generate TD modes in the L0 frame.
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
hlm_td (dict) – Dictionary with (l,m) as keys and the corresponding TD modes in lal format as values.
iota (float)
- generate_TD_modes_L0_conditioned_extra_time(parameters)
Generate TD modes in the L0 frame applying a conditioning routine which mimics the behaviour of the standard LALSimulation conditioning (https://lscsoft.docs.ligo.org/lalsuite/lalsimulation/_l_a_l_sim_inspiral_generator_conditioning_8c.html#ac78b5fcdabf8922a3ac479da20185c85)
Essentially, a new starting frequency is computed to have some extra cycles that will be tapered. Some extra buffer time is also added to ensure that the waveform at the requested starting frequency is not modified, while still having a tapered timeseries suited for clean FFT.
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
hlm_td (dict) – Dictionary with (l,m) as keys and the corresponding TD modes in lal format as values.
iota (float)
- generate_TD_waveform(parameters_gwsignal: Dict) Dict[str, ndarray]
Generate time domain GW polarizations (h_plus, h_cross)
- Parameters:
parameters_gwsignal – A dict of parameters for the gwsignal waveform generator
- Returns:
A dictionary of generated waveform polarizations
- Return type:
pol_dict
- generate_hplus_hcross_m(parameters: Dict[str, float]) Dict[tuple, Dict[str, ndarray]]
Generate GW polarizations (h_plus, h_cross), separated into contributions from the different modes. This method is identical to self.generate_hplus_hcross, except that it generates the individual contributions of the modes to the polarizations and sorts these according to their transformation behavior (see below), instead of returning the overall sum.
This is useful in order to treat the phase as an extrinsic parameter. Instead of {“h_plus”: hp, “h_cross”: hc}, this method returns a dict in the form of {m: {“h_plus”: hp_m, “h_cross”: hc_m} for m in [-l_max,…,0,…,l_max]}. Each key m contains the contribution to the polarization that transforms according to exp(-1j * m * phase) under phase transformations (due to the spherical harmonics).
- Note:
pol_m[m] contains contributions of the m modes and the -m modes. This is because the frequency domain (FD) modes have a positive frequency part which transforms as exp(-1j * m * phase), while the negative frequency part transforms as exp(+1j * m * phase). Typically, one of these dominates [e.g., the (2,2) mode is dominated by the negative frequency part and the (-2,2) mode is dominated by the positive frequency part] such that the sum of (l,|m|) and (l,-|m|) modes transforms approximately as exp(1j * |m| * phase), which is e.g. used for phase marginalization in bilby/lalinference. However, this is not exact. In this method we account for this effect, such that each contribution pol_m[m] transforms exactly as exp(-1j * m * phase).
Phase shifts contribute in two ways: Firstly via the spherical harmonics, which we account for with the exp(-1j * m * phase) transformation. Secondly, the phase determines how the PE spins transform to cartesian spins, by rotating (sx,sy) by phase. This is not accounted for in this function. Instead, the phase for computing the cartesian spins is fixed to self.spin_conversion_phase (if not None). This effectively changes the PE parameters {phi_jl, phi_12} to parameters {phi_jl_prime, phi_12_prime}. For parameter estimation, a postprocessing operation can be applied to account for this, {phi_jl_prime, phi_12_prime} -> {phi_jl, phi_12}. See also documentation of __init__ method for more information on self.spin_conversion_phase.
Differences to self.generate_hplus_hcross:
- We don’t catch errors yet (TODO)
- We don’t apply transforms yet (TODO)
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
pol_m – Dictionary with contributions to h_plus and h_cross, sorted by their transformation behaviour under phase shifts: {m: {“h_plus”: hp_m, “h_cross”: hc_m} for m in [-l_max,…,0,…,l_max]} Each contribution h_m transforms as exp(-1j * m * phase) under phase shifts (for fixed self.spin_conversion_phase, see above).
- Return type:
dict
- dingo.gw.waveform_generator.waveform_generator.SEOBNRv4PHM_maximum_starting_frequency(total_mass: float, fudge: float = 0.99) float
Given a total mass, return the largest possible starting frequency allowed for SEOBNRv4PHM and similar effective-one-body models.
The intended use for this function is at the stage of designing a data set: after choosing a mass prior one can use it to figure out which prior samples would run into an issue when generating an EOB waveform, and tweak the parameters to reduce the number of failing configurations.
- Parameters:
total_mass – Total mass in units of solar masses
fudge – A fudge factor
- Returns:
The largest possible starting frequency in Hz
- Return type:
f_max_Hz
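For example, one might use it as follows to check whether an intended starting frequency is admissible for a heavy system (the numbers are illustrative):
from dingo.gw.waveform_generator.waveform_generator import (
    SEOBNRv4PHM_maximum_starting_frequency,
)

f_start_max = SEOBNRv4PHM_maximum_starting_frequency(total_mass=200.0)
if f_start_max < 20.0:
    print("A starting frequency of 20 Hz is too high for this total mass.")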
- class dingo.gw.waveform_generator.waveform_generator.WaveformGenerator(approximant: str, domain: Domain, f_ref: float, f_start: float | None = None, mode_list: List[Tuple] | None = None, transform=None, spin_conversion_phase=None, **kwargs)
Bases:
object
Generate polarizations using LALSimulation routines in the specified domain for a single GW coalescence given a set of waveform parameters.
- Parameters:
approximant (str) – Waveform “approximant” string understood by lalsimulation. This defines which waveform model is used.
domain (Domain) – Domain object that specifies on which physical domain the waveform polarizations will be generated, e.g. Fourier domain, time domain.
f_ref (float) – Reference frequency for the waveforms
f_start (float) – Starting frequency for waveform generation. This is optional, and if not included, the starting frequency will be set to f_min. This exists so that EOB waveforms can be generated starting from a lower frequency than f_min.
mode_list (List[Tuple]) – A list of waveform (ell, m) modes to include when generating the polarizations.
spin_conversion_phase (float = None) – Value for phiRef when computing cartesian spins from bilby spins via bilby_to_lalsimulation_spins. The common convention is to use the value of the phase parameter here, which is also used in the spherical harmonics when combining the different modes. If spin_conversion_phase = None, this default behavior is adopted. For dingo, this convention for the phase parameter makes it impossible to treat the phase as an extrinsic parameter, since we can only account for the change of phase in the spherical harmonics when changing the phase (in order to also change the cartesian spins – specifically, to rotate the spins by phase in the sx-sy plane – one would need to recompute the modes, which is expensive). By setting spin_conversion_phase != None, we impose the convention to always use phase = spin_conversion_phase when computing the cartesian spins.
- generate_FD_modes_LO(parameters)
Generate FD modes in the L0 frame.
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
hlm_fd (dict) – Dictionary with (l,m) as keys and the corresponding FD modes in lal format as values.
iota (float)
- generate_FD_waveform(parameters_lal: Tuple) Dict[str, ndarray]
Generate Fourier domain GW polarizations (h_plus, h_cross).
- Parameters:
parameters_lal – A tuple of parameters for the lalsimulation waveform generator
- Returns:
A dictionary of generated waveform polarizations
- Return type:
pol_dict
- generate_TD_modes_L0(parameters)
Generate TD modes in the L0 frame.
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
hlm_td (dict) – Dictionary with (l,m) as keys and the corresponding TD modes in lal format as values.
iota (float)
- generate_TD_waveform(parameters_lal: Tuple) Dict[str, ndarray]
Generate time domain GW polarizations (h_plus, h_cross)
- Parameters:
parameters_lal – A tuple of parameters for the lalsimulation waveform generator
- Returns:
A dictionary of generated waveform polarizations
- Return type:
pol_dict
- generate_hplus_hcross(parameters: Dict[str, float], catch_waveform_errors=True) Dict[str, ndarray]
Generate GW polarizations (h_plus, h_cross).
If the generation of the lalsimulation waveform fails with an “Input domain error”, we return NaN polarizations.
Use the domain, approximant, and mode_list specified in the constructor along with the waveform parameters to generate the waveform polarizations.
- Parameters:
parameters (Dict[str, float]) –
A dictionary of parameter names and scalar values. The parameter dictionary must include the following keys. For masses, spins, and distance there are multiple options.
- Mass: (mass_1, mass_2) or a pair of quantities from
((chirp_mass, total_mass), (mass_ratio, symmetric_mass_ratio))
- Spin:
(a_1, a_2, tilt_1, tilt_2, phi_12, phi_jl) if precessing binary or (chi_1, chi_2) if the binary has aligned spins
Reference frequency: f_ref at which spin vectors are defined
- Extrinsic:
Distance: one of (luminosity_distance, redshift, comoving_distance)
Inclination: theta_jn
Reference phase: phase
Geocentric time: geocent_time (GPS time)
- The following parameters are not required:
Sky location: ra, dec
Polarization angle: psi
- Units:
Masses should be given in units of solar masses. Distance should be given in megaparsecs (Mpc). Frequencies should be given in Hz and time in seconds. Spins should be dimensionless. Angles should be in radians.
catch_waveform_errors (bool) – Whether to catch lalsimulation errors
- Returns:
A dictionary of generated waveform polarizations
- Return type:
wf_dict
- generate_hplus_hcross_m(parameters: Dict[str, float]) Dict[tuple, Dict[str, ndarray]]
Generate GW polarizations (h_plus, h_cross), separated into contributions from the different modes. This method is identical to self.generate_hplus_hcross, except that it generates the individual contributions of the modes to the polarizations and sorts these according to their transformation behavior (see below), instead of returning the overall sum.
This is useful in order to treat the phase as an extrinsic parameter. Instead of {“h_plus”: hp, “h_cross”: hc}, this method returns a dict in the form of {m: {“h_plus”: hp_m, “h_cross”: hc_m} for m in [-l_max,…,0,…,l_max]}. Each key m contains the contribution to the polarization that transforms according to exp(-1j * m * phase) under phase transformations (due to the spherical harmonics).
- Note:
pol_m[m] contains contributions of the m modes and the -m modes. This is because the frequency domain (FD) modes have a positive frequency part which transforms as exp(-1j * m * phase), while the negative frequency part transforms as exp(+1j * m * phase). Typically, one of these dominates [e.g., the (2,2) mode is dominated by the negative frequency part and the (-2,2) mode is dominated by the positive frequency part] such that the sum of (l,|m|) and (l,-|m|) modes transforms approximately as exp(1j * |m| * phase), which is e.g. used for phase marginalization in bilby/lalinference. However, this is not exact. In this method we account for this effect, such that each contribution pol_m[m] transforms exactly as exp(-1j * m * phase).
Phase shifts contribute in two ways: Firstly via the spherical harmonics, which we account for with the exp(-1j * m * phase) transformation. Secondly, the phase determines how the PE spins transform to cartesian spins, by rotating (sx,sy) by phase. This is not accounted for in this function. Instead, the phase for computing the cartesian spins is fixed to self.spin_conversion_phase (if not None). This effectively changes the PE parameters {phi_jl, phi_12} to parameters {phi_jl_prime, phi_12_prime}. For parameter estimation, a postprocessing operation can be applied to account for this, {phi_jl_prime, phi_12_prime} -> {phi_jl, phi_12}. See also documentation of __init__ method for more information on self.spin_conversion_phase.
Differences to self.generate_hplus_hcross:
- We don’t catch errors yet (TODO)
- We don’t apply transforms yet (TODO)
- Parameters:
parameters (dict) – Dictionary of parameters for the waveform. For details see self.generate_hplus_hcross.
- Returns:
pol_m – Dictionary with contributions to h_plus and h_cross, sorted by their transformation behaviour under phase shifts: {m: {“h_plus”: hp_m, “h_cross”: hc_m} for m in [-l_max,…,0,…,l_max]} Each contribution h_m transforms as exp(-1j * m * phase) under phase shifts (for fixed self.spin_conversion_phase, see above).
- Return type:
dict
- setup_mode_array(mode_list: List[Tuple]) Dict
Define a mode array to select waveform modes to include in the polarizations from a list of modes.
- Parameters:
mode_list (a list of (ell, m) modes) –
- Returns:
A lal parameter dictionary
- Return type:
lal_params
- property spin_conversion_phase
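A minimal sketch of generating frequency-domain polarizations for a single precessing binary; the approximant and parameter values are illustrative:
from dingo.gw.domains import FrequencyDomain
from dingo.gw.waveform_generator.waveform_generator import WaveformGenerator

domain = FrequencyDomain(f_min=20.0, f_max=1024.0, delta_f=0.125)
wfg = WaveformGenerator(approximant="IMRPhenomXPHM", domain=domain, f_ref=20.0)

parameters = {
    "mass_1": 36.0, "mass_2": 29.0,
    "a_1": 0.3, "a_2": 0.2, "tilt_1": 0.5, "tilt_2": 1.0,
    "phi_12": 1.7, "phi_jl": 0.3,
    "luminosity_distance": 1000.0, "theta_jn": 0.4,
    "phase": 1.3, "geocent_time": 0.0,
}
polarizations = wfg.generate_hplus_hcross(parameters)
# polarizations["h_plus"] and polarizations["h_cross"] are arrays on domain.sample_frequencies.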
- dingo.gw.waveform_generator.waveform_generator.generate_waveforms_parallel(waveform_generator: WaveformGenerator, parameter_samples: DataFrame, pool: Pool | None = None) Dict[str, ndarray]
Generate a waveform dataset, optionally in parallel.
- Parameters:
waveform_generator (WaveformGenerator) – A WaveformGenerator instance
parameter_samples (pd.DataFrame) – Intrinsic parameter samples
pool (multiprocessing.Pool) – Optional pool of workers for parallel generation
- Returns:
A dictionary of all generated polarizations stacked together
- Return type:
polarizations
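A sketch of parallel generation for a small batch of aligned-spin parameter sets; the approximant and values are illustrative:
import pandas as pd
from multiprocessing import Pool
from dingo.gw.domains import FrequencyDomain
from dingo.gw.waveform_generator.waveform_generator import (
    WaveformGenerator,
    generate_waveforms_parallel,
)

domain = FrequencyDomain(f_min=20.0, f_max=1024.0, delta_f=0.125)
wfg = WaveformGenerator(approximant="IMRPhenomD", domain=domain, f_ref=20.0)

# Two illustrative aligned-spin parameter sets, one row per waveform.
parameter_samples = pd.DataFrame(
    {
        "mass_1": [36.0, 40.0], "mass_2": [29.0, 32.0],
        "chi_1": [0.1, 0.0], "chi_2": [0.0, 0.2],
        "luminosity_distance": [1000.0, 1500.0], "theta_jn": [0.4, 1.2],
        "phase": [0.0, 1.3], "geocent_time": [0.0, 0.0],
    }
)

with Pool(2) as pool:
    polarizations = generate_waveforms_parallel(wfg, parameter_samples, pool=pool)
# polarizations stacks "h_plus" and "h_cross" for all rows of parameter_samples.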
- dingo.gw.waveform_generator.waveform_generator.generate_waveforms_task_func(args: Tuple, waveform_generator: WaveformGenerator) Dict[str, ndarray]
Picklable wrapper function for parallel waveform generation.
- Parameters:
args – A tuple of (index, pandas.core.series.Series)
waveform_generator – A WaveformGenerator instance
- Return type:
The generated waveform polarization dictionary
- dingo.gw.waveform_generator.waveform_generator.sum_contributions_m(x_m, phase_shift=0.0)
Sum the contributions over m-components, optionally introducing a phase shift.
dingo.gw.waveform_generator.wfg_utils module
- dingo.gw.waveform_generator.wfg_utils.get_polarizations_from_fd_modes_m(hlm_fd, iota, phase)
- dingo.gw.waveform_generator.wfg_utils.get_starting_frequency_for_SEOBRNRv5_conditioning(parameters)
Compute starting frequency needed for having 3 extra cycles for tapering the TD modes. It returns the needed quantities to apply the standard LALSimulation conditioning routines to the TD modes.
- Parameters:
parameters (dict) – Dictionary of parameters suited for GWSignal (obtained with NewInterfaceWaveformGenerator._convert_parameters)
- Returns:
f_min (float) – Waveform starting frequency
f_start (float) – New waveform starting frequency
extra_time (float) – Extra time to take care of situations where the frequency is close to merger
original_f_min (float) – Initial waveform starting frequency
f_isco (float) – ISCO frequency
- dingo.gw.waveform_generator.wfg_utils.get_tapering_window_for_complex_time_series(h, tapering_flag: int = 1)
Get window for tapering of a complex time series from the lal backend. This is done by tapering the time series with lal, and dividing tapered output by untapered input. lal does not support tapering of complex time series objects, so as a workaround we taper only the real part of the array and extract the window based on this.
- Parameters:
h – complex lal time series object
tapering_flag (int = 1) –
- Flag for tapering. See e.g. lines 2773-2777 in
https://lscsoft.docs.ligo.org/lalsuite/lalsimulation/_l_a_l_sim_inspiral_waveform_taper_8c_source.html#l00222
tapering_flag = 1 corresponds to LAL_SIM_INSPIRAL_TAPER_START
- Returns:
window – Array of length h.data.length, with the window used for tapering.
- Return type:
np.ndarray
- dingo.gw.waveform_generator.wfg_utils.linked_list_modes_to_dict_modes(hlm_ll)
Convert linked list of modes into dictionary with keys (l,m).
- dingo.gw.waveform_generator.wfg_utils.taper_td_modes_for_SEOBRNRv5_extra_time(h, extra_time, f_min, original_f_min, f_isco)
Apply standard tapering procedure mimicking LALSimulation routine (https://lscsoft.docs.ligo.org/lalsuite/lalsimulation/_l_a_l_sim_inspiral_generator_conditioning_8c.html#ac78b5fcdabf8922a3ac479da20185c85)
- Parameters:
h – complex gwpy TimeSeries object
extra_time (float) – Extra time to take care of situations where the frequency is close to merger
f_min (float) – Starting frequency employed in waveform generation
original_f_min (float) – Initial starting frequency requested by the user
f_isco – ISCO frequency
- Returns:
complex lal timeseries object
- Return type:
h_return
- dingo.gw.waveform_generator.wfg_utils.taper_td_modes_in_place(hlm_td, tapering_flag: int = 1)
Taper the time domain modes in place.
- Parameters:
hlm_td (dict) – Dictionary with (l,m) keys and the complex lal time series objects for the corresponding modes.
tapering_flag (int = 1) –
- Flag for tapering. See e.g. lines 2773-2777 in
https://lscsoft.docs.ligo.org/lalsuite/lalsimulation/_l_a_l_sim_inspiral_waveform_taper_8c_source.html#l00222
tapering_flag = 1 corresponds to LAL_SIM_INSPIRAL_TAPER_START
- dingo.gw.waveform_generator.wfg_utils.td_modes_to_fd_modes(hlm_td, domain)
Transform dict of td modes to dict of fd modes via FFT. The td modes are expected to be tapered.
- Parameters:
hlm_td (dict) – Dictionary with (l,m) keys and the complex lal time series objects for the corresponding tapered modes.
domain (dingo.gw.domains.FrequencyDomain) – Target domain after FFT.
- Returns:
hlm_fd – Dictionary with (l,m) keys and numpy arrays with the corresponding modes as values.
- Return type:
dict
Module contents
Submodules
dingo.gw.SVD module
- class dingo.gw.SVD.ApplySVD(svd_basis: SVDBasis, inverse: bool = False)
Bases:
object
Transform operator for applying an SVD compression / decompression.
- Parameters:
svd_basis (SVDBasis) –
inverse (bool) – Whether to apply for the forward (compression) or inverse (decompression) transform. Default: False.
- class dingo.gw.SVD.SVDBasis(file_name=None, dictionary=None)
Bases:
DingoDataset
When constructing, provide either file_name, a dictionary containing data and settings entries, or neither.
- Parameters:
file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The data keys should be the same as save_keys
data_keys (list) – Variables that should be saved / loaded. This allows for class to store additional variables beyond those that are saved. Typically, this list would be provided by any subclass.
- compress(data: ndarray)
Convert from data (e.g., frequency series) to compressed representation in terms of basis coefficients.
- Parameters:
data (np.ndarray) –
- Return type:
array of basis coefficients
- compute_test_mismatches(data: ndarray, parameters: DataFrame | None = None, increment: int = 50, verbose: bool = False)
Test SVD basis by computing mismatches of compressed / decompressed data against original data. Results are saved as a DataFrame.
- Parameters:
data (np.ndarray) – Array of data sets to validate against.
parameters (pd.DataFrame) – Optional labels for the data sets. This is useful for checking performance on particular regions of the parameter space.
increment (int) – Specifies SVD truncations for computing mismatches. E.g., increment = 50 means that the SVD will be truncated at size [50, 100, 150, …, len(data)].
verbose (bool) – Whether to print summary statistics.
- dataset_type = 'svd_basis'
- decompress(coefficients: ndarray)
Convert from basis coefficients back to raw data representation.
- Parameters:
coefficients (np.ndarray) – Array of basis coefficients
- Return type:
array of decompressed data
- from_dictionary(dictionary: dict)
Load the SVD basis from a dictionary.
- Parameters:
dictionary (dict) – The dictionary should contain at least a ‘V’ key, and optionally an ‘s’ key.
- from_file(filename)
Load the SVD basis from a HDF5 file.
- Parameters:
filename (str) –
- generate_basis(training_data: ndarray, n: int, method: str = 'random')
Generate the SVD basis from training data and store it.
The SVD decomposition takes
training_data = U @ diag(s) @ Vh
where U and Vh are unitary.
- Parameters:
training_data (np.ndarray) – Array of waveform data on the physical domain
n (int) – Number of basis elements to keep. n=0 keeps all basis elements.
method (str) – Select SVD method, ‘random’ or ‘scipy’
- print_validation_summary()
Print a summary of the validation mismatches.
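A sketch of building and using an SVD basis on toy data standing in for frequency-domain waveforms; the array shapes and basis size are illustrative:
import numpy as np
from dingo.gw.SVD import ApplySVD, SVDBasis

# Toy complex "waveforms": 1000 samples of length 400.
training_data = np.random.randn(1000, 400) + 1j * np.random.randn(1000, 400)

basis = SVDBasis()
basis.generate_basis(training_data, n=100, method="random")

coefficients = basis.compress(training_data[0])   # project onto the basis
reconstruction = basis.decompress(coefficients)   # map back to the original representation

compression_transform = ApplySVD(basis)           # same operation packaged as a transform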
dingo.gw.domains module
- class dingo.gw.domains.Domain
Bases:
ABC
Defines the physical domain on which the data of interest live.
This includes a specification of the bins or points, and a few additional properties associated with the data.
- abstract property domain_dict
Enables rebuilding the domain by calling build_domain(domain_dict).
- abstract property duration: float
Waveform duration in seconds.
- abstract property f_max: float
The maximum frequency [Hz] is set to half the sampling rate.
- abstract property max_idx: int
- abstract property min_idx: int
- abstract property noise_std: float
Standard deviation of the whitened noise distribution
- abstract property sampling_rate: float
The sampling rate of the data [Hz].
- abstract time_translate_data(data, dt) ndarray
Time translate strain data by dt seconds.
- abstract update(new_settings: dict)
- class dingo.gw.domains.FrequencyDomain(f_min: float, f_max: float, delta_f: float, window_factor: float | None = None)
Bases:
Domain
Defines the physical domain on which the data of interest live.
The frequency bins are assumed to be uniform between [0, f_max] with spacing delta_f. Given a finite length of time domain data, the Fourier domain data starts at a frequency f_min and is zero below this frequency. window_kwargs specify windowing used for FFT to obtain FD data from TD data in practice.
- static add_phase(data, phase)
Add a (frequency-dependent) phase to a frequency series. Allows for batching, as well as additional channels (such as detectors). Accounts for the fact that the data could be a complex frequency series or real and imaginary parts.
Convention: the phase phi(f) is defined via exp(- 1j * phi(f)).
- Parameters:
data (Union[np.array, torch.Tensor]) –
phase (Union[np.array, torch.Tensor]) –
- Return type:
New array or tensor of the same shape as data.
- property delta_f: float
The frequency spacing of the uniform grid [Hz].
- property domain_dict
Enables rebuilding the domain by calling build_domain(domain_dict).
- property duration: float
Waveform duration in seconds.
- property f_max: float
The maximum frequency [Hz] is typically set to half the sampling rate.
- property f_min: float
The minimum frequency [Hz].
- property frequency_mask: ndarray
Mask which selects frequency bins greater than or equal to the starting frequency
- property frequency_mask_length: int
Number of samples in the subdomain domain[frequency_mask].
- get_sample_frequencies_astype(data)
Returns a 1D frequency array compatible with the last index of data array.
Decides whether array is numpy or torch tensor (and cuda vs cpu), and whether it contains the leading zeros below f_min.
- Parameters:
data (Union[np.array, torch.Tensor]) – Sample data
- Return type:
frequency array compatible with last index
- property max_idx
- property min_idx
- property noise_std: float
Standard deviation of the whitened noise distribution.
To have noise that comes from a multivariate unit normal distribution, you must divide by this factor. In practice, this means dividing the whitened waveforms by this.
TODO: This description makes some assumptions that need to be clarified. Windowing of TD data; tapering window has a slope -> reduces power only for noise, but not for the signal which is in the main part unaffected by the taper
- property sample_frequencies
- property sample_frequencies_torch
- property sample_frequencies_torch_cuda
- property sampling_rate: float
The sampling rate of the data [Hz].
- set_new_range(f_min: float | None = None, f_max: float | None = None)
Set a new range [f_min, f_max] for the domain. This is only allowed if the new range is contained within the old one.
- time_translate_data(data, dt)
Time translate frequency-domain data by dt. Time translation corresponds (in frequency domain) to multiplication by
\[\exp(-2 \pi i \, f \, dt).\]This method allows for multiple batch dimensions. For torch.Tensor data, allow for either a complex or a (real, imag) representation.
- Parameters:
data (array-like (numpy, torch)) –
Shape (B, C, N), where
B corresponds to any dimension >= 0,
C is either absent (for complex data) or has dimension >= 2 (for data represented as real and imaginary parts), and
N is either len(self) or len(self)-self.min_idx (for truncated data).
dt (torch tensor, or scalar (if data is numpy)) – Shape (B)
- Return type:
Array-like of the same form as data.
- update(new_settings: dict)
Update the domain with new settings. This is only allowed if the new settings are “compatible” with the old ones. E.g., f_min should be larger than the existing f_min.
- Parameters:
new_settings (dict) – Settings dictionary. Must contain a subset of the keys contained in domain_dict.
- update_data(data: ndarray, axis: int = -1, low_value: float = 0.0)
Adjusts data to be compatible with the domain:
Below f_min, it sets the data to low_value (typically 0.0 for a waveform, but for a PSD this might be a large value).
Above f_max, it truncates the data array.
- Parameters:
data (np.ndarray) – Data array
axis (int) – Which data axis to apply the adjustment along.
low_value (float) – Below f_min, set the data to this value.
- Returns:
The new data array.
- Return type:
np.ndarray
- property window_factor
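A short sketch of constructing a FrequencyDomain and using it to time translate frequency-domain data; the domain settings and the 10 ms shift are illustrative:
import numpy as np
from dingo.gw.domains import FrequencyDomain

domain = FrequencyDomain(f_min=20.0, f_max=1024.0, delta_f=0.25)

frequencies = domain.sample_frequencies   # uniform grid on [0, f_max]
mask = domain.frequency_mask              # selects bins with f >= f_min

# Shift a toy frequency series by dt = 10 ms, i.e. multiply by exp(-2 pi i f dt).
data = np.ones(len(domain), dtype=complex)
shifted = domain.time_translate_data(data, 0.01)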
- class dingo.gw.domains.PCADomain
Bases:
Domain
TODO
- property noise_std: float
Standard deviation of the whitened noise distribution.
To have noise that comes from a multivariate unit normal distribution, you must divide by this factor. In practice, this means dividing the whitened waveforms by this.
In the continuum limit in time domain, the standard deviation of white noise would at each point go to infinity, hence the delta_t factor.
- class dingo.gw.domains.TimeDomain(time_duration: float, sampling_rate: float)
Bases:
Domain
Defines the physical time domain on which the data of interest live.
The time bins are assumed to be uniform between [0, duration] with spacing 1 / sampling_rate. window_factor is used to compute noise_std().
- property delta_t: float
The size of the time bins
- property domain_dict
Enables rebuilding the domain by calling build_domain(domain_dict).
- property duration: float
Waveform duration in seconds.
- property f_max: float
The maximum frequency [Hz] is typically set to half the sampling rate.
- property max_idx: int
- property min_idx: int
- property noise_std: float
Standard deviation of the whitened noise distribution.
To have noise that comes from a multivariate unit normal distribution, you must divide by this factor. In practice, this means dividing the whitened waveforms by this.
In the continuum limit in time domain, the standard deviation of white noise would at each point go to infinity, hence the delta_t factor.
- property sampling_rate: float
The sampling rate of the data [Hz].
- time_translate_data(data, dt) ndarray
Time translate strain data by dt seconds.
- dingo.gw.domains.build_domain(settings: Dict) Domain
Instantiate a domain class from settings.
- Parameters:
settings (dict) – Dictionary with ‘type’ key denoting the type of domain, and keys corresponding to the kwargs needed to construct the Domain.
- Return type:
A Domain instance of the correct type.
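For instance, a frequency domain can be built from a settings dictionary as follows (assuming the ‘type’ entry matches the class name; the numbers are illustrative):
from dingo.gw.domains import build_domain

domain = build_domain(
    {"type": "FrequencyDomain", "f_min": 20.0, "f_max": 1024.0, "delta_f": 0.125}
)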
- dingo.gw.domains.build_domain_from_model_metadata(model_metadata) Domain
Instantiate a domain class from settings of model.
- Parameters:
model_metadata (dict) – Model metadata containing information to build the domain, typically obtained from the model.metadata attribute.
- Return type:
A Domain instance of the correct type.
dingo.gw.download_strain_data module
- dingo.gw.download_strain_data.download_event_data_in_FD(detectors, time_event, time_segment, time_buffer, window, num_segments_psd=128)
Download event data in the frequency domain. This includes strain data for the event at GPS time time_event as well as the corresponding ASD.
- Parameters:
detectors (list) – list of detectors specified via strings
time_event (float) – GPS time of the event
time_segment (float) – length of the strain segment, in seconds
time_buffer (float) – specifies buffer time after time_event, in seconds
window (Union(np.ndarray, dict)) – Window used for Fourier transforming the event strain data. Either provided directly as np.ndarray, or as dict in which case the window is generated by window = dingo.gw.gwutils.get_window(**window).
num_segments_psd (int = 128) – number of segments used for PSD generation
- dingo.gw.download_strain_data.download_strain_data_in_FD(det, time_event, time_segment, time_buffer, window)
Download strain data for a GW event at GPS time time_event. The segment is time_segment seconds long, including time_buffer seconds after the event. The strain is Fourier transformed, the frequency domain strain is then time shifted by time_buffer, such that the event occurs at t=0.
- Parameters:
det (str) – detector
time_event (float) – GPS time of the event
time_segment (float) – length of the strain segment, in seconds
time_buffer (float) – specifies buffer time after time_event, in seconds
window (Union(np.ndarray, dict)) – Window used for Fourier transforming the event strain data. Either provided directly as np.ndarray, or as dict in which case the window is generated by window = dingo.gw.gwutils.get_window(**window).
- Returns:
event_strain – array with the frequency domain strain
- Return type:
np.array
- dingo.gw.download_strain_data.estimate_single_psd(time_start, time_segment, window, f_s=4096, num_segments: int = 128, det=None, channel=None)
Download strain data and generate a PSD based on these. Use num_segments of length time_segment, starting at GPS time time_start. If no channel is specified, GWOSC is used to download the data.
- Parameters:
time_start (float) – start GPS time for PSD estimation
time_segment (float) – time for a single segment for PSD information, in seconds
window (Union(np.ndarray, dict)) – Window used for PSD generation, needs to be the same as used for Fourier transform of event strain data. Either provided directly as np.ndarray, or as dict in which case the window is generated by window = dingo.gw.gwutils.get_window(**window).
num_segments (int = 128) – number of segments used for PSD generation
det (str) – If provided, will download data from GWOSC using TimeSeries.fetch_open_data(). Mutually exclusive with ‘channel’.
channel (str) – If provided, will download the data from the channel instead of GWOSC, using TimeSeries.get().
- Returns:
psd – array of psd
- Return type:
np.array
dingo.gw.gwutils module
- dingo.gw.gwutils.get_extrinsic_prior_dict(extrinsic_prior)
Build dict for extrinsic prior by starting with default_extrinsic_dict, and overwriting every element for which extrinsic_prior is not default. TODO: Move to dingo.gw.prior.py?
- dingo.gw.gwutils.get_mismatch(a, b, domain, asd_file=None)
Mismatch is 1 - overlap, where the overlap is defined by inner(a, b) / sqrt(inner(a, a) * inner(b, b)). See e.g. Eq. (44) in https://arxiv.org/pdf/1106.1021.pdf.
- Parameters:
a –
b –
domain –
asd_file –
- dingo.gw.gwutils.get_standardization_dict(extrinsic_prior_dict, wfd, selected_parameters, transform=None)
Calculates the mean and standard deviation of parameters. This is needed for standardizing neural-network input and output.
- Parameters:
extrinsic_prior_dict (dict) –
wfd (WaveformDataset) –
selected_parameters (list[str]) – List of parameters for which to estimate standardization factors.
transform (Transform) – Operator that will generate samples for parameters contained in selected_parameters that are not contained in the intrinsic or extrinsic prior. (E.g., H1_time, L1_time_proxy)
- dingo.gw.gwutils.get_window(window_kwargs)
Compute window from window_kwargs.
- dingo.gw.gwutils.get_window_factor(window)
Compute window factor. If window is not provided as array or tensor but as window_kwargs, first build the window.
dingo.gw.injection module
- class dingo.gw.injection.GWSignal(wfg_kwargs: dict, wfg_domain: FrequencyDomain, data_domain: FrequencyDomain, ifo_list: list, t_ref: float)
Bases:
object
Base class for generating gravitational wave signals in interferometers. Generates waveform polarizations based on provided parameters, and then projects to detectors.
Includes option for whitening the signal based on a provided ASD.
- Parameters:
wfg_kwargs (dict) – Waveform generator parameters [approximant, f_ref, and (optionally) f_start].
wfg_domain (FrequencyDomain) – Domain used for waveform generation. This can potentially deviate from the final domain, having a wider frequency range needed for waveform generation.
data_domain (FrequencyDomain) – Domain object for final signal.
ifo_list (list) – Names of interferometers for projection.
t_ref (float) – Reference time that specifies ifo locations.
- property asd
Amplitude spectral density.
Either a single array, a dict (for individual interferometers), or an ASDDataset, from which random ASDs are drawn.
- property calibration_marginalization_kwargs
Dictionary with the following keys:
- calibration_envelope
Dictionary of the form {“H1”: filepath, “L1”: filepath, …} with locations of lookup tables for the calibration uncertainty curves.
- num_calibration_nodes
Number of nodes for the calibration model.
- num_calibration_curves
Number of calibration curves to use in marginalization.
- signal(theta)
Compute the GW signal for parameters theta.
Step 1: Generate polarizations.
Step 2: Project polarizations onto detectors; optionally (depending on self.whiten) whiten and scale.
- Parameters:
theta (dict) – Signal parameters. Includes intrinsic parameters to be passed to waveform generator, and extrinsic parameters for detector projection.
- Returns:
- keys:
waveform: GW strain signal for each detector
extrinsic_parameters: {}
parameters: waveform parameters
asd (if set): amplitude spectral density for each detector
- Return type:
dict
- signal_m(theta)
Compute the GW signal for parameters theta. Same as self.signal(theta) method, but it does not sum the contributions of the individual modes, and instead returns a dict {m: pol_m for m in [-l_max,…,0,…,l_max]} where each contribution pol_m transforms as exp(-1j * m * phase_shift) under phase shifts.
Step 1: Generate polarizations.
Step 2: Project polarizations onto detectors; optionally (depending on self.whiten) whiten and scale.
- Parameters:
theta (dict) – Signal parameters. Includes intrinsic parameters to be passed to waveform generator, and extrinsic parameters for detector projection.
- Returns:
- keys:
waveform: GW strain signal for each detector, with individual contributions {m: pol_m for m in [-l_max,…,0,…,l_max]}
extrinsic_parameters: {}
parameters: waveform parameters
asd (if set): amplitude spectral density for each detector
- Return type:
dict
- property whiten
Bool specifying whether to whiten (and scale) generated signals.
- class dingo.gw.injection.Injection(prior, **gwsignal_kwargs)
Bases:
GWSignal
Produces injections of signals (with random or specified parameters) into stationary Gaussian noise. Output is not whitened.
- Parameters:
prior (PriorDict) – Prior used for sampling random parameters.
gwsignal_kwargs – Arguments to be passed to GWSignal base class.
- classmethod from_posterior_model_metadata(metadata)
Instantiate an Injection based on a posterior model. The prior, waveform settings, etc., will all be consistent with what the model was trained with.
- Parameters:
metadata (dict) – Dict which you can get via PosteriorModel.metadata
- injection(theta)
Generate an injection based on specified parameters.
This is a signal + noise consistent with the amplitude spectral density in self.asd. If self.asd is an ASDDataset, then it uses a random ASD from this dataset.
Data are not whitened.
- Parameters:
theta (dict) – Parameters used for injection.
- Returns:
- keys:
waveform: data (signal + noise) in each detector
extrinsic_parameters: {}
parameters: waveform parameters
asd (if set): amplitude spectral density for each detector
- Return type:
dict
- random_injection()
Generate a random injection.
This is a signal + noise consistent with the amplitude spectral density in self.asd. If self.asd is an ASDDataset, then it uses a random ASD from this dataset.
Data are not whitened.
- Returns:
- keys:
waveform: data (signal + noise) in each detector
extrinsic_parameters: {}
parameters: waveform parameters
asd (if set): amplitude spectral density for each detector
- Return type:
dict
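As a minimal sketch of the injection workflow, assuming metadata was obtained from a trained model's PosteriorModel.metadata attribute and asd_dataset is an ASDDataset (or a dict of ASD arrays per detector):

from dingo.gw.injection import Injection

injection_generator = Injection.from_posterior_model_metadata(metadata)
injection_generator.asd = asd_dataset
sample = injection_generator.random_injection()
# sample["waveform"] holds signal + noise per detector;
# sample["parameters"] holds the randomly drawn source parameters.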
dingo.gw.likelihood module
- class dingo.gw.likelihood.StationaryGaussianGWLikelihood(wfg_kwargs, wfg_domain, data_domain, event_data, t_ref=None, time_marginalization_kwargs=None, phase_marginalization_kwargs=None, calibration_marginalization_kwargs=None, phase_grid=None)
Bases:
GWSignal, Likelihood
Implements GW likelihood for stationary, Gaussian noise.
- Parameters:
wfg_kwargs (dict) – Waveform generator parameters (at least approximant and f_ref).
wfg_domain (dingo.gw.domains.Domain) – Domain used for waveform generation. This can potentially deviate from the final domain, having a wider frequency range needed for waveform generation.
data_domain (dingo.gw.domains.Domain) – Domain object for event data.
event_data (dict) – GW data. Contains strain data in event_data[“waveforms”] and asds in event_data[“asds”].
t_ref (float) – Reference time; true geocent time for GW is t_ref + theta[“geocent_time”].
time_marginalization_kwargs (dict) – Time marginalization parameters. If None, no time marginalization is used.
calibration_marginalization_kwargs (dict) – Calibration marginalization parameters. If None, no calibration marginalization is used.
phase_marginalization_kwargs (dict) – Phase marginalization parameters. If None, no phase marginalization is used.
- d_inner_h_complex(theta)
Calculate the complex inner product (d | h(theta)) between the stored data d and a simulated waveform with given parameters theta.
- Parameters:
theta (dict) – Parameters at which to evaluate h.
- Returns:
Inner product
- Return type:
complex
- d_inner_h_complex_multi(theta: DataFrame, num_processes: int = 1) ndarray
Calculate the complex inner product (d | h(theta)) between the stored data d and a simulated waveform with given parameters theta. Works with multiprocessing.
- Parameters:
theta (pd.DataFrame) – Parameters at which to evaluate h.
num_processes (int) – Number of parallel processes to use.
- Returns:
Inner products
- Return type:
np.ndarray (complex)
- initialize_time_marginalization(t_lower, t_upper, n_fft=1)
Initialize time marginalization. Time marginalization can be performed via FFT, which is very fast. However, this limits the time resolution to delta_t = 1/self.data_domain.f_max. To allow for a finer time resolution, we compute the time-marginalized likelihood n_fft times via FFT, on a grid of n_fft time shifts [0, delta_t/n_fft, 2*delta_t/n_fft, …, (n_fft-1)*delta_t/n_fft], and average over the time shifts. The effective time resolution is thus
delta_t_eff = delta_t / n_fft = 1 / (f_max * n_fft).
Note: Time marginalization is only implemented for uniform time priors.
- Parameters:
t_lower (float) – Lower time bound of the uniform time prior.
t_upper (float) – Upper time bound of the uniform time prior.
n_fft (int = 1) – Size of grid for FFT for time marginalization.
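A short usage sketch, assuming likelihood is a StationaryGaussianGWLikelihood instance and the time prior is uniform over [-0.1, 0.1] seconds (placeholder values):

# Marginalize over the uniform time prior with a 16x finer effective time grid.
likelihood.initialize_time_marginalization(t_lower=-0.1, t_upper=0.1, n_fft=16)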
- log_likelihood(theta)
- log_likelihood_phase_grid(theta, phases=None)
- dingo.gw.likelihood.build_stationary_gaussian_likelihood(metadata, event_dataset=None, time_marginalization_kwargs=None)
Build a StationaryGaussianGWLikelihood object from the metadata.
- Parameters:
metadata (dict) – Metadata from a stored dingo parameter samples file. Typically accessed via pd.read_pickle("/path/to/dingo-output.pkl").metadata.
event_dataset (str = None) – Path to event dataset for caching. If None, don’t cache.
time_marginalization_kwargs (dict = None) – Forwarded to the likelihood.
- Returns:
likelihood – likelihood object
- Return type:
StationaryGaussianGWLikelihood
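A hedged sketch of rebuilding the likelihood from a saved Dingo samples file (file names are placeholders):

import pandas as pd
from dingo.gw.likelihood import build_stationary_gaussian_likelihood

samples = pd.read_pickle("/path/to/dingo-output.pkl")
likelihood = build_stationary_gaussian_likelihood(
    samples.metadata,
    event_dataset="event_data.hdf5",  # optional cache file
)
log_l = likelihood.log_likelihood(theta)  # theta: dict of source parameters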
- dingo.gw.likelihood.get_wfg(wfg_kwargs, data_domain, frequency_range=None)
Set up waveform generator from wfg_kwargs. The domain of the wfg is primarily determined by the data domain, but a new (larger) frequency range can be specified if this is necessary for the waveforms to be generated successfully (e.g., for EOB waveforms which require a sufficiently small f_min and sufficiently large f_max).
- Parameters:
wfg_kwargs (dict) – Waveform generator parameters.
data_domain (dingo.gw.domains.Domain) – Domain of event data, with bounds determined by likelihood integral.
frequency_range (dict = None) –
Frequency range for waveform generator. If None, that of data domain is used, which corresponds to the bounds of the likelihood integral. Possible keys:
- ’f_start’: float
Frequency at which to start the waveform generation. Overrides f_start in metadata[“model”][“dataset_settings”][“waveform_generator”].
- ’f_end’: float
Frequency at which to end the waveform generation.
- Returns:
wfg – Waveform generator object.
- Return type:
WaveformGenerator
- dingo.gw.likelihood.inner_product(a, b, min_idx=0, delta_f=None, psd=None)
Compute the inner product between two complex arrays. There are two modes: if the data a and b are not whitened, delta_f and the psd must be provided; if delta_f and psd are not provided, a and b are assumed to be whitened already (i.e., whitened as d -> d * sqrt(4 delta_f / psd)).
Note: sum is only taken along axis 0 (which is assumed to be the frequency axis), while other axes are preserved. This is e.g. useful when evaluating kappa2 on a phase grid.
- Parameters:
a (np.ndarray) – First array with frequency domain data.
b (np.ndarray) – Second array with frequency domain data.
min_idx (int = 0) – Truncation of likelihood integral, index of lowest frequency bin to consider.
delta_f (float) – Frequency resolution of the data. If None, a and b are assumed to be whitened and the inner product is computed without further whitening.
psd (np.ndarray = None) – PSD of the data. If None, a and b are assumed to be whitened and the inner product is computed without further whitening.
- Returns:
inner_product
- Return type:
float
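To make the weighting explicit, here is a minimal numpy sketch of this inner product (not the library implementation):

import numpy as np

def inner_product_sketch(a, b, min_idx=0, delta_f=None, psd=None):
    # <a|b> = 4 * delta_f * sum_k Re[conj(a_k) * b_k] / PSD_k for non-whitened data;
    # for already-whitened data (psd is None), the weighting is omitted.
    integrand = (a.conj() * b).real
    if psd is not None:
        integrand = 4 * delta_f * integrand / psd
    # sum only along the frequency axis (axis 0); other axes are preserved
    return np.sum(integrand[min_idx:], axis=0)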
- dingo.gw.likelihood.inner_product_complex(a, b, min_idx=0, delta_f=None, psd=None)
Same as inner product, but without taking the real part. Retaining phase information is useful for the phase-marginalized likelihood. For further documentation see inner_product function.
- dingo.gw.likelihood.main()
dingo.gw.ls_cli module
- dingo.gw.ls_cli.determine_dataset_type(file_name)
- dingo.gw.ls_cli.ls()
dingo.gw.prior module
- class dingo.gw.prior.BBHExtrinsicPriorDict(dictionary=None, filename=None, aligned_spin=False, conversion_function=None)
Bases:
BBHPriorDict
This class is the same as BBHPriorDict except that it does not require mass parameters.
It also contains a method for estimating the standardization parameters.
- TODO:
Add support for zenith/azimuth
Defaults?
Initialises a prior set for binary black holes.
- Parameters:
dictionary (dict, optional) – See superclass
filename (str, optional) – See superclass
conversion_function (func) – Function to convert between sampled parameters and constraints. By default this generates many additional parameters, see BBHPriorDict.default_conversion_function
- default_conversion_function(sample)
Default parameter conversion function for BBH signals.
This generates:
- the parameters passed to source.lal_binary_black_hole
- all mass parameters
It does not generate:
- component spins
- source-frame parameters
- Parameters:
sample (dict) – Dictionary to convert
- Returns:
sample – Same as input
- Return type:
dict
- mean_std(keys=[], sample_size=50000, force_numerical=False)
Calculate the mean and standard deviation over the prior.
- Parameters:
keys (list(str)) – A list of desired parameter names
sample_size (int) – For nonanalytic priors, number of samples to use to estimate the result.
force_numerical (bool (False)) – Whether to force a numerical estimation of result, even when analytic results are available (useful for testing)
- Returns:
Dictionaries for the means and standard deviations.
TODO: Fix for constrained priors. Shouldn't be an issue for extrinsic parameters.
- dingo.gw.prior.build_prior_with_defaults(prior_settings: Dict[str, str])
Generate BBHPriorDict based on dictionary of prior settings, allowing for default values.
- Parameters:
prior_settings (Dict) –
A dictionary containing prior definitions for intrinsic parameters. Allowed values for each parameter are:
’default’ to use a default prior
- a string for a custom prior, e.g.,
”Uniform(minimum=10.0, maximum=80.0, name=None, latex_label=None, unit=None, boundary=None)”
Depending on the particular prior choices, the dimensionality of a parameter sample obtained from the returned GWPriorDict will vary.
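For illustration, a sketch of a prior_settings dictionary mixing defaults with a custom prior (the parameter names and bounds are placeholders):

from dingo.gw.prior import build_prior_with_defaults

prior_settings = {
    "chirp_mass": "Uniform(minimum=25.0, maximum=100.0)",
    "mass_ratio": "default",
    "theta_jn": "default",
    "phase": "default",
}
prior = build_prior_with_defaults(prior_settings)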
- dingo.gw.prior.split_off_extrinsic_parameters(theta)
Split theta into intrinsic and extrinsic parameters.
- Parameters:
theta (dict) – BBH parameters. Includes intrinsic parameters to be passed to waveform generator, and extrinsic parameters for detector projection.
- Returns:
theta_intrinsic (dict) – BBH intrinsic parameters.
theta_extrinsic (dict) – BBH extrinsic parameters.
dingo.gw.result module
- class dingo.gw.result.Result(**kwargs)
Bases:
Result
A dataset class to hold a collection of gravitational-wave parameter samples and perform various operations on them.
Compared to the base class, this class implements the domain, prior, and likelihood. It also includes a method for sampling the binary reference phase parameter based on the other parameters and the likelihood.
- Attributes:
- samples : pd.DataFrame
Contains parameter samples, as well as (possibly) log_prob, log_likelihood, weights, log_prior, delta_log_prob_target.
- domain : Domain
The domain of the data (e.g., FrequencyDomain), needed for calculating likelihoods.
- prior : PriorDict
The prior distribution, used for importance sampling.
- likelihood : Likelihood
The Likelihood object, needed for importance sampling.
- context : dict
Context data from which the samples were produced (e.g., strain data, ASDs).
- metadata : dict
Metadata inherited from the Sampler object. This describes the neural networks and sampling settings used.
- event_metadata : dict
Metadata for the event analyzed, including time, data conditioning, channel, and detector information.
- log_evidence : float
Calculated log(evidence) after importance sampling.
- log_evidence_std : float (property)
Standard deviation of the log(evidence).
- effective_sample_size, n_eff : float (property)
Number of effective samples, (sum_i w_i)^2 / sum_i w_i^2.
- sample_efficiency : float (property)
Number of effective samples / number of samples.
- synthetic_phase_kwargs : dict
kwargs describing the synthetic phase sampling.
To construct, provide either file_name, or a dictionary containing data and settings entries, or neither.
- Parameters:
file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The data keys should be the same as save_keys
data_keys (list) – Variables that should be saved / loaded. This allows the class to store additional variables beyond those that are saved. Typically, this list would be provided by any subclass.
- property approximant
- property calibration_marginalization_kwargs
- dataset_type = 'gw_result'
- property f_ref
- get_samples_bilby_phase()
Convert the spin angles phi_jl and theta_jn to account for a difference in phase definition compared to Bilby.
- Returns:
Samples
- Return type:
pd.DataFrame
- property interferometers
- property pesummary_prior
The prior in a form suitable for PESummary.
By convention, Dingo stores all times relative to a reference time, typically the trigger time for an event. The prior returned here corrects for that offset to be consistent with other codes.
- property pesummary_samples
Samples in a form suitable for PESummary.
These samples are adjusted to undo certain conventions used internally by Dingo:
- Times are corrected by the reference time t_ref.
- Samples are unweighted, using a fixed random seed for sampling importance resampling.
- The spin angles phi_jl and theta_jn are transformed to account for a difference in phase definition.
- Some columns are dropped: delta_log_prob_target, log_prob.
- property phase_marginalization_kwargs
- sample_synthetic_phase(synthetic_phase_kwargs, inverse: bool = False)
Sample a synthetic phase for the waveform. This is a post-processing step applicable to samples theta in the full parameter space, except for the phase parameter (i.e., 14D samples). This step adds a phase parameter to the samples based on likelihood evaluations.
A synthetic phase is sampled as follows.
1. Compute and cache the modes for the waveform mu(theta, phase=0) for phase 0, organized such that each contribution m transforms as exp(-i * m * phase).
2. Compute the likelihood on a phase grid, by computing mu(theta, phase) from the cached modes. In principle this likelihood is exact; however, it can deviate slightly from the likelihood computed without cached modes for various technical reasons (e.g., slightly different windowing of cached modes compared to the full waveform when transforming the TD waveform to FD). These small deviations can be fully accounted for by importance sampling. Note: when approximation_22_mode=True, the entire waveform is assumed to transform as exp(2i*phase), in which case the likelihood is only exact if the waveform is fully dominated by the (2, 2) mode.
3. Build a synthetic conditional phase distribution based on this grid. We use an interpolated prior distribution bilby.core.prior.Interped, such that we can sample and also evaluate the log_prob. We add a constant background with weight self.synthetic_phase_kwargs['uniform_weight'] to the distribution to make sure that we keep a mass-covering property. With this, the importance sampling will yield exact results even when the synthetic phase conditional is just an approximation.
Besides adding phase samples to self.samples[‘phase’], this method also modifies self.samples[‘log_prob’] by adding the log_prob of the synthetic phase conditional.
This method modifies self.samples in place.
- Parameters:
synthetic_phase_kwargs (dict) –
- This should consist of the kwargs:
n_grid, approximation_22_mode (optional), num_processes (optional), uniform_weight (optional). An example dictionary is sketched after this entry.
inverse (bool, default False) – Whether to apply the inverse transformation instead. This is used prior to calculating the log_prob. In inverse mode, the posterior probability over phase is calculated for the given samples and stored in self.samples['log_prob'].
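An example synthetic_phase_kwargs dictionary (values are placeholders), assuming result is a dingo.gw.result.Result whose samples lack the phase parameter:

synthetic_phase_kwargs = {
    "n_grid": 5001,           # size of the phase grid
    "uniform_weight": 0.01,   # constant background weight (optional)
    "num_processes": 4,       # parallel likelihood evaluations (optional)
}
result.sample_synthetic_phase(synthetic_phase_kwargs)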
- property synthetic_phase_kwargs
- property t_ref
- property time_marginalization_kwargs
- update_prior(prior_update)
Update the prior based on a new dict of priors. Use the existing prior for parameters not included in the new dict.
If class samples have not been importance sampled, then save new sample weights based on the new prior. If class samples have been importance sampled, then update the weights.
- Parameters:
prior_update (dict) – Priors to update. This should be of the form {key : prior_str}, where prior_str is a string that can instantiate a prior via PriorDict(prior_update). The prior_update is provided in this form so that it can be properly saved with the Result and later instantiated.
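A short sketch of loading a saved result and updating a single prior (the file name and prior string are illustrative):

from dingo.gw.result import Result

result = Result(file_name="result.hdf5")
result.update_prior({"luminosity_distance": "Uniform(minimum=100.0, maximum=2000.0)"})
samples = result.pesummary_samples  # unweighted samples in PESummary conventions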
dingo.gw.temporary_debug_utils module
- dingo.gw.temporary_debug_utils.save_training_injection(outname, pm, data, idx=0)
For debugging: extract a training injection. To be used inside train or test loop.
Module contents
dingo.pipe package
Subpackages
dingo.pipe.nodes package
Submodules
dingo.pipe.nodes.generation_node module
- class dingo.pipe.nodes.generation_node.GenerationNode(inputs, importance_sampling=False, **kwargs)
Bases:
GenerationNode
Node for data generation jobs
- Parameters:
- inputs: bilby_pipe.main.MainInput
The user-defined inputs
- trigger_time: float
The trigger time to use in generating analysis data
- idx: int
The index of the data-generation job, used to label data products
- dag: bilby_pipe.dag.Dag
The dag structure
- parent: bilby_pipe.job_creation.node.Node (optional)
Any job to set as the parent to this job - used to enforce dependencies
- property event_data_file
- property executable
- property job_name
- setup_arguments(**kwargs)
dingo.pipe.nodes.importance_sampling_node module
dingo.pipe.nodes.merge_node module
dingo.pipe.nodes.pe_summary_node module
- class dingo.pipe.nodes.pe_summary_node.PESummaryNode(inputs, merged_node_list, generation_node_list, dag)
Bases:
PESummaryNode
dingo.pipe.nodes.plot_node module
dingo.pipe.nodes.sampling_node module
Module contents
Submodules
dingo.pipe.dag_creator module
- dingo.pipe.dag_creator.generate_dag(inputs, model_args)
- dingo.pipe.dag_creator.get_parallel_list(inputs)
- dingo.pipe.dag_creator.get_trigger_time_list(inputs)
Returns a list of GPS trigger times for each data segment
dingo.pipe.data_generation module
- class dingo.pipe.data_generation.DataGenerationInput(args, unknown_args, create_data=True)
Bases:
DataGenerationInput
- property event_data_file
- property importance_sampling_updates
- save_hdf5()
Save frequency-domain strain and ASDs as DingoDataset HDF5 format.
- dingo.pipe.data_generation.create_generation_parser()
Data generation parser creation
- dingo.pipe.data_generation.main()
Data generation main logic
dingo.pipe.default_settings module
dingo.pipe.dingo_result module
- dingo.pipe.dingo_result.main()
dingo.pipe.importance_sampling module
Script to importance sample based on Dingo samples. Based on bilby_pipe data analysis script.
- class dingo.pipe.importance_sampling.ImportanceSamplingInput(args, unknown_args)
Bases:
Input
- property calibration_marginalization_kwargs
- property importance_sampling_settings
- property priors
Read in and compose the prior at run-time
- run_sampler()
- dingo.pipe.importance_sampling.create_sampling_parser()
Data analysis parser creation
- dingo.pipe.importance_sampling.main()
Data analysis main logic
dingo.pipe.main module
- class dingo.pipe.main.MainInput(args, unknown_args, importance_sampling_updates)
Bases:
MainInput
- property priors
Read in and compose the prior at run-time
- property request_cpus_importance_sampling
- dingo.pipe.main.fill_in_arguments_from_model(args)
- dingo.pipe.main.main()
- dingo.pipe.main.write_complete_config_file(parser, args, inputs, input_cls=<class 'dingo.pipe.main.MainInput'>)
dingo.pipe.parser module
- class dingo.pipe.parser.StoreBoolean(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)
Bases:
Action
argparse class for robust handling of booleans with configargparse
When using configargparse, if the argument is set up with action=”store_true” but the default is set to True, then there is no way to switch the parameter off in the config file. To resolve this, this class handles the boolean properly.
- dingo.pipe.parser.create_parser(top_level=True)
Creates the BilbyArgParser for dingo_pipe
- Parameters:
top_level – If true, the parser is to be used at the top level with requirement checking etc.; otherwise it is an internal call and these checks are skipped.
- Returns:
parser – Argument parser
- Return type:
BilbyArgParser instance
dingo.pipe.plot module
- dingo.pipe.plot.create_parser()
Generate a parser for the plot script
- Returns:
parser – A parser with all the default options already added
- Return type:
BilbyArgParser
- dingo.pipe.plot.main()
dingo.pipe.sampling module
Script to sample from a Dingo model. Based on bilby_pipe data analysis script.
- class dingo.pipe.sampling.SamplingInput(args, unknown_args)
Bases:
Input
- property density_recovery_settings
- run_sampler()
- dingo.pipe.sampling.create_sampling_parser()
Data analysis parser creation
- dingo.pipe.sampling.main()
Data analysis main logic
dingo.pipe.utils module
Module contents
Module contents
References
Dingo is based on a series of papers developing neural posterior estimation for gravitational waves, starting from proof of concept [1], to inclusion of all 15 parameters and analysis of real data [2], noise conditioning and full amortization [3], and group-equivariant NPE [4]. Dingo results are augmented with importance sampling in [5]. Finally, training with forecasted noise (needed for training prior to an observing run) is described in [6].
[1] Stephen R. Green, Christine Simpson, and Jonathan Gair. Gravitational-wave parameter estimation with autoregressive neural network flows. Phys. Rev. D, 102:104057, 2020. arXiv:2002.07656, doi:10.1103/PhysRevD.102.104057.
[2] Stephen R. Green and Jonathan Gair. Complete parameter inference for GW150914 using deep learning. Mach. Learn. Sci. Tech., 2(3):03LT01, 2021. arXiv:2008.03312, doi:10.1088/2632-2153/abfaed.
[3] Maximilian Dax, Stephen R. Green, Jonathan Gair, Jakob H. Macke, Alessandra Buonanno, and Bernhard Schölkopf. Real-Time Gravitational Wave Science with Neural Posterior Estimation. Phys. Rev. Lett., 127(24):241103, 2021. arXiv:2106.12594, doi:10.1103/PhysRevLett.127.241103.
[4] Maximilian Dax, Stephen R. Green, Jonathan Gair, Michael Deistler, Bernhard Schölkopf, and Jakob H. Macke. Group equivariant neural posterior estimation. International Conference on Learning Representations, 2022. arXiv:2111.13139.
[5] Maximilian Dax, Stephen R. Green, Jonathan Gair, Michael Pürrer, Jonas Wildberger, Jakob H. Macke, Alessandra Buonanno, and Bernhard Schölkopf. Neural Importance Sampling for Rapid and Reliable Gravitational-Wave Inference. arXiv preprint, October 2022. arXiv:2210.05686.
[6] Jonas Wildberger, Maximilian Dax, Stephen R. Green, Jonathan Gair, Michael Pürrer, Jakob H. Macke, Alessandra Buonanno, and Bernhard Schölkopf. Adapting to noise distribution shifts in flow-based gravitational-wave inference. arXiv preprint, November 2022. arXiv:2211.08801.
If you use Dingo in your work, we ask that you please cite at least [3].
Contributors to the code are listed in AUTHORS.md. We thank Vivien Raymond and Rory Smith for acting as LIGO-Virgo-KAGRA (LVK) code reviewers. Dingo makes use of many LVK software tools, including Bilby, bilby_pipe, and LALSimulation, as well as third party tools such as PyTorch and nflows.
Contact
For questions or comments please contact Maximilian Dax or Stephen Green.