Inference
With a trained network, inference can be performed on injections or real data. For injections, see the discussion in the examples. For real data, we recommend using dingo_pipe.
The Sampler class
Inference uses the Sampler class, or more specifically, the GWSampler class,
which inherits from it.
- class dingo.gw.inference.gw_samplers.GWSampler(**kwargs)
Bases:
GWSamplerMixin, Sampler
Sampler for gravitational-wave inference using neural posterior estimation. Augments the base class by defining transform_pre and transform_post to prepare data for the inference network.
- transform_pre :
Decimates data (if necessary and using MultibandedFrequencyDomain).
Whitens strain.
Repackages strain data and the inverse ASDs (suitably scaled) into a torch tensor.
- transform_post :
Extracts the desired inference parameters from the network output (array-like), de-standardizes them, and repackages them as a dict.
Also mixes in GW functionality for building the domain and correcting the reference time.
Allows for conditional and unconditional models, and draws samples from the model based on (optional) context data.
This is intended for use either as a standalone sampler, or as a sampler producing initial sample points for a GNPE sampler.
- Parameters:
kwargs – Keyword arguments that are forwarded to the superclass.
- property context
Data on which to condition the sampler. For injections, there should be a ‘parameters’ key with truth values.
- property event_metadata
Metadata for data analyzed. Can in principle influence any post-sampling parameter transformations (e.g., sky position correction), as well as the likelihood detector positions.
- log_prob(samples: DataFrame | dict) → ndarray
Calculate the model log probability at specific sample points.
- Parameters:
samples (pd.DataFrame | dict) – Sample points at which to calculate the log probability.
- Return type:
np.array of log probabilities.
- run_sampler(num_samples: int, batch_size: int | None = None)
Generates samples and stores them in self.samples. Conditions the model on self.context if appropriate (i.e., if the model is not unconditional).
If possible, it also calculates the log_prob and saves it as a column in self.samples. When using GNPE it is not possible to obtain the log_prob due to the many Gibbs iterations. However, in the case of just one iteration, and when starting from a sampler for the proxy, the GNPESampler does calculate the log_prob.
Allows for batched sampling, e.g., if limited by GPU memory. Actual sampling for each batch is performed by _run_sampler(), which will differ for Sampler and GNPESampler.
- Parameters:
num_samples (int) – Number of samples requested.
batch_size (int, optional) – Batch size for sampler.
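The batched sampling described above can be pictured with a minimal sketch. This is not Dingo's implementation: `batch_sizes` is a hypothetical helper, and it assumes that `batch_size=None` means all samples are drawn in a single batch.

```python
def batch_sizes(num_samples, batch_size=None):
    """Split num_samples into a list of per-batch sample counts.

    Hypothetical helper for illustration only. Assumes that
    batch_size=None means "draw everything in one batch".
    """
    if batch_size is not None and batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if batch_size is None or batch_size >= num_samples:
        return [num_samples]
    full_batches, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full_batches
    if remainder:
        # The final batch holds whatever is left over.
        sizes.append(remainder)
    return sizes


sizes = batch_sizes(25, batch_size=10)  # three batches: 10, 10, and 5 samples
```

A smaller batch size lowers peak GPU memory use at the cost of more forward passes; the total number of samples is unchanged.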
A GWSampler is instantiated from a PosteriorModel. To draw samples, the context property must first be set to the data to be analyzed. For gravitational waves this should be a dictionary with the following keys:
- waveform
(unwhitened) strain data in each detector
- asds
noise ASDs estimated in each detector at the time of the event
- parameters (optional)
for injections, the true parameters of the signal (for saving; ignored for sampling)
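As a rough illustration of this layout, the sketch below builds a toy context and runs a basic consistency check on it. The per-detector sub-dictionaries keyed by "H1" and "L1", the placeholder values, the parameter name `chirp_mass`, and the helper `check_context` are all assumptions made for the example; real entries would be frequency-domain arrays.

```python
REQUIRED_KEYS = ("waveform", "asds")


def check_context(context):
    """Return the sorted detector names after basic consistency checks.

    Hypothetical helper, not part of Dingo.
    """
    missing = [key for key in REQUIRED_KEYS if key not in context]
    if missing:
        raise KeyError(f"context is missing required keys: {missing}")
    detectors = sorted(context["waveform"])
    if detectors != sorted(context["asds"]):
        raise ValueError("waveform and asds must cover the same detectors")
    return detectors


# Placeholder values standing in for real strain and ASD arrays.
context = {
    "waveform": {"H1": [0j, 0j], "L1": [0j, 0j]},          # unwhitened strain
    "asds": {"H1": [1e-23, 1e-23], "L1": [1e-23, 1e-23]},  # noise ASDs
    "parameters": {"chirp_mass": 30.0},                    # optional, injections only
}
detectors = check_context(context)
```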
Once this is set, the run_sampler() method draws the requested samples from the posterior conditioned on the context. It applies some post-processing (to de-standardize the data, and to correct for the rotation of the Earth between the network reference time and the event time), and then stores the result as a DataFrame in GWSampler.samples. The DataFrame contains columns for each inference parameter, as well as the log probability of the sample under the posterior model.
The GWSampler.metadata attribute contains all settings that went into producing the samples, including training datasets, network training settings, event metadata (for real events) and possible injection parameters. Finally, the to_samples_dataset() method returns a SamplesDataset containing all results, including the samples, settings, and context. This can be saved easily as HDF5.
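The order of operations described above (set the context, run the sampler, collect the results) can be sketched as follows. The function is duck-typed: it relies only on the `context`, `run_sampler()`, `samples`, and `to_samples_dataset()` members named in this section. `StubSampler` is a stand-in invented for this example so that it runs without Dingo or a trained network; it is not a Dingo class, and its return values are placeholders.

```python
def sample_posterior(sampler, context, num_samples, batch_size=None):
    """Set the context, draw samples, and return the results."""
    sampler.context = context
    sampler.run_sampler(num_samples=num_samples, batch_size=batch_size)
    return sampler.samples, sampler.to_samples_dataset()


class StubSampler:
    """Stand-in exposing only the members used above; NOT a Dingo class."""

    def __init__(self):
        self.context = None
        self.samples = None

    def run_sampler(self, num_samples, batch_size=None):
        if self.context is None:
            raise RuntimeError("context must be set before sampling")
        # A real sampler stores a DataFrame; a list of dicts stands in here.
        self.samples = [{"log_prob": 0.0} for _ in range(num_samples)]

    def to_samples_dataset(self):
        # Placeholder for the object bundling samples, settings, and context.
        return {"samples": self.samples, "context": self.context}


toy_context = {"waveform": {"H1": [0j]}, "asds": {"H1": [1e-23]}}
samples, dataset = sample_posterior(StubSampler(), toy_context, num_samples=3)
```

With Dingo installed, a GWSampler built from a PosteriorModel would take the place of `StubSampler()`.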
Injections
Injections (i.e., simulated data) are produced using the Injection class. It includes options for fixed or random parameters (drawn from a prior), and it returns injections in a format that can be directly set as GWSampler.context.
- class dingo.gw.injection.Injection(prior, **gwsignal_kwargs)
Bases:
GWSignal
Produces injections of signals (with random or specified parameters) into stationary Gaussian noise. Output is not whitened.
- Parameters:
prior (PriorDict) – Prior used for sampling random parameters.
gwsignal_kwargs – Arguments to be passed to GWSignal base class.
- classmethod from_posterior_model_metadata(metadata)
Instantiate an Injection based on a posterior model. The prior, waveform settings, etc., will all be consistent with what the model was trained with.
- Parameters:
metadata (dict) – Metadata dictionary of the posterior model, available as PosteriorModel.metadata.
- injection(theta)
Generate an injection based on specified parameters.
This is a signal + noise consistent with the amplitude spectral density in self.asd. If self.asd is an ASDDataset, then it uses a random ASD from this dataset.
Data are not whitened.
- Parameters:
theta (dict) – Parameters used for injection.
- Returns:
- keys:
waveform: data (signal + noise) in each detector
extrinsic_parameters: {}
parameters: waveform parameters
asd (if set): amplitude spectral density for each detector
- Return type:
dict
- random_injection()
Generate a random injection.
This is a signal + noise consistent with the amplitude spectral density in self.asd. If self.asd is an ASDDataset, then it uses a random ASD from this dataset.
Data are not whitened.
- Returns:
- keys:
waveform: data (signal + noise) in each detector
extrinsic_parameters: {}
parameters: waveform parameters
asd (if set): amplitude spectral density for each detector
- Return type:
dict
Hint
The convenience class method from_posterior_model_metadata() instantiates an Injection with all of the settings that went into the posterior model. Pass it the PosteriorModel.metadata dictionary. The resulting injections should match the characteristics of the training data (waveform approximant, data conditioning, noise characteristics, etc.), which is very useful for testing a trained model.
Important
Repeated calls to Injection.injection(), even with the same parameters, will produce injections with different noise realizations (which therefore lead to different posteriors). For repeated analyses of the exact same injection (e.g., with different models or codes) it is necessary to either save the injection for re-use or fix a random seed.
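Both remedies can be illustrated with a toy model that uses only the Python standard library. `toy_injection` is a hypothetical stand-in (a constant signal plus Gaussian noise), not Dingo's Injection class. Which random number generator Dingo draws its noise from is not stated here, so the seeding call needed in practice is an assumption to check against the installed version.

```python
import pickle
import random


def toy_injection(theta, rng):
    """Toy stand-in for an injection: constant signal plus Gaussian noise."""
    signal = [theta["amplitude"]] * 4
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    return {
        "waveform": [s + n for s, n in zip(signal, noise)],
        "parameters": dict(theta),
    }


theta = {"amplitude": 1.0}

# Same parameters, one continuing random stream: the noise differs per call.
rng = random.Random(1234)
first = toy_injection(theta, rng)
second = toy_injection(theta, rng)

# Option 1: fix the seed again to regenerate the same noise realization.
repeat = toy_injection(theta, random.Random(1234))

# Option 2: save the injection and reload it for later analyses.
restored = pickle.loads(pickle.dumps(first))
```

Saving is the more robust choice when the same injection must be analyzed by different codes, since it does not depend on every code consuming random numbers in the same order.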