The Result class

The Result class stores the output of a Sampler run, namely a collection of samples. It contains several methods for operating on the samples, including for importance sampling, plotting, and density recovery:

class dingo.gw.result.Result(**kwargs)

Bases: Result

A dataset class to hold a collection of gravitational-wave parameter samples and perform various operations on them.

Compared to the base class, this class implements the domain, prior, and likelihood. It also includes a method for sampling the binary reference phase parameter based on the other parameters and the likelihood.

Attributes:

samplespd.Dataframe: Contains parameter samples, as well as (possibly) log_prob, log_likelihood, weights, log_prior, delta_log_prob_target.
domainDomain: The domain of the data (e.g., UniformFrequencyDomain), needed for calculating likelihoods.
priorPriorDict: The prior distribution, used for importance sampling.
likelihoodLikelihood: The Likelihood object, needed for importance sampling.
contextdict: Context data from which the samples were produced (e.g., strain data, ASDs).
metadatadict: Metadata inherited from the Sampler object. This describes the neural networks and sampling settings used.
event_metadatadict: Metadata for the event analyzed, including time, data conditioning, channel, and detector information.
log_evidencefloat: Calculated log(evidence) after importance sampling.
log_evidence_stdfloat (property): Standard deviation of the log(evidence)
effective_sample_size, n_efffloat (property): Number of effective samples, (sum_i w_i)^2 / sum_i w_i^2
sample_efficiencyfloat (property): Number of effective samples / Number of samples
synthetic_phase_kwargsdict: kwargs describing the synthetic phase sampling.

For constructing, provide either file_name, or dictionary containing data and settings entries, or neither.

Parameters:

file_name (str) – HDF5 file containing a dataset
dictionary (dict) – Contains settings and data entries. The data keys should be the same as save_keys
data_keys (list) – Variables that should be saved / loaded. This allows for class to store additional variables beyond those that are saved. Typically, this list would be provided by any subclass.
leave_on_disk_keys (Optional[list]) – Keys for which the values are not loaded into RAM when initializing the dataset. This reduces the memory footprint during training. Instead, the values are loaded from the HDF5 file during training.
dtype_map (Optional[dict]) – Mapping from group names to target numpy dtypes for loading. Passed to recursive_hdf5_load to enable direct dtype conversion during HDF5 read, avoiding intermediate memory allocation.

get_all_injection_credible_levels(keys: list[str] | None = None, weighted: bool = False)

Get credible levels for all parameters.

Adapted from Bilby.

Parameters:

keys (list, optional) – A list of keys for which return the credible levels, if None, defaults to search_parameter_keys
weighted (bool, optional) – Whether to use sample weights in calculating credible level.

Returns:

credible_levels – The credible levels at which the injected parameters are found.

Return type:

dict

get_injection_credible_level(parameter: str, weighted: bool = False)

Get the credible level of the injected parameter.

Calculated as CDF(injection value).

Adapted from Bilby.

Parameters:

parameter (str) – Parameter to get credible level for
weighted (bool, optional) – Whether to use sample weights in calculating credible level.

Returns:

float

Return type:

credible level

get_one_dimensional_median_and_error_bar(key: str, weighted: bool = False, fmt: str = '.2f', quantiles: list = [0.16, 0.84])

Calculate the median and the lower and upper error bars for a given parameter, based on the specified quantiles.

Parameters:

key (str) – The parameter key for which to calculate the median and error bar
weighted (bool, optional) – If True, use the importance weights stored in the “weights” column of the samples. Default is False.
fmt (str, ('.2f')) – Format string used for the values in the returned LaTeX string. Default is ‘.2f’.
quantiles (list, optional) – A length-2 list with the lower and upper quantiles defining the error bars. Default is [0.16, 0.84] (68% credible interval).

Returns:

summary – An object with attributes median, lower, upper, and string, where string is a LaTeX representation of the form \(median_{-lower}^{+upper}\).

Return type:

namedtuple

get_pesummary_samples(num_processes=1, resampling_method='clip+rejection')

Samples in a form suitable for PESummary.

These samples are adjusted to undo certain conventions used internally by Dingo:

Times are corrected by the reference time t_ref.

Samples are unweighted, using a fixed random seed for sampling importance

resampling. * The spin angles phi_jl and theta_jn are transformed to account for a difference in phase definition. * Some columns are dropped: delta_log_prob_target, log_prob

Parameters:

num_processes (int) – Number of processes for spin conversion.
resampling_method (str) – Method for producing unweighted samples from weighted ones. ‘clip+rejection’: clip extreme weights then rejection sample (default). ‘sir’: sampling importance resampling (old behavior).

get_samples_bilby_phase(num_processes=1)

Convert the spin angles phi_jl and theta_jn to account for a difference in phase definition compared to Bilby.

Parameters:: num_processes (int) – Number of parallel processes.
Returns:: Samples
Return type:: pd.DataFrame

importance_sample(num_processes: int = 1, **likelihood_kwargs)

Calculate importance weights for samples.

Importance sampling starts with samples have been generated from a proposal distribution q(theta), in this case a neural network model. Certain networks (i.e., non-GNPE) also provide the log probability of each sample, which is required for importance sampling.

Given the proposal, we re-weight samples according to the (un-normalized) target distribution, which we take to be the likelihood L(theta) times the prior pi(theta). This gives sample weights

w(theta) ~ pi(theta) L(theta) / q(theta),

where the overall normalization does not matter (and we take to have mean 1). Since q(theta) enters this expression, importance sampling is only possible when we know the log probability of each sample.

As byproducts, this method also estimates the evidence and effective sample size of the importance sampled points.

This method modifies the samples pd.DataFrame in-place, adding new columns for log_likelihood, log_prior, and weights. It also stores the log_evidence as an attribute.

Parameters:

num_processes (int) – Number of parallel processes to use when calculating likelihoods. (This is the most expensive task.)
likelihood_kwargs (dict) – kwargs that are forwarded to the likelihood constructor. E.g., options for marginalization.

classmethod merge(parts)

Merge several Result instances into one. Check that they are compatible, in the sense of having the same metadata. Finally, calculate a new log evidence for the combined result.

This is useful when recombining separate importance sampling jobs.

Parameters:: parts (list[Result]) – List of sub-Results to be combined.
Return type:: Combined Result.

parameter_subset(parameters)

Return a new object of the same type, with only a subset of parameters. Drops all other columns in samples DataFrame as well (e.g., log_prob, weights).

Parameters:: parameters (list) – List of parameters to keep.
Return type:: Result

property pesummary_prior

The prior in a form suitable for PESummary.

By convention, Dingo stores all times relative to a reference time, typically the trigger time for an event. The prior returned here corrects for that offset to be consistent with other codes.

plot_corner(parameters: list | None = None, filename: str = 'corner.pdf', **kwargs)

Generate a corner plot of the samples.

Parameters:

parameters (list[str]) – List of parameters to include. If None, include all parameters. (Default: None)
filename (str) – Where to save samples.
legend_font_size (int) – Font size of the legend.

plot_log_probs(filename='log_probs.png'): Make a scatter plot of the target versus proposal log probabilities. For the target, subtract off the log evidence.

plot_weights(filename='weights.png'): Make a scatter plot of samples weights vs log proposal.

print_summary(): Display the number of samples, and (if importance sampling is complete) the log evidence and number of effective samples.

rejection_sample(max_samples_per_draw: int = 1, clip_weights: bool = False, random_state=None)

Generate unweighted posterior samples from weighted ones via rejection sampling.

Each original sample contributes at most max_samples_per_draw copies to the output, so the result avoids the excessive duplication that sampling_importance_resampling() can produce for high-weight samples.

Algorithm (unbiased, maximum efficiency)

The weights are first scaled so that the largest weight equals max_samples_per_draw. Each sample i then contributes

floor(w_scaled[i]) copies deterministically (integer part), and
one additional copy with probability w_scaled[i] - floor(w_scaled[i]) (fractional part, a single Bernoulli draw).

The expected number of copies of sample i is therefore exactly w_scaled[i] ∝ w[i], which guarantees an unbiased representation of the posterior. Using the integer part deterministically (rather than rounding stochastically) maximises the expected total number of output samples for a given max_samples_per_draw.

Optional weight clipping

When clip_weights=True, the ceil(sqrt(N)) largest weights are replaced by their mean and the weights are re-normalized to mean 1 before rejection sampling. This number of clips is the theoretically optimal choice that yields asymptotically unbiased results [1]_. Using the mean (rather than the minimum) of the clipped group preserves their total weight, which minimises the bias introduced by clipping. The net effect is reduced weight variance and a larger expected number of output samples.

If the samples DataFrame has no weights column the samples are already unweighted and are returned unchanged.

Parameters:

max_samples_per_draw (int) – Maximum number of copies any single input sample may contribute to the output. Default is 1 (standard rejection sampling, no duplicates).
clip_weights (bool) – Whether to clip the ceil(sqrt(N)) largest weights to their mean before rejection sampling. Default is False.
random_state (int or None) – Seed for the random number generator, for reproducibility.

Returns:

Unweighted samples (the weights column is dropped).

Return type:

pd.DataFrame

References

reset_event(event_dataset)

Set the Result context and event_metadata based on an EventDataset.

If these attributes already exist, perform a comparison to check for changes. Update relevant objects appropriately. Note that setting context and event_metadata attributes directly would not perform these additional checks and updates.

Parameters:: event_dataset (EventDataset) – New event to be used for importance sampling.

sample_calibration_parameters(calibration_sampling_kwargs: dict)

Sample calibration parameters from the calibration prior and add them to the samples DataFrame. Also updates self.prior with the calibration priors and adjusts self.samples[“log_prob”] accordingly.

This should be called before importance_sample() when importance sampling over calibration uncertainty. The calibration prior log_prob is added to self.samples[“log_prob”] so that it is properly accounted for in the importance sampling weights.

After calling this method, each sample will have calibration parameters (e.g., recalib_H1_amplitude_0, recalib_H1_phase_0, etc.) that will be used by the likelihood to apply calibration corrections.

Parameters:

calibration_sampling_kwargs (dict) –

Calibration sampling parameters. Keys:

calibration_envelopedict: Dictionary of the form {“H1”: filepath, “L1”: filepath, …} with locations of calibration envelope files (.txt).
num_calibration_nodesint: Number of log-spaced frequency nodes for the calibration spline model.
correction_typestr or dict or None, default “data”: Whether envelopes are over eta (“data”) or alpha (“template”). Can be a string (applied to all detectors), a dict mapping ifo names to correction types, or None (uses defaults from CALIBRATION_CORRECTION_TYPE_LOOKUP).

sample_synthetic_phase(synthetic_phase_kwargs, inverse: bool = False)

Sample a synthetic phase for the waveform. This is a post-processing step applicable to samples theta in the full parameter space, except for the phase parameter (i.e., 14D samples). This step adds a phase parameter to the samples based on likelihood evaluations.

A synthetic phase is sampled as follows.

Compute and cache the modes for the waveform mu(theta, phase=0) for phase 0, organize them such that each contribution m transforms as exp(-i * m * phase).

Compute the likelihood on a phase grid, by computing mu(theta, phase) from the cached modes. In principle this likelihood is exact, however, it can deviate slightly from the likelihood computed without cached modes for various technical reasons (e.g., slightly different windowing of cached modes compared to full waveform when transforming TD waveform to FD). These small deviations can be fully accounted for by importance sampling. Note: when approximation_22_mode=True, the entire waveform is assumed to transform as exp(2i*phase), in which case the likelihood is only exact if the waveform is fully dominated by the (2, 2) mode.

Build a synthetic conditional phase distribution based on this grid. We use an interpolated prior distribution bilby.core.prior.Interped, such that we can sample and also evaluate the log_prob. We add a constant background with weight self.synthetic_phase_kwargs to the kde to make sure that we keep a mass-covering property. With this, the importance sampling will yield exact results even when the synthetic phase conditional is just an approximation.

Besides adding phase samples to self.samples[‘phase’], this method also modifies self.samples[‘log_prob’] by adding the log_prob of the synthetic phase conditional.

This method modifies self.samples in place.

Parameters:

synthetic_phase_kwargs (dict) –

This should consist of the kwargs
approximation_22_mode (optional) num_processes (optional) n_grid uniform_weight (optional)
inverse (bool, default False) – Whether to apply instead the inverse transformation. This is used prior to calculating the log_prob. In inverse mode, the posterior probability over phase is calculated for given samples. It is stored in self.samples[ ‘log_prob’].

sampling_importance_resampling(num_samples=None, random_state=None)

Generate unweighted posterior samples from weighted ones. New samples are sampled with probability proportional to the sample weight. Resampling is done with replacement, until the desired number of unweighted samples is obtained.

Parameters:

num_samples (int) – Number of samples to resample.
random_state (int or None) – Sampling seed.

Returns:

Unweighted samples

Return type:

pd.Dataframe

split(num_parts)

Split the Result into a set of smaller results. The samples are evenly divided among the sub-results. Additional information (metadata, context, etc.) are copied into each.

This is useful for splitting expensive tasks such as importance sampling across multiple jobs.

Parameters:: num_parts (int) – The number of parts to split the Result across.
Return type:: list of sub-Results.

train_unconditional_flow(parameters, nde_settings: dict, train_dir: str | None = None, threshold_std: float | None = inf)

Train an unconditional flow to represent the distribution of self.samples.

Parameters:

parameters (list) – List of parameters over which to train the flow. Can be a subset of the existing parameters.
nde_settings (dict) – Configuration settings for the neural density estimator.
train_dir (Optional[str]) – Where to save the output of network training, e.g., logs, checkpoints. If not provide, a temporary directory is used.
threshold_std (Optional[float]) – Drop samples more than threshold_std standard deviations away from the mean (in any parameter) before training the flow. This is meant to remove outlier samples.

Return type:

PosteriorModel

update_prior(prior_update)

Update the prior based on a new dict of priors. Use the existing prior for parameters not included in the new dict.

If class samples have not been importance sampled, then save new sample weights based on the new prior. If class samples have been importance sampled, then update the weights.

Parameters:: prior_update (dict) – Priors to update. This should be of the form {key : prior_str}, where str is a string that can instantiate a prior via PriorDict(prior_update). The prior_update is provided in this form so that it can be properly saved with the Result and later instantiated.

Following a sampler run, a Result can be obtained using Sampler.to_result(). Since Result inherits from DingoDataset it also possesses to_file() and to_dictionary() methods for saving samples and associated metadata (including context data, namely event data and ASDs).

Density recovery

When sampling with GNPE, there is no direct access to the probability density \(q(\theta|d)\). This is because of the Gibbs iterations: one only has access to the probability density of the entire chain, not just the final samples. The probability density is, however, needed for importance sampling, since this is the proposal distribution.

The Result class contains methods to enable recovery of the probability density for a collection of samples. The approach is as follows:

Start from the samples \(\{(\theta_i, \hat\theta_i)\}_{i=1}^N\) from the final Gibbs iteration, including parameters \(\theta\) and proxy parameters \(\hat\theta\). By default these are included in the samples attribute generated by the Sampler.
Train an unconditional density estimator \(q(\hat\theta)\) to model the proxy parameters. This is done by (1) using parameter_subset() to produce a new Result containing just the proxies, and (2) using train_unconditional_flow() on this subset.
Generate new samples \((\theta, \hat\theta) \sim q(\theta, \hat\theta | d) = q(\theta | d, \hat\theta) q(\hat\theta)\). This can be accomplished using GNPESampler.sample() with num_iterations = 1 and setting the initial sampler to be the unconditional flow trained in the previous step. Since this does not involve multiple iterations, the density is obtained as well, so importance sampling can be performed.

Note

Density recovery can also be achieved using an unconditional density estimator for \(\theta\) (trained on samples \(\{\theta_i\}_{i=1}^N\) from GNPE). Since \(\theta\) typically comprises 14 parameters (versus 2 or 3 for \(\hat\theta\)) it is usually more straightforward to learn the proxies.

Synthetic phase

It is often challenging for Dingo to learn to model the phase parameter \(\phi_c\). For this reason, we usually marginalize over it in training by excluding it from the list of inference_parameters. The phase is, however, required for importance sampling unless using also a phase-marginalized likelihood (which is approximate except under special circumstances).

The Dingo gw.Result class includes a method sample_synthetic_phase() which produces a \(\phi_c\) sample from a \(\phi_c\)-marginalized sample. It does so by evaluating the likelihood on a \(\phi_c\)-grid and then sampling from the associated 1D distribution. The log_prob value for the sample is also corrected to reflect the sampled \(\phi_c\). Speed is ensured by caching waveform modes and evaluating the polarizations for different \(\phi_c\). For further details, see the Supplemental Material of [5].

This method should be run after recovering the density, since in particular it applies a correction to the density.

Configuration

The method sample_synthetic_phase() takes a kwargs argument. An example configuration is

approximation_22_mode: false
n_grid: 5001
uniform_weight: 0.01
num_processes: 100

approximation_22_mode: Whether to make the approximation that only the \((l, m) = (2, 2)\) mode is present, i.e., waveforms transform as \(\exp{2\pi i \phi_c}\). This simplifies computations since it does not require caching of waveform modes.
n_grid: Specifies the phase grid on which the likelihoods are evaluated.
uniform_weight: Base probability level to add to ensure mass coverage.
num_processes: For parallelization of synthetic phase sampling. This is usually the most expensive part of importance sampling, so it is advantageous to perform calculations in parallel.

Importance sampling

Once samples are in the right form—including all relevant parameters and the log probability—importance sampling is carried out using the importance_sample() method. It allows to specify options for using a marginalized likelihood. (Time and phase marginalization are separately supported; see the documentation of dingo.gw.likelihood.StationaryGaussianGWLikelihood.)

As with the synthetic phase, importance sampling allows for parallelization.

Plotting

The plotting methods included here are intended for quick plots for evaluating results. They include

corner plots comparing importance sampled and proposal results;
weights plots to evaluate performance of importance sampling; and
log probability plots comparing target and proposal log probability.