# Distance functions

Distance functions measure the closeness of observed and sampled data. For custom distance functions, either pass a plain function to ABCSMC or, if finer-grained configuration is required, subclass the DistanceFunction class.
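As a minimal sketch of the first option, a plain distance function takes two dictionaries of summary statistics and returns a number. The function name and the summary statistic label "mean" are hypothetical and chosen only for illustration.

```python
def absolute_difference(x: dict, y: dict) -> float:
    """Distance between simulated (x) and observed (y) summary statistics."""
    # "mean" is a hypothetical summary statistic label used for illustration.
    return abs(x["mean"] - y["mean"])


simulated = {"mean": 2.5}
observed = {"mean": 2.0}
print(absolute_difference(simulated, observed))  # 0.5
```

A function of this shape could be handed to ABCSMC in place of a DistanceFunction instance.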

class pyabc.distance_functions.AcceptAllDistance(require_initialize: bool = True)

A mock distance function that always returns -1, so any sample should be accepted for any sane epsilon object.

Can be used for testing.

__call__(t: int, x: dict, y: dict) → float

Evaluate, at time point t, the distance of the tentatively sampled particle to the measured data.

Abstract method. This method has to be overwritten by all concrete implementations.

Parameters: t (int) – Time point at which to evaluate the distance. x (dict) – Summary statistics of the tentatively sampled parameter. y (dict) – Summary statistics of the measured data.
Returns: distance (float) – Distance of the tentatively sampled particle from the measured data.
class pyabc.distance_functions.AdaptivePNormDistance(p: float = 2, adaptive: bool = True, scale_type: int = 1)

In the p-norm distance, adapt the weights for each generation, based on the previous simulations.

Parameters: p (float) – p for p-norm. Required p >= 1, p = np.inf allowed (infinity-norm). adaptive (bool) – True: Adapt distance after each iteration. False: Adapt distance only once at the beginning in initialize(). This corresponds to a pre-calibration. scale_type (int) – What measure to use for deviation. Currently supports SCALE_TYPE_MAD for the median absolute deviation (might be more tolerant to outliers), and SCALE_TYPE_SD for the standard deviation.
configure_sampler(sampler: pyabc.sampler.base.Sampler)

Make the sampler also return rejected summary statistics if required, because these are needed to get a better estimate of the variability of the summary statistics.

Parameters: sampler (Sampler) – The sampler employed.
initialize(t: int, sample_from_prior: List[dict])

Initialize weights.

update(t: int, all_sum_stats: List[dict])

Update weights based on all simulations.
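The re-weighting idea can be sketched as follows. This sketch assumes that each summary statistic is weighted by the inverse of its scale (here the median absolute deviation) over all simulations; that rule and the helper names are assumptions for illustration, not necessarily what AdaptivePNormDistance does internally.

```python
from statistics import median
from typing import Dict, List


def mad(values: List[float]) -> float:
    """Median absolute deviation of a list of numbers."""
    center = median(values)
    return median([abs(v - center) for v in values])


def inverse_scale_weights(all_sum_stats: List[Dict[str, float]]) -> Dict[str, float]:
    """Assumed rule: weight each statistic by 1 / scale over all simulations."""
    weights = {}
    for key in all_sum_stats[0]:
        scale = mad([stats[key] for stats in all_sum_stats])
        # Guard against a zero scale (constant statistic): fall back to weight 1.
        weights[key] = 1.0 / scale if scale > 0 else 1.0
    return weights


simulations = [
    {"small": 1.0, "large": 100.0},
    {"small": 2.0, "large": 200.0},
    {"small": 3.0, "large": 300.0},
]
weights = inverse_scale_weights(simulations)
print(weights)  # the "large" statistic receives a much smaller weight
```

With such weights, statistics that vary on very different scales contribute comparably to the p-norm.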

class pyabc.distance_functions.DistanceFunction(require_initialize: bool = True)

Bases: abc.ABC

Abstract base class for distance functions.

Any other distance function should inherit from this class.

__call__(t: int, x: dict, x_0: dict) → float

Evaluate, at time point t, the distance of the tentatively sampled particle to the measured data.

Abstract method. This method has to be overwritten by all concrete implementations.

Parameters: t (int) – Time point at which to evaluate the distance. x (dict) – Summary statistics of the tentatively sampled parameter. x_0 (dict) – Summary statistics of the measured data.
Returns: distance (float) – Distance of the tentatively sampled particle from the measured data.
configure_sampler(sampler: pyabc.sampler.base.Sampler)

This is called by the ABCSMC class and gives the distance function the opportunity to configure the sampler. For example, the distance function might request the sampler to also return rejected particles and their summary statistics in order to adapt the distance functions to the statistics of the sample.

The default is to do nothing.

Parameters: sampler (Sampler) – The Sampler used in ABCSMC.
get_config() → dict

Return configuration of the distance function.

Returns: config (dict) – Dictionary describing the distance function.
initialize(t: int, sample_from_prior: List[dict])

This method is called by the ABCSMC framework before the first use of the distance function (in new and load) and can be used to calibrate it to the statistics of the samples.

The default implementation is to do nothing.

This function is only called if require_initialize == True.

Parameters: t (int) – Time point for which to initialize the distance function. sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
to_json() → str

Return JSON encoded configuration of the distance function.

Returns: json_str (str) – JSON-encoded string describing the distance function. The default implementation tries to convert the dictionary returned by get_config.
update(t: int, all_sum_stats: List[dict]) → bool

Update the distance function. Default: Do nothing.

Parameters: t (int) – Time point for which to update/create the distance measure. all_sum_stats (List[dict]) – List of all summary statistics that should be used to update the distance (in particular also rejected ones).
Returns: is_updated (bool) – True if the distance function has changed compared to before; False if it has not changed (default).
class pyabc.distance_functions.DistanceFunctionWithMeasureList(measures_to_use='all')

Base class for distance functions with measure list. This class is not functional on its own.

Parameters: measures_to_use (Union[str, List[str]]) – If set to “all”, all measures are used. This is the default. If a list is provided, the measures in the list are used. Here, “measures” refers to the summary statistics.
get_config()

Return configuration of the distance function.

Returns: config (dict) – Dictionary describing the distance function.
initialize(t: int, sample_from_prior)

This method is called by the ABCSMC framework before the first use of the distance function (in new and load) and can be used to calibrate it to the statistics of the samples.

The default implementation is to do nothing.

This function is only called if require_initialize == True.

Parameters: t (int) – Time point for which to initialize the distance function. sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
measures_to_use = None

The measures (summary statistics) to use for distance calculation.

class pyabc.distance_functions.IdentityFakeDistance(require_initialize: bool = True)

A fake distance function, which just passes the summary statistics on. This class assumes that the model already returns the distance. This can be useful in cases where simulating can be stopped early, when during the simulation some condition is reached which makes it impossible to accept the particle.

__call__(t: int, x: dict, y: dict)

Evaluate, at time point t, the distance of the tentatively sampled particle to the measured data.

Abstract method. This method has to be overwritten by all concrete implementations.

Parameters: t (int) – Time point at which to evaluate the distance. x (dict) – Summary statistics of the tentatively sampled parameter. y (dict) – Summary statistics of the measured data.
Returns: distance (float) – Distance of the tentatively sampled particle from the measured data.
class pyabc.distance_functions.MinMaxDistanceFunction(measures_to_use='all')

Calculate upper and lower margins as the max and min of the parameters. This works surprisingly well for normalization in simple cases.

static lower(parameter_list)

Calculate the lower margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: lower_margin (float) – The lower margin of the range calculated from these parameters.
static upper(parameter_list)

Calculate the upper margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: upper_margin (float) – The upper margin of the range calculated from these parameters.
class pyabc.distance_functions.NoDistance

Implements a kind of null object as distance function.

This can be used as a dummy distance function if e.g. integrated modeling is used.

Note

This distance function cannot be evaluated, so currently it is in particular not possible to use an epsilon threshold which requires initialization (i.e. eps.require_initialize==True is not possible).

__call__(t: int, x: dict, x_0: dict) → float

Evaluate, at time point t, the distance of the tentatively sampled particle to the measured data.

Abstract method. This method has to be overwritten by all concrete implementations.

Parameters: t (int) – Time point at which to evaluate the distance. x (dict) – Summary statistics of the tentatively sampled parameter. x_0 (dict) – Summary statistics of the measured data.
Returns: distance (float) – Distance of the tentatively sampled particle from the measured data.
class pyabc.distance_functions.PCADistanceFunction(measures_to_use='all')

Calculate distance in whitened coordinates.

A whitening transformation $$W$$ is calculated from an initial sample. The distance is measured as the Euclidean distance in the transformed space, i.e.

$d(x,y) = \| Wx - Wy \|$
__call__(t: int, x: dict, y: dict) → float

Evaluate, at time point t, the distance of the tentatively sampled particle to the measured data.

Abstract method. This method has to be overwritten by all concrete implementations.

Parameters: t (int) – Time point at which to evaluate the distance. x (dict) – Summary statistics of the tentatively sampled parameter. y (dict) – Summary statistics of the measured data.
Returns: distance (float) – Distance of the tentatively sampled particle from the measured data.
initialize(t: int, sample_from_prior)

This method is called by the ABCSMC framework before the first use of the distance function (in new and load) and can be used to calibrate it to the statistics of the samples.

The default implementation is to do nothing.

This function is only called if require_initialize == True.

Parameters: t (int) – Time point for which to initialize the distance function. sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
class pyabc.distance_functions.PNormDistance(p: float = 2, w: dict = None)

Use weighted p-norm

$d(x, y) = \left[\sum_{i} \left| w_i ( x_i-y_i ) \right|^{p} \right]^{1/p}$

to compute distances between sets of summary statistics. E.g. set p=2 to get a Euclidean distance.

Parameters: p (float) – p for p-norm. Required p >= 1, p = np.inf allowed (infinity-norm). w (dict) – Weights. Dictionary indexed by time points. Each entry contains a dictionary of numeric weights, indexed by summary statistics labels. If none is passed, a weight of 1 is considered for every summary statistic. If no entry is available in w for a given time point, the maximum available time point is selected.
__call__(t: int, x: dict, y: dict) → float

Evaluate, at time point t, the distance of the tentatively sampled particle to the measured data.

Abstract method. This method has to be overwritten by all concrete implementations.

Parameters: t (int) – Time point at which to evaluate the distance. x (dict) – Summary statistics of the tentatively sampled parameter. y (dict) – Summary statistics of the measured data.
Returns: distance (float) – Distance of the tentatively sampled particle from the measured data.
get_config() → dict

Return configuration of the distance function.

Returns: config (dict) – Dictionary describing the distance function.
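The weighted p-norm and the weight lookup by time point described for PNormDistance can be sketched in plain Python. The function name and argument layout are hypothetical, and the weight is assumed to multiply the difference inside the absolute value; this illustrates the formula and is not the library's code.

```python
import math
from typing import Dict, Optional


def weighted_p_norm(
    x: Dict[str, float],
    y: Dict[str, float],
    p: float = 2.0,
    w: Optional[Dict[int, Dict[str, float]]] = None,
    t: int = 0,
) -> float:
    """Weighted p-norm distance between two sets of summary statistics."""
    if w is None:
        # No weights passed: every summary statistic gets weight 1.
        weights = {key: 1.0 for key in x}
    else:
        # Fall back to the largest available time point if t has no entry.
        weights = w[t] if t in w else w[max(w)]
    terms = [abs(weights[key] * (x[key] - y[key])) for key in x]
    if math.isinf(p):
        # Infinity norm: the largest weighted absolute difference.
        return max(terms)
    return sum(term ** p for term in terms) ** (1.0 / p)


x = {"a": 3.0, "b": 0.0}
y = {"a": 0.0, "b": 4.0}
print(weighted_p_norm(x, y, p=1))         # 7.0
print(weighted_p_norm(x, y, p=math.inf))  # 4.0
```

Calling it with p=2 on the same inputs gives the Euclidean distance between the two dictionaries.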
class pyabc.distance_functions.PercentileDistanceFunction(measures_to_use='all')

Calculate the normalization from percentiles, using the 20% and 80% percentiles as lower and upper margins.

PERCENTILE = 20

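The percentile, in percent, that defines the margins. With the value 20, the margins are the 20% and 80% percentiles.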

get_config()

Return configuration of the distance function.

Returns: config (dict) – Dictionary describing the distance function.
static lower(parameter_list)

Calculate the lower margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: lower_margin (float) – The lower margin of the range calculated from these parameters.
static upper(parameter_list)

Calculate the upper margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: upper_margin (float) – The upper margin of the range calculated from these parameters.
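The percentile-based margins can be illustrated with the standard library. The sketch assumes that the upper margin is taken at 100 - PERCENTILE and that percentiles are linearly interpolated between data points; the helper `percentile` is hypothetical.

```python
from statistics import quantiles
from typing import List

PERCENTILE = 20  # lower percentile; the upper one is assumed to be 100 - PERCENTILE


def percentile(values: List[float], percent: int) -> float:
    """Percentile with linear interpolation between data points."""
    # quantiles(n=100) returns the 99 cut points for 1%, 2%, ..., 99%.
    return quantiles(values, n=100, method="inclusive")[percent - 1]


def lower(parameter_list: List[float]) -> float:
    """Lower margin: the PERCENTILE-th percentile."""
    return percentile(parameter_list, PERCENTILE)


def upper(parameter_list: List[float]) -> float:
    """Upper margin: the (100 - PERCENTILE)-th percentile."""
    return percentile(parameter_list, 100 - PERCENTILE)


values = [float(v) for v in range(11)]  # 0.0, 1.0, ..., 10.0
print(lower(values), upper(values))  # 2.0 8.0
```

Compared with min and max, margins of this kind ignore the most extreme fifth of the values on each side.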
class pyabc.distance_functions.RangeEstimatorDistanceFunction(measures_to_use='all')

Abstract base class for distance functions whose estimate is based on a range.

It defines the two template methods lower and upper.

Hence

$d(x, y) = \sum_{i \in \text{measures}} \left | \frac{x_i - y_i}{u_i - l_i} \right |$

where $$l_i$$ and $$u_i$$ are the lower and upper margin for measure $$i$$.
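As a sketch of this formula, the margins can first be computed per measure from a sample and then used to normalize each difference. The helper names are hypothetical; min and max serve as the default lower and upper template methods, which corresponds to the min/max variant.

```python
from typing import Callable, Dict, List, Tuple

Margins = Dict[str, Tuple[float, float]]


def fit_margins(
    sample: List[Dict[str, float]],
    lower: Callable[[List[float]], float] = min,
    upper: Callable[[List[float]], float] = max,
) -> Margins:
    """Compute (lower, upper) margins for every measure in the sample."""
    margins = {}
    for key in sample[0]:
        values = [stats[key] for stats in sample]
        margins[key] = (lower(values), upper(values))
    return margins


def range_distance(x: Dict[str, float], y: Dict[str, float], margins: Margins) -> float:
    """Sum of absolute differences, each scaled by the measure's range."""
    # Note: a measure with upper == lower would cause a division by zero here.
    return sum(abs((x[key] - y[key]) / (u - l)) for key, (l, u) in margins.items())


sample = [{"a": 0.0, "b": 10.0}, {"a": 4.0, "b": 30.0}]
margins = fit_margins(sample)  # {"a": (0.0, 4.0), "b": (10.0, 30.0)}
x = {"a": 1.0, "b": 20.0}
y = {"a": 3.0, "b": 10.0}
print(range_distance(x, y, margins))  # 1.0
```

Passing other callables for `lower` and `upper` changes how the range is estimated without touching the distance itself.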

__call__(t: int, x: dict, y: dict) → float

Evaluate, at time point t, the distance of the tentatively sampled particle to the measured data.

Abstract method. This method has to be overwritten by all concrete implementations.

Parameters: t (int) – Time point at which to evaluate the distance. x (dict) – Summary statistics of the tentatively sampled parameter. y (dict) – Summary statistics of the measured data.
Returns: distance (float) – Distance of the tentatively sampled particle from the measured data.
get_config()

Return configuration of the distance function.

Returns: config (dict) – Dictionary describing the distance function.
initialize(t: int, sample_from_prior)

This method is called by the ABCSMC framework before the first use of the distance function (in new and load) and can be used to calibrate it to the statistics of the samples.

The default implementation is to do nothing.

This function is only called if require_initialize == True.

Parameters: t (int) – Time point for which to initialize the distance function. sample_from_prior (List[dict]) – List of dictionaries containing the summary statistics.
static lower(parameter_list: List[float])

Calculate the lower margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: lower_margin (float) – The lower margin of the range calculated from these parameters.
static upper(parameter_list: List[float])

Calculate the upper margin from a list of parameter values.

Parameters: parameter_list (List[float]) – List of values of a parameter.
Returns: upper_margin (float) – The upper margin of the range calculated from these parameters.
class pyabc.distance_functions.SimpleFunctionDistance(function)

This is a wrapper around a simple function which calculates the distance. If a function is passed to the ABCSMC class, then it is converted to an instance of the SimpleFunctionDistance class.

Parameters: function (Callable) – A Callable accepting two parameters, namely summary statistics x and y.
__call__(t: int, x: dict, y: dict) → float

Evaluate, at time point t, the distance of the tentatively sampled particle to the measured data.

Abstract method. This method has to be overwritten by all concrete implementations.

Parameters: t (int) – Time point at which to evaluate the distance. x (dict) – Summary statistics of the tentatively sampled parameter. y (dict) – Summary statistics of the measured data.
Returns: distance (float) – Distance of the tentatively sampled particle from the measured data.
get_config()

Return configuration of the distance function.

Returns: config (dict) – Dictionary describing the distance function.
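The wrapping idea behind SimpleFunctionDistance can be sketched with a small stand-in class. `FunctionWrapper` is a hypothetical name, and the sketch assumes the time point is simply ignored when delegating to the wrapped two-argument function.

```python
from typing import Callable


class FunctionWrapper:
    """Hypothetical stand-in that adapts f(x, y) to the (t, x, y) call signature."""

    def __init__(self, function: Callable[[dict, dict], float]):
        self.function = function

    def __call__(self, t: int, x: dict, y: dict) -> float:
        # The time point t is accepted for interface compatibility but not used.
        return self.function(x, y)


def gap(x: dict, y: dict) -> float:
    # "s" is a hypothetical summary statistic label.
    return abs(x["s"] - y["s"])


wrapped = FunctionWrapper(gap)
print(wrapped(0, {"s": 1.0}, {"s": 4.0}))  # 3.0
```

This lets a time-independent function be used wherever a callable taking t, x and y is expected.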
class pyabc.distance_functions.ZScoreDistanceFunction(measures_to_use='all')

Calculate the distance as the sum of z-scores over the selected measures. The measured data are the reference for the z-score.

Hence

$d(x, y) = \sum_{i \in \text{measures}} \left| \frac{x_i-y_i}{y_i} \right|$
__call__(t: int, x: dict, y: dict) → float

Evaluate, at time point t, the distance of the tentatively sampled particle to the measured data.

Abstract method. This method has to be overwritten by all concrete implementations.

Parameters: t (int) – Time point at which to evaluate the distance. x (dict) – Summary statistics of the tentatively sampled parameter. y (dict) – Summary statistics of the measured data.
Returns: distance (float) – Distance of the tentatively sampled particle from the measured data.
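The z-score formula of ZScoreDistanceFunction translates directly into a few lines of Python. The function name and the optional `measures` argument are hypothetical; how the library treats a measured value of zero is not covered by this sketch.

```python
from typing import Dict, List, Optional


def z_score_distance(
    x: Dict[str, float],
    y: Dict[str, float],
    measures: Optional[List[str]] = None,
) -> float:
    """Sum of relative deviations of x from the measured data y."""
    keys = list(y) if measures is None else measures
    # Note: a measured value y[key] == 0 would cause a division by zero here.
    return sum(abs((x[key] - y[key]) / y[key]) for key in keys)


simulated = {"a": 3.0, "b": 5.0}
measured = {"a": 2.0, "b": 4.0}
print(z_score_distance(simulated, measured))         # 0.75
print(z_score_distance(simulated, measured, ["a"]))  # 0.5
```

Because each term is divided by the measured value, the result does not depend on the units of the individual measures.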
pyabc.distance_functions.median_absolute_deviation(data: List)

Calculate the sample median absolute deviation (MAD), defined as median(abs(data - median(data))).

Parameters: data (List) – List of data points.
Returns: mad (float) – The median absolute deviation of the data.
pyabc.distance_functions.standard_deviation(data: List)

Calculate the sample standard deviation (SD).

Parameters: data (List) – List of data points.
Returns: sd (float) – The standard deviation of the data points.
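To illustrate how the two scale measures differ, the following sketch re-implements the MAD with the standard library and compares it with the sample standard deviation on data containing one outlier. The local function is a stand-in written for this example, not the library function itself.

```python
from statistics import median, stdev
from typing import List


def mad_sketch(data: List[float]) -> float:
    """median(abs(data - median(data))) for a plain list of numbers."""
    center = median(data)
    return median([abs(value - center) for value in data])


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # the last point is an outlier
print(mad_sketch(data))  # 1.0
print(stdev(data))       # much larger, dominated by the outlier
```

The single outlier barely affects the MAD, while it dominates the standard deviation.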
pyabc.distance_functions.to_distance(maybe_distance_function)
Parameters: maybe_distance_function (Callable or DistanceFunction) – Either a Callable, which takes two arguments, or a DistanceFunction instance.