Distance functions¶
Distance functions measure closeness of observed and sampled data. This module implements various commonly used distance functions for ABC, featuring a few advanced concepts.
For custom distance functions, either pass a plain function to ABCSMC or subclass the pyabc.Distance class.
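As a minimal sketch of the first option, a plain function only has to map two summary-statistic dictionaries to a float. The function name and the keys "y" and "z" are hypothetical and chosen purely for illustration.

```python
def abs_sum_distance(x: dict, x_0: dict) -> float:
    """Sum of absolute differences over the keys of the observed data."""
    return float(sum(abs(x[key] - x_0[key]) for key in x_0))


# Hypothetical summary statistics of one simulation and of the observation.
simulated = {"y": 2.5, "z": 1.0}
observed = {"y": 2.0, "z": 3.0}

print(abs_sum_distance(simulated, observed))  # 0.5 + 2.0 = 2.5
```

A function of this shape is what would be handed to ABCSMC in place of a Distance object.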

class
pyabc.distance.
AcceptAllDistance
[source]¶ Bases:
pyabc.distance.base.Distance
Just a mock distance function which always returns -1. So any sample should be accepted for any sane epsilon object.
Can be used for testing.

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float


class
pyabc.distance.
AdaptiveAggregatedDistance
(distances: List[pyabc.distance.base.Distance], initial_weights: List = None, factors: Union[List, dict] = None, adaptive: bool = True, scale_function: Callable = None, log_file: str = None)[source]¶ Bases:
pyabc.distance.distance.AggregatedDistance
Adapt the weights of AggregatedDistances automatically over time.
 Parameters
distances – As in AggregatedDistance.
initial_weights – Weights to be used in the initial iteration. List with a weight for each distance function.
factors – As in AggregatedDistance.
adaptive – True: Adapt weights after each iteration. False: Adapt weights only once at the beginning in initialize(). This corresponds to a precalibration.
scale_function – Function that takes a list of floats, namely the values obtained by applying one of the distances passed to a set of samples, and returns a single float, namely the weight to apply to this distance function. Default: scale_span.
log_file – A log file to store weights for each time point in. Weights are currently not stored in the database. The data are saved in json format and can be retrieved via pyabc.storage.load_dict_from_json.

__init__
(distances: List[pyabc.distance.base.Distance], initial_weights: List = None, factors: Union[List, dict] = None, adaptive: bool = True, scale_function: Callable = None, log_file: str = None)[source]¶  Parameters
distances (List) – The distance functions to apply.
weights (Union[List, dict], optional (default = [1,..])) – The weights to apply to the distances when taking the sum. Can be a list with entries in the same order as the distances, or a dictionary of lists, with the keys being the single time points (if the weights should be iteration-specific).
factors (Union[List, dict], optional (default = [1,..])) – Scaling factors that the weights are multiplied with. The same structure applies as to weights. If None is passed, a factor of 1 is considered for every summary statistic. Note that in this class, factors are superfluous as everything can be achieved with weights alone; however, in subclasses the factors can remain static while weights adapt over time, allowing for greater flexibility.

class
pyabc.distance.
AdaptivePNormDistance
(p: float = 2, initial_weights: dict = None, factors: dict = None, adaptive: bool = True, scale_function: Callable = None, normalize_weights: bool = True, max_weight_ratio: float = None, log_file: str = None)[source]¶ Bases:
pyabc.distance.distance.PNormDistance
In the p-norm distance, adapt the weights for each generation, based on the previous simulations. This class is motivated by [1].
 Parameters
p – p for p-norm. Required p >= 1, p = np.inf allowed (infinity-norm). Default: p=2.
initial_weights – Weights to be used in the initial iteration. Dictionary with observables as keys and weights as values.
factors – As in PNormDistance.
adaptive – True: Adapt distance after each iteration. False: Adapt distance only once at the beginning in initialize(). This corresponds to a precalibration.
scale_function – (data: list, x_0: float) -> scale: float. Computes the scale (i.e. inverse weight s = 1 / w) for a given summary statistic. Here, data denotes the list of simulated summary statistics, and x_0 the observed summary statistic. Implemented are absolute_median_deviation, standard_deviation (default), centered_absolute_median_deviation, centered_standard_deviation.
normalize_weights – Whether to normalize the weights to have mean 1. This may smooth the decrease of epsilon and aid numeric stability, but is not strictly necessary.
max_weight_ratio – If not None, large weights will be bounded by the ratio times the smallest nonzero absolute weight. This is usually not necessary in practice, but it is theoretically required to ensure convergence.
log_file – A log file to store weights for each time point in. Weights are currently not stored in the database. The data are saved in json format and can be retrieved via pyabc.storage.load_dict_from_json.
[1] Prangle, Dennis. “Adapting the ABC Distance Function”. Bayesian Analysis, 2017. doi:10.1214/16-BA1002.
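The relation between scale and weight described above (s = 1 / w, optionally normalized to mean 1) can be sketched as follows. This is an illustration only, not the library's implementation: the helper names, the use of the population standard deviation, and the handling of a zero scale are all assumptions.

```python
import statistics


def standard_deviation_scale(data: list, x_0: float) -> float:
    """Scale of one summary statistic: spread of the simulated values.

    x_0 is unused here; centered variants would take it into account.
    """
    return statistics.pstdev(data)  # assumption: population std


def adaptive_weights(sum_stats: list, x_0: dict, normalize: bool = True) -> dict:
    """Compute w = 1 / s for each key, optionally normalized to mean 1."""
    weights = {}
    for key in x_0:
        scale = standard_deviation_scale([s[key] for s in sum_stats], x_0[key])
        # Assumption: a statistic that never varies gets zero weight.
        weights[key] = 0.0 if scale == 0 else 1.0 / scale
    if normalize:
        mean_w = sum(weights.values()) / len(weights)
        if mean_w > 0:
            weights = {k: w / mean_w for k, w in weights.items()}
    return weights


# Hypothetical simulations: "b" varies ten times more than "a".
sims = [{"a": 0.0, "b": 0.0}, {"a": 2.0, "b": 20.0}]
w = adaptive_weights(sims, {"a": 1.0, "b": 10.0})
print(w)
```

Statistics with a large spread receive a small weight, so that no single summary statistic dominates the distance merely because of its magnitude.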

__init__
(p: float = 2, initial_weights: dict = None, factors: dict = None, adaptive: bool = True, scale_function: Callable = None, normalize_weights: bool = True, max_weight_ratio: float = None, log_file: str = None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

configure_sampler
(sampler: pyabc.sampler.base.Sampler)[source]¶ Make the sampler return also rejected particles, because these are needed to get a better estimate of the summary statistic variabilities, avoiding a bias to accepted ones only.
 Parameters
sampler (Sampler) – The sampler employed.

get_config
() → dict[source]¶ Return configuration of the distance.
 Returns
config – Dictionary describing the distance.
 Return type
dict

class
pyabc.distance.
AggregatedDistance
(distances: List[pyabc.distance.base.Distance], weights: Union[List, dict] = None, factors: Union[List, dict] = None)[source]¶ Bases:
pyabc.distance.base.Distance
Aggregates a list of distance functions, all of which may work on subparts of the summary statistics. Then computes and returns the weighted sum of the distance values generated by the various distance functions.
All class functions are propagated to the children and the obtained results aggregated appropriately.

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Applies all distance functions and computes the weighted sum of all obtained values.

__init__
(distances: List[pyabc.distance.base.Distance], weights: Union[List, dict] = None, factors: Union[List, dict] = None)[source]¶  Parameters
distances (List) – The distance functions to apply.
weights (Union[List, dict], optional (default = [1,..])) – The weights to apply to the distances when taking the sum. Can be a list with entries in the same order as the distances, or a dictionary of lists, with the keys being the single time points (if the weights should be iteration-specific).
factors (Union[List, dict], optional (default = [1,..])) – Scaling factors that the weights are multiplied with. The same structure applies as to weights. If None is passed, a factor of 1 is considered for every summary statistic. Note that in this class, factors are superfluous as everything can be achieved with weights alone; however, in subclasses the factors can remain static while weights adapt over time, allowing for greater flexibility.
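The weighted sum described for this class can be sketched in a few lines. The sketch assumes that weights and factors are given as plain lists (the per-time-point dictionary form is not covered), and the function and key names are hypothetical.

```python
from typing import Callable, List, Optional


def aggregated_distance(
    distances: List[Callable[[dict, dict], float]],
    x: dict,
    x_0: dict,
    weights: Optional[List[float]] = None,
    factors: Optional[List[float]] = None,
) -> float:
    """Weighted sum of several distance values; weights and factors default to 1."""
    n = len(distances)
    weights = [1.0] * n if weights is None else weights
    factors = [1.0] * n if factors is None else factors
    return sum(w * f * d(x, x_0) for d, w, f in zip(distances, weights, factors))


# Two hypothetical sub-distances, each looking at one part of the statistics.
def dist_a(x, x_0):
    return abs(x["a"] - x_0["a"])


def dist_b(x, x_0):
    return abs(x["b"] - x_0["b"])


x, x_0 = {"a": 1.0, "b": 5.0}, {"a": 0.0, "b": 1.0}
total = aggregated_distance([dist_a, dist_b], x, x_0, weights=[2.0, 0.5])
print(total)  # 2.0 * 1.0 + 0.5 * 4.0 = 4.0
```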

configure_sampler
(sampler: pyabc.sampler.base.Sampler)[source]¶ Note: configure_sampler is applied by all distances sequentially, so care must be taken that they perform no contradictory operations on the sampler.

static
format_dict
(w, t, n_distances, default_val=1.0)[source]¶ Normalize weight or factor dictionary to the employed format.

get_config
() → dict[source]¶ Return configuration of the distance.
 Returns
config – Dictionary describing the distance.
 Return type
dict

initialize
(t: int, get_all_sum_stats: Callable[[], List[dict]], x_0: dict = None)[source]¶ This method is called by the ABCSMC framework before the first use of the distance (at the beginning of ABCSMC.run()), and can be used to calibrate it to the statistics of the samples.
The default is to do nothing.
 Parameters
t (int) – Time point for which to initialize the distance.
get_all_sum_stats (Callable[[], List[dict]]) – Returns on command the initial summary statistics.
x_0 (dict, optional) – The observed summary statistics.

update
(t: int, get_all_sum_stats: Callable[[], List[dict]]) → bool[source]¶ The sum_stats are passed on to all distance functions, each of which may then update using these. If any update occurred, a value of True is returned indicating that e.g. the distance may need to be recalculated since the underlying distances changed.


class
pyabc.distance.
BinomialKernel
(p: Union[float, Callable], ret_scale: str = 'SCALE_LOG', keys: List[str] = None, pdf_max: float = None)[source]¶ Bases:
pyabc.distance.kernel.StochasticKernel
A kernel with a binomial probability mass function.
 Parameters
p (Union[float, Callable]) – The success probability.
ret_scale, keys, pdf_max – See the base class pyabc.distance.kernel.StochasticKernel.

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

class
pyabc.distance.
Distance
[source]¶ Bases:
abc.ABC
Abstract base class for distance objects.
Any object that computes the similarity between observed and simulated data should inherit from this class.

abstract
__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

configure_sampler
(sampler: pyabc.sampler.base.Sampler)[source]¶ This is called by the ABCSMC class and gives the distance the opportunity to configure the sampler. For example, the distance might request the sampler to also return rejected particles in order to adapt the distance to the statistics of the sample. The method is called by the ABCSMC framework before the first use of the distance (at the beginning of ABCSMC.run()), after initialize().
The default is to do nothing.
 Parameters
sampler (Sampler) – The sampler used in ABCSMC.

get_config
() → dict[source]¶ Return configuration of the distance.
 Returns
config – Dictionary describing the distance.
 Return type
dict

initialize
(t: int, get_all_sum_stats: Callable[[], List[dict]], x_0: dict = None)[source]¶ This method is called by the ABCSMC framework before the first use of the distance (at the beginning of ABCSMC.run()), and can be used to calibrate it to the statistics of the samples.
The default is to do nothing.
 Parameters
t (int) – Time point for which to initialize the distance.
get_all_sum_stats (Callable[[], List[dict]]) – Returns on command the initial summary statistics.
x_0 (dict, optional) – The observed summary statistics.

to_json
() → str[source]¶ Return JSON encoded configuration of the distance.
 Returns
json_str – JSON encoded string describing the distance. The default implementation is to try to convert the dictionary returned by get_config.
 Return type
str

update
(t: int, get_all_sum_stats: Callable[[], List[dict]]) → bool[source]¶ Update the distance for the upcoming generation t.
The default is to do nothing.
 Parameters
t (int) – Time point for which to update the distance.
get_all_sum_stats (Callable[[], List[dict]]) – Returns on demand a list of all summary statistics from the finished generation that should be used to update the distance.
 Returns
is_updated – Whether the distance has changed compared to beforehand. Depending on the result, the population needs to be updated in ABCSMC before preparing the next generation. Defaults to False.
 Return type
bool
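The interplay of initialize(), update() and __call__ can be illustrated with a small stand-in class. It deliberately does not import pyabc; in real use the class would subclass pyabc.Distance. The class name, the key "y" and the span-based calibration are hypothetical.

```python
class RescalingDistance:
    """Stand-in mirroring the documented hooks of a distance object."""

    def __init__(self):
        self.scale = 1.0

    def _fit(self, sum_stats: list) -> bool:
        """Recompute the scale from a sample; report whether it changed."""
        values = [s["y"] for s in sum_stats]
        span = max(values) - min(values)
        new_scale = span if span > 0 else 1.0
        changed = new_scale != self.scale
        self.scale = new_scale
        return changed

    def initialize(self, t, get_all_sum_stats, x_0=None):
        # Called once before the first use: calibrate on the initial sample.
        self._fit(get_all_sum_stats())

    def update(self, t, get_all_sum_stats) -> bool:
        # Called before generation t: True signals that the distance changed.
        return self._fit(get_all_sum_stats())

    def __call__(self, x, x_0, t=None, par=None) -> float:
        return abs(x["y"] - x_0["y"]) / self.scale


dist = RescalingDistance()
dist.initialize(0, lambda: [{"y": 0.0}, {"y": 4.0}])
d0 = dist({"y": 3.0}, {"y": 1.0})  # scale 4 -> 0.5
changed = dist.update(1, lambda: [{"y": 1.0}, {"y": 3.0}])
d1 = dist({"y": 3.0}, {"y": 1.0})  # scale 2 -> 1.0
print(d0, changed, d1)
```

Because update() returned True, distances computed before the update are no longer comparable to those computed afterwards, which is why the population may need to be updated.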


class
pyabc.distance.
DistanceWithMeasureList
(measures_to_use='all')[source]¶ Bases:
pyabc.distance.base.Distance
Base class for distance functions with measure list. This class is not functional on its own.
 Parameters
measures_to_use (Union[str, List[str]]) –
If set to “all”, all measures are used. This is the default.
If a list is provided, the measures in the list are used.
measures refers to the summary statistics.

__init__
(measures_to_use='all')[source]¶ Initialize self. See help(type(self)) for accurate signature.

get_config
()[source]¶ Return configuration of the distance.
 Returns
config – Dictionary describing the distance.
 Return type
dict

initialize
(t: int, get_all_sum_stats: Callable[[], List[dict]], x_0: dict = None)[source]¶ This method is called by the ABCSMC framework before the first use of the distance (at the beginning of ABCSMC.run()), and can be used to calibrate it to the statistics of the samples.
The default is to do nothing.
 Parameters
t (int) – Time point for which to initialize the distance.
get_all_sum_stats (Callable[[], List[dict]]) – Returns on command the initial summary statistics.
x_0 (dict, optional) – The observed summary statistics.

class
pyabc.distance.
IdentityFakeDistance
[source]¶ Bases:
pyabc.distance.base.Distance
A fake distance function, which just passes the summary statistics on. This class assumes that the model already returns the distance. This can be useful in cases where simulating can be stopped early, when during the simulation some condition is reached which makes it impossible to accept the particle.

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float


class
pyabc.distance.
IndependentLaplaceKernel
(scale: Union[Callable, List[float], float] = None, keys: List[str] = None, pdf_max: float = None)[source]¶ Bases:
pyabc.distance.kernel.StochasticKernel
This kernel can be used for efficient computations of large-scale independent Laplace distributions, performing computations directly on a log-scale to avoid numeric issues. In each coordinate, a 1-dim Laplace distribution
\[p(x) = \frac{1}{2b}\exp\left(-\frac{1}{b}|x-a|\right)\]is assumed.
 Parameters
scale (Union[array_like, float, Callable], optional (default = ones vector)) – Scale terms b of the distribution. Can also be a Callable taking as arguments the parameters. In that case, pdf_max should also be given if it is supposed to be used. Usually, it will then be given as the density at the observed statistics assuming the minimum allowed variance.
keys, pdf_max – See the base class pyabc.distance.kernel.StochasticKernel.
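To illustrate why working on a log-scale is convenient, the following sketch sums the logarithms of the per-coordinate Laplace densities with scale b instead of multiplying the densities themselves. It assumes a single scalar scale shared by all coordinates; the function name is hypothetical.

```python
import math


def independent_laplace_logpdf(x: dict, x_0: dict, scale: float = 1.0) -> float:
    """Log-density of independent Laplace noise with scale b in each coordinate.

    Per coordinate: log p = -log(2 b) - |x - x_0| / b.
    """
    return sum(
        -math.log(2 * scale) - abs(x[key] - x_0[key]) / scale
        for key in sorted(x_0)
    )


log_p = independent_laplace_logpdf({"a": 1.0}, {"a": 0.0}, scale=2.0)
print(log_p)  # -log(4) - 0.5
```

Summing log terms stays well-behaved for thousands of coordinates, where the product of the densities would underflow to zero.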

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None)[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

class
pyabc.distance.
IndependentNormalKernel
(var: Union[Callable, List[float], float] = None, keys: List[str] = None, pdf_max: float = None)[source]¶ Bases:
pyabc.distance.kernel.StochasticKernel
This kernel can be used for efficient computations of large-scale independent normal distributions, circumventing the covariance matrix, and performing computations directly on a log-scale to avoid numeric issues.
 Parameters
var (Union[array_like, float, Callable], optional (default = ones vector)) – Variances of the distribution (assuming zeros in the off-diagonal of the covariance matrix). Can also be a Callable taking as arguments the parameters. In that case, pdf_max should also be given if it is supposed to be used. Usually, it will then be given as the density at the observed statistics assuming the minimum allowed variance.
keys, pdf_max – See the base class pyabc.distance.kernel.StochasticKernel.
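With a diagonal covariance, the joint normal log-density reduces to a sum of one-dimensional terms, so no covariance matrix is needed. A minimal sketch, assuming a single scalar variance for all coordinates and a hypothetical function name:

```python
import math


def independent_normal_logpdf(x: dict, x_0: dict, var: float = 1.0) -> float:
    """Log-density of independent normal noise with variance var per coordinate.

    Per coordinate: log p = -0.5 * log(2 pi var) - (x - x_0)^2 / (2 var).
    """
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (x[key] - x_0[key]) ** 2 / (2 * var)
        for key in sorted(x_0)
    )


log_p = independent_normal_logpdf({"a": 1.0, "b": 0.0}, {"a": 0.0, "b": 0.0})
print(log_p)  # -log(2 pi) - 0.5
```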

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None)[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

class
pyabc.distance.
MinMaxDistance
(measures_to_use='all')[source]¶ Bases:
pyabc.distance.distance.RangeEstimatorDistance
Calculate upper and lower margins as max and min of the parameters. This works surprisingly well for normalization in simple cases.

class
pyabc.distance.
NegativeBinomialKernel
(p: float, ret_scale: str = 'SCALE_LOG', keys: List[str] = None, pdf_max: float = None)[source]¶ Bases:
pyabc.distance.kernel.StochasticKernel
A kernel with a negative binomial probability mass function.
 Parameters
p (Union[float, Callable]) – The success probability.
ret_scale, keys, pdf_max – See the base class pyabc.distance.kernel.StochasticKernel.

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

class
pyabc.distance.
NoDistance
[source]¶ Bases:
pyabc.distance.base.Distance
Implements a kind of null object as distance function. This can be used as a dummy distance function if e.g. integrated modeling is used.
Note
This distance function cannot be evaluated, so currently it is in particular not possible to use an epsilon threshold which requires initialization, because during initialization the distance function is invoked directly and not via the acceptor as usual. Conceptually, this would be possible and can be implemented on request.

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float


class
pyabc.distance.
NormalKernel
(cov: numpy.ndarray = None, ret_scale: str = 'SCALE_LOG', keys: List[str] = None, pdf_max: float = None)[source]¶ Bases:
pyabc.distance.kernel.StochasticKernel
A kernel with a normal, i.e. Gaussian, probability density. This is just a wrapper around scipy.stats.multivariate_normal.
 Parameters
cov (array_like, optional (default = identity matrix)) – Covariance matrix of the distribution.
ret_scale, keys, pdf_max – See the base class pyabc.distance.kernel.StochasticKernel.
Note
The order of the entries in the mean and cov vectors is assumed to be the same as the one in keys. If keys is None, it is assumed to be the same as the one obtained via sorted(x.keys()) for summary statistics x.

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Return the value of the normal distribution at x - x_0, or its logarithm.

class
pyabc.distance.
PCADistance
(measures_to_use='all')[source]¶ Bases:
pyabc.distance.distance.DistanceWithMeasureList
Calculate distance in whitened coordinates.
A whitening transformation \(W\) is calculated from an initial sample. The distance is measured as Euclidean distance in the transformed space, i.e.
\[d(x,y) = \| Wx - Wy \|\]

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

__init__
(measures_to_use='all')[source]¶ Initialize self. See help(type(self)) for accurate signature.

initialize
(t: int, get_all_sum_stats: Callable[[], List[dict]], x_0: dict = None)[source]¶ This method is called by the ABCSMC framework before the first use of the distance (at the beginning of ABCSMC.run()), and can be used to calibrate it to the statistics of the samples.
The default is to do nothing.
 Parameters
t (int) – Time point for which to initialize the distance.
get_all_sum_stats (Callable[[], List[dict]]) – Returns on command the initial summary statistics.
x_0 (dict, optional) – The observed summary statistics.


class
pyabc.distance.
PNormDistance
(p: float = 2, weights: dict = None, factors: dict = None)[source]¶ Bases:
pyabc.distance.base.Distance
Use a weighted p-norm
\[d(x, y) = \left[\sum_{i} \left| w_i ( x_i - y_i ) \right|^{p} \right]^{1/p}\]to compute distances between sets of summary statistics. E.g. set p=2 to get a Euclidean distance.
 Parameters
p (float, optional (default = 2)) – p for p-norm. Required p >= 1, p = np.inf allowed (infinity-norm).
weights (dict, optional (default = 1)) – Weights. Dictionary indexed by time points. Each entry contains a dictionary of numeric weights, indexed by summary statistics labels. If None is passed, a weight of 1 is considered for every summary statistic. If no entry is available in weights for a given time point, the maximum available time point is selected. It is also possible to pass a single dictionary indexed by summary statistics labels, if weights do not change in time.
factors (dict, optional (default = 1)) – Scaling factors that the weights are multiplied with. The same structure applies as to weights. If None is passed, a factor of 1 is considered for every summary statistic. Note that in this class, factors are superfluous as everything can be achieved with weights alone, however in subclasses the factors can remain static while weights adapt over time, allowing for greater flexibility.
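The weighted p-norm, including the infinity-norm case, can be sketched as below. The sketch assumes a single weight dictionary indexed by summary statistic labels (no time points, no factors); the function name is hypothetical.

```python
import math


def pnorm_distance(x: dict, x_0: dict, p: float = 2.0, weights: dict = None) -> float:
    """Weighted p-norm of x - x_0; missing weights default to 1."""
    weights = weights or {}
    terms = [abs(weights.get(key, 1.0) * (x[key] - x_0[key])) for key in x_0]
    if p == math.inf:
        # The infinity-norm is the largest weighted absolute difference.
        return max(terms)
    return sum(term ** p for term in terms) ** (1 / p)


x, x_0 = {"a": 3.0, "b": 4.0}, {"a": 0.0, "b": 0.0}
print(pnorm_distance(x, x_0, p=1))         # 7.0
print(pnorm_distance(x, x_0, p=2))         # 5.0
print(pnorm_distance(x, x_0, p=math.inf))  # 4.0
```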

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

__init__
(p: float = 2, weights: dict = None, factors: dict = None)[source]¶ Initialize self. See help(type(self)) for accurate signature.

static
format_dict
(w, t, sum_stat_keys, default_val=1.0)[source]¶ Normalize weight or factor dictionary to the employed format.

get_config
() → dict[source]¶ Return configuration of the distance.
 Returns
config – Dictionary describing the distance.
 Return type
dict

initialize
(t: int, get_all_sum_stats: Callable[[], List[dict]], x_0: dict = None)[source]¶ This method is called by the ABCSMC framework before the first use of the distance (at the beginning of ABCSMC.run()), and can be used to calibrate it to the statistics of the samples.
The default is to do nothing.
 Parameters
t (int) – Time point for which to initialize the distance.
get_all_sum_stats (Callable[[], List[dict]]) – Returns on command the initial summary statistics.
x_0 (dict, optional) – The observed summary statistics.

class
pyabc.distance.
PercentileDistance
(measures_to_use='all')[source]¶ Bases:
pyabc.distance.distance.RangeEstimatorDistance
Calculate normalization from the 20% and 80% percentiles as lower and upper margins.

PERCENTILE
= 20¶ The percentiles

get_config
()[source]¶ Return configuration of the distance.
 Returns
config – Dictionary describing the distance.
 Return type
dict


class
pyabc.distance.
PoissonKernel
(ret_scale: str = 'SCALE_LOG', keys: List[str] = None, pdf_max: float = None)[source]¶ Bases:
pyabc.distance.kernel.StochasticKernel
A kernel with a Poisson probability mass function.
 Parameters
ret_scale, keys, pdf_max – See the base class pyabc.distance.kernel.StochasticKernel.
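A rough sketch of what a Poisson kernel on a log-scale computes is given below. It assumes, without confirmation from the text above, that the simulated statistics x act as Poisson rates and the observed statistics x_0 as counts; the function name is hypothetical.

```python
import math


def poisson_log_kernel(x: dict, x_0: dict) -> float:
    """Sum of Poisson log-probabilities of the counts x_0 under the rates x.

    Per coordinate: log p = k * log(rate) - rate - log(k!).
    Assumption: x holds strictly positive rates, x_0 non-negative integer counts.
    """
    return sum(
        x_0[key] * math.log(x[key]) - x[key] - math.lgamma(x_0[key] + 1)
        for key in sorted(x_0)
    )


log_p = poisson_log_kernel({"n": 2.0}, {"n": 0})
print(log_p)  # log(exp(-2)) = -2.0
```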

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

class
pyabc.distance.
RangeEstimatorDistance
(measures_to_use='all')[source]¶ Bases:
pyabc.distance.distance.DistanceWithMeasureList
Abstract base class for distance functions whose estimate is based on a range.
It defines the two template methods lower and upper. Hence
\[d(x, y) = \sum_{i \in \text{measures}} \left| \frac{x_i - y_i}{u_i - l_i} \right|\]where \(l_i\) and \(u_i\) are the lower and upper margin for measure \(i\).
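The range-normalized distance can be sketched as follows, here with lower and upper margins taken as the minimum and maximum of each summary statistic over an initial sample, in the spirit of MinMaxDistance. The helper names are hypothetical, and the sketch assumes every statistic actually varies in the sample, so that no margin width is zero.

```python
def min_max_margins(sum_stats: list) -> tuple:
    """Lower and upper margins per measure from an initial sample."""
    keys = sum_stats[0].keys()
    lower = {key: min(s[key] for s in sum_stats) for key in keys}
    upper = {key: max(s[key] for s in sum_stats) for key in keys}
    return lower, upper


def range_distance(x: dict, x_0: dict, lower: dict, upper: dict) -> float:
    """Sum over measures of |x_i - y_i| divided by the margin width u_i - l_i."""
    return sum(
        abs((x[key] - x_0[key]) / (upper[key] - lower[key])) for key in x_0
    )


sample = [{"a": 0.0, "b": 10.0}, {"a": 4.0, "b": 30.0}]
lower, upper = min_max_margins(sample)
d = range_distance({"a": 1.0, "b": 20.0}, {"a": 3.0, "b": 15.0}, lower, upper)
print(d)  # 2/4 + 5/20 = 0.75
```

Dividing by the margin width puts measures with very different magnitudes on a comparable footing.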

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

__init__
(measures_to_use='all')[source]¶ Initialize self. See help(type(self)) for accurate signature.

get_config
()[source]¶ Return configuration of the distance.
 Returns
config – Dictionary describing the distance.
 Return type
dict

initialize
(t: int, get_all_sum_stats: Callable[[], List[dict]], x_0: dict = None)[source]¶ This method is called by the ABCSMC framework before the first use of the distance (at the beginning of ABCSMC.run()), and can be used to calibrate it to the statistics of the samples.
The default is to do nothing.
 Parameters
t (int) – Time point for which to initialize the distance.
get_all_sum_stats (Callable[[], List[dict]]) – When called, returns the initial summary statistics.
x_0 (dict, optional) – The observed summary statistics.


class
pyabc.distance.
SimpleFunctionDistance
(fun)[source]¶ Bases:
pyabc.distance.base.Distance
This is a wrapper around a simple function which calculates the distance. If a function or other callable that is not an instance of a pyabc.Distance subclass is passed to the ABCSMC class, it is converted to an instance of the SimpleFunctionDistance class.
 Parameters
fun (Callable[[dict, dict], float]) – A Callable accepting as parameters (a subset of) the arguments of the pyabc.Distance.__call__ function. Usually at least the summary statistics x and x_0. Returns the distance between both.
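A minimal sketch of a function matching the Callable[[dict, dict], float] shape described above. The function name abs_difference and the summary statistic keys are hypothetical; only the two-dictionary signature is taken from the documentation.

```python
def abs_difference(x, x_0):
    """Sum of absolute differences over the observed summary statistics."""
    return sum(abs(x[key] - x_0[key]) for key in x_0)


# Hypothetical summary statistics.
observed = {"mean": 1.0, "var": 2.0}
simulated = {"mean": 1.5, "var": 1.0}

d = abs_difference(simulated, observed)
print(d)  # 0.5 + 1.0 = 1.5
```

A function of this shape is what would be handed to ABCSMC in place of a pyabc.Distance instance and wrapped as described above.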

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

class
pyabc.distance.
SimpleFunctionKernel
(fun: Callable, ret_scale: str = 'SCALE_LIN', keys: List[str] = None, pdf_max: float = None)[source]¶ Bases:
pyabc.distance.kernel.StochasticKernel
This is a wrapper around a simple function which calculates the probability density.
 Parameters
fun (Callable) – A Callable accepting __call__’s parameters. The function should be a pdf or pmf.
ret_scale, keys, pdf_max – As in StochasticKernel.

__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float

class
pyabc.distance.
StochasticKernel
(ret_scale: str = 'SCALE_LIN', keys: List[str] = None, pdf_max: float = None)[source]¶ Bases:
pyabc.distance.base.Distance
A stochastic kernel assesses the similarity between observed and simulated summary statistics or data via a probability measure.
Note
The returned value cannot be interpreted as a distance function, but rather as an inverse distance, as it increases as the similarity between observed and simulated summary statistics increases. Thus, a StochasticKernel should only be used together with a StochasticAcceptor.
 Parameters
ret_scale (str, optional (default = SCALE_LIN)) – The scale of the value returned in __call__: Given a probability density p(x,x_0), the returned value can be either p(x,x_0) or log(p(x,x_0)).
keys (List[str], optional) – The keys of the summary statistics, specifying the order to be used.
pdf_max (float, optional) – The maximum possible probability density function value. Defaults to None and is then computed as the density at (x_0, x_0), where x_0 denotes the observed summary statistics. Must be overridden if pdf_max is to be used in the analysis by the acceptor and the default is not applicable. This value should be in the scale specified by ret_scale already.
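The relation between the two return scales and the default pdf_max can be illustrated with a standalone sketch. The function gaussian_kernel, its sigma parameter, and the "SCALE_LOG" string are assumptions for illustration; only 'SCALE_LIN' appears in the signature above, and this is not the library's implementation.

```python
import math


def gaussian_kernel(x, x_0, keys, sigma=1.0, ret_scale="SCALE_LIN"):
    """Independent normal density of x around x_0 over the given keys."""
    log_p = 0.0
    for key in keys:
        diff = x[key] - x_0[key]
        log_p += -0.5 * (diff / sigma) ** 2
        log_p -= math.log(sigma * math.sqrt(2 * math.pi))
    # "SCALE_LOG" is an assumed name for the log scale in this sketch.
    if ret_scale == "SCALE_LOG":
        return log_p
    return math.exp(log_p)


keys = ["y"]
x_0 = {"y": 0.0}  # hypothetical observed summary statistics
x = {"y": 1.0}    # hypothetical simulated summary statistics

p_lin = gaussian_kernel(x, x_0, keys)
p_log = gaussian_kernel(x, x_0, keys, ret_scale="SCALE_LOG")

# Density at (x_0, x_0), which the documentation names as the default pdf_max.
pdf_max = gaussian_kernel(x_0, x_0, keys)
```

The value grows as x approaches x_0 and peaks at pdf_max, which is why it behaves as an inverse distance.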

class
pyabc.distance.
ZScoreDistance
(measures_to_use='all')[source]¶ Bases:
pyabc.distance.distance.DistanceWithMeasureList
Calculate the distance as the sum of z-scores over the selected measures. The measured data is the reference for the z-score.
Hence
\[d(x, y) = \sum_{i \in \text{measures}} \left| \frac{x_i - y_i}{y_i} \right|\]
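A short worked sketch of this sum with plain dictionaries. The helper name z_score_distance is hypothetical and the code is not taken from the library.

```python
def z_score_distance(x, y, measures):
    """Sum over the measures of |(x_i - y_i) / y_i|, with y as reference."""
    # Note: a reference value y_i == 0 raises ZeroDivisionError in this sketch.
    return sum(abs((x[key] - y[key]) / y[key]) for key in measures)


# Hypothetical simulated (x) and measured (y) summary statistics.
x = {"a": 3.0, "b": 5.0}
y = {"a": 2.0, "b": 4.0}

d = z_score_distance(x, y, ["a", "b"])
print(d)  # 1/2 + 1/4 = 0.75
```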
__call__
(x: dict, x_0: dict, t: int = None, par: dict = None) → float[source]¶ Evaluate at time point t the distance of the summary statistics of the data simulated for the tentatively sampled particle to those of the observed data.
Abstract method. This method has to be overwritten by all concrete implementations.
 Parameters
x (dict) – Summary statistics of the data simulated for the tentatively sampled parameter.
x_0 (dict) – Summary statistics of the observed data.
t (int) – Time point at which to evaluate the distance. Usually, the distance will not depend on the time.
par (dict) – The parameters used to create the summary statistics x. These can be required by some distance functions. Usually, the distance will not depend on the parameters.
 Returns
distance – Quantifies the distance between the summary statistics of the data simulated for the tentatively sampled particle and of the observed data.
 Return type
float


pyabc.distance.
combined_mean_absolute_deviation
(data, x_0, **kwargs)[source]¶ Compute the sum of the mean absolute deviations to the mean of the samples and to the observed value.

pyabc.distance.
combined_median_absolute_deviation
(data, x_0, **kwargs)[source]¶ Compute the sum of the median absolute deviations to the median of the samples and to the observed value.

pyabc.distance.
mean_absolute_deviation
(data, **kwargs)[source]¶ Calculate the mean absolute deviation from the mean.

pyabc.distance.
mean_absolute_deviation_to_observation
(data, x_0, **kwargs)[source]¶ Mean absolute deviation of data w.r.t. the observation x_0.

pyabc.distance.
median_absolute_deviation
(data, **kwargs)[source]¶ Calculate the sample median absolute deviation (MAD) from the median, defined as median(abs(data - median(data))).
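The definition can be checked on a small sample using only the standard library. The name mad_sketch and the use of a plain list are assumptions of this sketch, which does not reproduce the library function or its keyword arguments.

```python
from statistics import median


def mad_sketch(data):
    """Median of the absolute deviations from the sample median."""
    center = median(data)
    return median([abs(value - center) for value in data])


# Hypothetical sample with one outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

mad = mad_sketch(data)
print(mad)  # median is 3.0; deviations are [2, 1, 0, 1, 97]; their median is 1.0
```

The single outlier at 100.0 leaves the result at 1.0, which illustrates why the MAD is a robust measure of scale.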

pyabc.distance.
median_absolute_deviation_to_observation
(data, x_0, **kwargs)[source]¶ Median absolute deviation of data w.r.t. the observation x_0.

pyabc.distance.
root_mean_square_deviation
(data, x_0, **kwargs)[source]¶ Square root of the mean squared error, i.e. of the bias squared plus the variance.
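The decomposition into squared bias plus variance can be verified numerically. This sketch assumes the population variance (dividing by n), for which the identity holds exactly; the name rmsd_sketch is hypothetical and the code is not the library implementation.

```python
import math
from statistics import fmean, pvariance


def rmsd_sketch(data, x_0):
    """Square root of the mean squared deviation of data from x_0."""
    return math.sqrt(fmean([(value - x_0) ** 2 for value in data]))


# Hypothetical sample and observation.
data = [1.0, 2.0, 3.0, 6.0]
x_0 = 2.0

rmsd = rmsd_sketch(data, x_0)
bias = fmean(data) - x_0    # 3.0 - 2.0 = 1.0
variance = pvariance(data)  # population variance, 3.5

# Mean squared error 4.5 equals bias**2 + variance = 1.0 + 3.5.
print(rmsd ** 2, bias ** 2 + variance)
```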

pyabc.distance.
span
(data, **kwargs)[source]¶ Compute the difference of largest and smallest data point.

pyabc.distance.
standard_deviation
(data, **kwargs)[source]¶ Calculate the sample standard deviation (SD).