# Parameter inference¶

This example illustrates parameter inference for a single model. (Check also the model selection example if you’re interested in comparing multiple models.)

This notebook can be downloaded here: `Parameter Inference`.

We’re going to use the following classes from the pyABC package:

- `ABCSMC`, our entry point to parameter inference,
- `RV`, to define the prior over a single parameter,
- `Distribution`, to define the prior over a possibly higher-dimensional parameter space,
- `MultivariateNormalTransition`, to do a kernel density estimate (KDE) for visualization purposes.

Let’s start by importing the necessary classes. We also set up matplotlib, and we’re going to use pandas as well.

```
[1]:
```

```
import pyabc
import numpy as np
import scipy.stats as st
import tempfile
import os
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
```

Our model is about as simple as it gets. We assume a Gaussian model \(\mathcal{N}(\mathrm{mean}, 0.5^2)\) with the single parameter \(\mathrm{mean}\) and fixed variance \(0.5^2\). In this case, the parameter dictionary that is passed to the model has only the single key `mean`. We name the sampled data just `data`. Returning a whole dictionary might look like overcomplicating things, but as soon as the model returns heterogeneous data this starts to make a lot of sense.

```
[2]:
```

```
def model(parameter):
    return {"data": parameter["mean"] + 0.5 * np.random.randn()}
```
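To illustrate why returning a dictionary pays off, here is a hypothetical variant of the model that produces several heterogeneous observables at once (the keys `trajectory` and `count` are made up for this sketch; they are not part of the example above):

```python
import numpy as np

def heterogeneous_model(parameter):
    # Hypothetical model: a scalar, a time series, and an integer count,
    # all returned under descriptive keys in one dictionary.
    mean = parameter["mean"]
    return {
        "data": mean + 0.5 * np.random.randn(),
        "trajectory": mean + 0.5 * np.random.randn(10),
        "count": np.random.poisson(max(mean, 0.0)),
    }

sample = heterogeneous_model({"mean": 2.5})
```

A distance function can then compare each entry with its counterpart in the observed data, weighting the pieces as appropriate.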

We then define the prior for the `mean` to be uniform over the interval \([0, 5]\):

```
[3]:
```

```
prior = pyabc.Distribution(mean=pyabc.RV("uniform", 0, 5))
```

(Actually, this has to be read as \([0, 0+5]\): the two arguments are location and scale, so the support is \([\mathrm{loc}, \mathrm{loc}+\mathrm{scale}]\). For example, `RV("uniform", 1, 5)` is uniform over the interval \([1, 6]\). Check the `scipy.stats` package for details of the parametrization.)
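The loc/scale convention can be verified directly with `scipy.stats`, which `RV` wraps:

```python
import scipy.stats as st

# uniform(loc, scale) has support [loc, loc + scale]
dist = st.uniform(1, 5)
print(dist.ppf(0.0), dist.ppf(1.0))  # bounds of the support: 1.0 6.0
```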

We also need to specify when we consider data to be close, in the form of a distance function. Here, we just take the absolute value of the difference.

```
[4]:
```

```
def distance(x, y):
    return abs(x["data"] - y["data"])
```
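As a quick sanity check, this distance is non-negative and symmetric, since it is the absolute difference of the summary values (the definition is repeated here so the snippet is self-contained):

```python
def distance(x, y):
    # Absolute difference of the "data" entries of two summaries
    return abs(x["data"] - y["data"])

x, y = {"data": 2.0}, {"data": 2.5}
print(distance(x, y))  # 0.5, and distance(x, y) == distance(y, x)
```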

Now we create the `ABCSMC` object, passing the model, the prior, and the distance to it.

```
[5]:
```

```
abc = pyabc.ABCSMC(model, prior, distance)
```

To get going, we have to specify where to log the ABC-SMC runs. We can later query the database with the help of the `History` class.

Usually you would now have some measured data whose posterior you want to infer. Here, we just assume that the measured data was 2.5.

```
[6]:
```

```
db_path = ("sqlite:///" +
           os.path.join(tempfile.gettempdir(), "test.db"))
observation = 2.5
abc.new(db_path, {"data": observation})
```

```
INFO:History:Start <ABCSMC(id=1, start_time=2020-01-13 17:20:35.575163, end_time=None)>
```

```
[6]:
```

```
<pyabc.storage.history.History at 0x7f9b70fceba8>
```

The `new` method returned a `History` object, as shown in the cell output above. It carries the id of this ABC-SMC run, which is only important if more than one ABC-SMC run is stored in the same database.

Let’s start the sampling now. We’ll sample until the acceptance threshold epsilon drops below 0.1. We also specify that we want a maximum of 10 populations. Whichever is reached first, `minimum_epsilon` or `max_nr_populations`, stops further sampling.
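Conceptually, the combined stopping rule can be sketched as follows (a simplified illustration, not pyABC’s actual implementation):

```python
def should_stop(eps, t, minimum_epsilon=0.1, max_nr_populations=10):
    # Sampling stops as soon as either criterion is met,
    # whichever comes first: the epsilon threshold or the
    # maximum number of populations.
    return eps <= minimum_epsilon or t >= max_nr_populations

print(should_stop(eps=0.5, t=3))   # False: keep sampling
print(should_stop(eps=0.08, t=5))  # True: epsilon criterion reached
print(should_stop(eps=0.5, t=10))  # True: population limit reached
```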

For the simple model we defined above, this should only take a couple of seconds:

```
[7]:
```

```
history = abc.run(minimum_epsilon=.1, max_nr_populations=10)
```

```
INFO:ABC:Calibration sample before t=0.
INFO:Epsilon:initial epsilon is 1.319732537446763
INFO:ABC:t: 0, eps: 1.319732537446763.
INFO:ABC:Acceptance rate: 100 / 183 = 5.4645e-01, ESS=1.0000e+02.
INFO:ABC:t: 1, eps: 0.6266780218276696.
INFO:ABC:Acceptance rate: 100 / 212 = 4.7170e-01, ESS=9.5391e+01.
INFO:ABC:t: 2, eps: 0.37028667452653585.
INFO:ABC:Acceptance rate: 100 / 293 = 3.4130e-01, ESS=7.0363e+01.
INFO:ABC:t: 3, eps: 0.14892234681938504.
INFO:ABC:Acceptance rate: 100 / 713 = 1.4025e-01, ESS=7.6078e+01.
INFO:ABC:t: 4, eps: 0.07515483935316057.
INFO:ABC:Acceptance rate: 100 / 1255 = 7.9681e-02, ESS=9.2198e+01.
INFO:ABC:Stopping: minimum epsilon.
INFO:History:Done <ABCSMC(id=1, start_time=2020-01-13 17:20:35.575163, end_time=2020-01-13 17:20:41.845830)>
```

The `History` object returned by `ABCSMC.run` can be used to query the database. This object is also available via `abc.history`.

```
[8]:
```

```
history is abc.history
```

```
[8]:
```

```
True
```

Now we visualize the probability density functions. The vertical line indicates the location of the observation. Given our model, we expect the mean to be close to the observed data.

```
[9]:
```

```
fig, ax = plt.subplots()
for t in range(history.max_t + 1):
    df, w = history.get_distribution(m=0, t=t)
    pyabc.visualization.plot_kde_1d(
        df, w,
        xmin=0, xmax=5,
        x="mean", ax=ax,
        label="PDF t={}".format(t))
ax.axvline(observation, color="k", linestyle="dashed");
ax.legend();
```

pyABC also offers various other visualization routines for analyzing the parameter estimation run:

```
[10]:
```

```
_, arr_ax = plt.subplots(2, 2)
pyabc.visualization.plot_sample_numbers(history, ax=arr_ax[0][0])
pyabc.visualization.plot_epsilons(history, ax=arr_ax[0][1])
pyabc.visualization.plot_credible_intervals(
    history, levels=[0.95, 0.9, 0.5], ts=[0, 1, 2, 3, 4],
    show_mean=True, show_kde_max_1d=True,
    refval={'mean': 2.5}, arr_ax=arr_ax[1][0])
pyabc.visualization.plot_effective_sample_sizes(history, ax=arr_ax[1][1])
plt.gcf().set_size_inches((12, 8))
plt.gcf().tight_layout()
```

That’s it. Now you can go ahead and try more sophisticated models.