# Parameter inference¶

This example illustrates parameter inference for a single model. (Check also the model selection example if you’re interested in comparing multiple models.)

This notebook can be downloaded here:
`Parameter Inference`

.

We’re going to use the following classes from the pyABC package:

`ABCSMC`

, our entry point to parameter inference,`RV`

, to define the prior over a single parameter,`Distribution`

, to define the prior over a possibly higher dimensional parameter space,`MultivariateNormalTransition`

, to do a kernel density estimate (KDE) for visualization purposes.

Let’s start to import the necessary classes. We also set up matplotlib and we’re going to use pandas as well.

```
In [1]:
```

```
from pyabc import (ABCSMC, Distribution, RV,
MultivariateNormalTransition)
import scipy as sp
import scipy.stats as st
import tempfile
import os
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
```

Our model is about as simple as it gets. We assume a Gaussian model \(\mathcal{N}(\mathrm{mean}, 1)\) with the single parameter `mean`

.
The variance is 1.
In this case, the parameter dictionary that is passed to the model has only the single key `mean`

.
We name the sampled data just `data`

. It might seem like overcomplicating things to return a whole dictionary, but as soon as we return heterogeneous data this starts to make a lot of sense.

```
In [2]:
```

```
def model(parameter):
return {"data": parameter["mean"] + sp.randn()}
```

We then define the prior over the `mean`

to be uniform over the interval \([0, 5]\).

```
In [3]:
```

```
prior = Distribution(mean=RV("uniform", 0, 5))
```

(Actually, this has to be read as [0, 0+5]. For example, `RV("uniform", 1, 5)`

is uniform over the interval
[1,6]. Check the scipy.stats package for details of the definition.)
We also need to express when we consider data to be close in form of a distance funtion.
We just take the absolute value of the difference here.

```
In [4]:
```

```
def distance(x, y):
return abs(x["data"] - y["data"])
```

Now we create the ABCSMC object, passing the model, the prior and the distance to it.

```
In [5]:
```

```
abc = ABCSMC(model, prior, distance)
```

To get going, we have to specify where to log the ABC-SMC runs. We can later query the database with the help of the
`History`

class.
Usually you would now have some measure data which you want to know the posterior of.
Here, we just assume, that the measured data was 2.5.

```
In [6]:
```

```
db_path = ("sqlite:///" +
os.path.join(tempfile.gettempdir(), "test.db"))
observation = 2.5
abc.new(db_path, {"data": observation})
```

```
INFO:Epsilon:initial epsilon is 1.2393604085164727
INFO:History:Start <ABCSMC(id=5, start_time=2018-04-11 08:15:04.293824, end_time=None)>
```

```
Out[6]:
```

```
5
```

The `new`

method returned an integer. This is the id of the ABC-SMC run.
This id is only important if more than one ABC-SMC run is stored in the same database.

Let’s start the sampling now. We’ll sample until the acceptance threshold epsilon drops below 0.2.
We also specify that we want a maximum number of 10 populations.
So whatever is reached first, `minimum_epsilon`

or `max_nr_populations`

will stop further sampling.
For the simple model we defined above, this should only take a couple of seconds.

```
In [7]:
```

```
history = abc.run(minimum_epsilon=.2, max_nr_populations=10)
```

```
INFO:ABC:t:0 eps:1.2393604085164727
INFO:ABC:t:1 eps:0.5937826556325588
INFO:ABC:t:2 eps:0.3472986615093911
INFO:ABC:t:3 eps:0.18389361913444088
INFO:History:Done <ABCSMC(id=5, start_time=2018-04-11 08:15:04.293824, end_time=2018-04-11 08:15:42.121007)>
```

The `History`

object returned by ABCSMC.run can be used to query the database.
This object is also available via abc.history.

```
In [8]:
```

```
history is abc.history
```

```
Out[8]:
```

```
True
```

Now we visualize the probability density functions. The vertical line indicates the location of the observation. Given our model, we expect the mean to be close to the observed data.

```
In [9]:
```

```
from pyabc.visualization import plot_kde_1d
fig, ax = plt.subplots()
for t in range(history.max_t+1):
df, w = history.get_distribution(m=0, t=t)
plot_kde_1d(df, w,
xmin=0, xmax=5,
x="mean", ax=ax,
label="PDF t={}".format(t))
ax.axvline(observation, color="k", linestyle="dashed");
ax.legend();
```

That’s it. Now you can go ahead and try more sophisticated models.