Resuming stored ABC runs

This example illustrates how stored ABC runs can be loaded and continued later on. This is useful if you decide afterwards to run a few more populations for increased accuracy.

The models used in this example are similar to the ones from the parameter inference tutorial.

This notebook can be downloaded here: Resuming stored ABC runs.

In this example, we’re going to use the following classes:

  • ABCSMC, our entry point to parameter inference,
  • RV, to define the prior over a single parameter,
  • Distribution, to define the prior over a possibly higher-dimensional parameter space.

Let’s start with the imports.

In [1]:
from pyabc import ABCSMC, Distribution, RV
import scipy as sp
from tempfile import gettempdir
import os

As usual, we start with the definition of the model, the prior and the distance function.

In [2]:
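import numpy as np

def model(parameter):
    # Simulated data: the "mean" parameter plus standard normal noise.
    # np.random.randn is called directly; sp.randn was only a deprecated alias of it.
    return {"data": parameter["mean"] + np.random.randn()}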

prior = Distribution(mean=RV("uniform", 0, 5))

def distance(x, y):
    return abs(x["data"] - y["data"])

db = "sqlite:///" + os.path.join(gettempdir(), "test.db")
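The database above lives in the system's temporary directory, which may be cleared between sessions. If the run is meant to be resumed much later, a persistent location might be preferable. A minimal sketch, assuming a hypothetical directory name `abc_runs` and a hypothetical helper `sqlite_url`, neither of which is part of pyABC:

```python
import os

def sqlite_url(directory, filename="test.db"):
    """Build an SQLite URL of the same form as `db` above.

    Illustrative helper only; `directory` is an assumed, user-chosen folder.
    """
    os.makedirs(directory, exist_ok=True)  # create the folder if it is missing
    return "sqlite:///" + os.path.join(os.path.abspath(directory), filename)

db_persistent = sqlite_url("abc_runs")
print(db_persistent)
```

The resulting string could be used wherever `db` is used in this example.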

Next, we create a new ABC-SMC run and print its id. We’ll use the id later on to resume the run.

In [3]:
abc = ABCSMC(model, prior, distance)
run_id = abc.new(db, {"data": 2.5})
print("Run ID:", run_id)
INFO:History:Start <ABCSMC(id=1, start_time=2018-09-14 12:24:50.756643, end_time=None)>
INFO:Epsilon:initial epsilon is 1.3968799525637423
Run ID: 1
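
If the run is resumed in a fresh Python session, the database URL and the run id have to survive until then. One possible way, sketched here with a hypothetical bookkeeping file `abc_run_info.json` and stand-in values copied from the cells above, is to write both to disk:

```python
import json
import os
from tempfile import gettempdir

# Stand-in values mirroring the cells above; in the notebook, use `db` and `run_id`.
db = "sqlite:///" + os.path.join(gettempdir(), "test.db")
run_id = 1

# Hypothetical bookkeeping file; pyABC neither creates nor reads it.
info_path = os.path.join(gettempdir(), "abc_run_info.json")

with open(info_path, "w") as f:
    json.dump({"db": db, "run_id": run_id}, f)

# Later, e.g. in a new session: read the values back before resuming.
with open(info_path) as f:
    info = json.load(f)

print(info["run_id"])
```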

We then run up to 3 populations, or until the acceptance threshold 0.1 is reached, whichever happens first.

In [4]:
history = abc.run(minimum_epsilon=.1, max_nr_populations=3)
INFO:ABC:t:0 eps:1.3968799525637423
INFO:ABC:t:1 eps:0.6603616373074999
INFO:ABC:t:2 eps:0.3083683984302348
INFO:History:Done <ABCSMC(id=1, start_time=2018-09-14 12:24:50.756643, end_time=2018-09-14 12:24:53.903752)>
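
The interplay of the two stopping criteria can be pictured with a small stand-alone loop. This is only a sketch of the rule as described above, not pyABC's actual implementation: the function `populations_run` is hypothetical, and the exact comparison used internally (strict or non-strict) is an assumption. The thresholds are rounded from the log above.

```python
def populations_run(eps_schedule, minimum_epsilon, max_nr_populations):
    """Count populations until either stopping criterion triggers.

    Hypothetical helper: stop once a population has been run at a
    threshold <= minimum_epsilon, or once max_nr_populations is reached.
    """
    n_done = 0
    for eps in eps_schedule:
        if n_done >= max_nr_populations:
            break  # population budget exhausted
        n_done += 1  # one population is run at threshold eps
        if eps <= minimum_epsilon:
            break  # acceptance threshold reached
    return n_done

# Thresholds for t = 0, 1, 2, rounded from the log above.
eps_schedule = [1.397, 0.660, 0.308]

# Threshold 0.1 is never reached, so the budget of 3 populations ends the run.
print(populations_run(eps_schedule, minimum_epsilon=0.1, max_nr_populations=3))  # 3
# With a looser threshold of 0.7, the run would already stop after t = 1.
print(populations_run(eps_schedule, minimum_epsilon=0.7, max_nr_populations=3))  # 2
```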

Let’s verify that we have 3 populations.

In [5]:
history.n_populations
Out[5]:
3

We now create a completely new ABCSMC object. We pass the same model, prior and distance from before.

In [6]:
abc_continued = ABCSMC(model, prior, distance)

Note

You could actually pass different models, priors and distance functions here. This might make sense if, for example, in the meantime you came up with a more efficient model implementation or distance function.

For the experts: under certain circumstances it can even be mathematically correct to change the prior after a couple of populations.

To resume a run, we use the load method, which loads the necessary data from the database. We pass it the id of the run we want to continue.

In [7]:
abc_continued.load(db, run_id)
INFO:Epsilon:initial epsilon is 0.10974743573006081
Out[7]:
1
In [8]:
abc_continued.run(minimum_epsilon=.1, max_nr_populations=1)
INFO:ABC:t:3 eps:0.10974743573006081
INFO:History:Done <ABCSMC(id=1, start_time=2018-09-14 12:24:50.756643, end_time=2018-09-14 12:24:56.974263)>
Out[8]:
<pyabc.storage.history.History at 0x7ff0da1c9cc0>

Let’s check the number of populations of the resumed run. It should be 4, as we did 3 populations before and added another one.

In [9]:
abc_continued.history.n_populations
Out[9]:
4

That’s it. This was a basic tutorial on how to continue stored ABC-SMC runs.

Note

For advanced users:

In situations where the distance function or epsilon requires initialization, resuming a run via load() may lose information, because not everything can be stored in the database. This concerns hyper-parameters in individual objects specified by the user.

If that is the case, but the user can store, for example, the distance function object used in the first run, this very object can be passed to abc_continued. Ideally it is then already fully initialized, so that after setting distance_function.require_initialize = False, the run proceeds just as if the first run had not been interrupted.

However, even if information was lost, the process usually re-adjusts itself within 1 or 2 iterations after load(), so this is rarely much of a problem.
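
One way to keep such hyper-parameters across sessions is to save them yourself and restore them before resuming. The sketch below uses a hypothetical stand-in class `ScaledDistance` with a single hyper-parameter `scale` and a hypothetical file `distance_state.json`; it is not a pyABC class, and how a real pyABC distance exposes its state is an assumption to check against your version.

```python
import json
import os
from tempfile import gettempdir

class ScaledDistance:
    """Hypothetical distance whose hyper-parameter is set during initialization."""

    def __init__(self, scale=None):
        self.scale = scale  # None means "not initialized yet"

    def initialize(self, scale):
        self.scale = scale

    def __call__(self, x, y):
        return abs(x["data"] - y["data"]) / self.scale

distance_first_run = ScaledDistance()
distance_first_run.initialize(scale=2.0)  # stands in for what the first run learned

# Store the hyper-parameters alongside the database.
state_path = os.path.join(gettempdir(), "distance_state.json")
with open(state_path, "w") as f:
    json.dump({"scale": distance_first_run.scale}, f)

# Later: rebuild an already initialized object to pass to the new ABCSMC object.
with open(state_path) as f:
    distance_restored = ScaledDistance(**json.load(f))

print(distance_restored({"data": 3.0}, {"data": 2.0}))  # 0.5
```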