Parallel job execution on an SGE cluster environment
The most important class in the pyabc.sge package is SGE. Its map method
automatically parallelizes across an SGE/UGE cluster.
The SGE class can be used in standalone mode or in combination
with the ABCSMC class (see the usage notes below).
Usage of the parallel package is fairly easy. For example:
from pyabc.sge import SGE

sge = SGE(priority=-200, memory="3G")

def f(x):
    return x * 2

tasks = [1, 2, 3, 4]
result = sge.map(f, tasks)
print(result)
[2, 4, 6, 8]
Job scheduling is done either via an SQLite database or a REDIS instance. REDIS is recommended because it is more robust, in particular when the distributed file system is slow.
A configuration file in ~/.parallel is required.
pyabc.sge.sge_available can be used to check if an SGE cluster can be used on the machine.
Check the API documentation for more details.
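As a minimal sketch of how sge_available might be used, the helper below picks sge.map when the pyabc.sge package can be imported and an SGE cluster is detected, and otherwise falls back to Python's built-in map. The names choose_map and double are hypothetical and only serve the illustration; the SGE arguments are copied from the example above.

```python
def choose_map():
    """Return sge.map if an SGE cluster is usable, else the built-in map.

    Hypothetical helper for illustration, not part of pyABC.
    """
    try:
        from pyabc.sge import SGE, sge_available
    except ImportError:
        # pyABC is not installed on this machine: run locally.
        return map
    if sge_available():
        # Same arguments as in the example above; needs ~/.parallel.
        return SGE(priority=-200, memory="3G").map
    # pyABC is installed but no SGE cluster was detected: run locally.
    return map


def double(x):
    return x * 2


parallel_map = choose_map()
# list() makes the result uniform, since the built-in map is lazy.
result = list(parallel_map(double, [1, 2, 3, 4]))
print(result)
```

With this pattern the same script can be developed on a laptop and later submitted from a cluster login node without changes.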
Information about running jobs
Run python -m pyabc.sge.job_info_redis to get a nicely formatted output
of the current execution state when the REDIS mode is used.
Run python -m pyabc.sge.job_info_redis --help for more details.
Usage notes
The SGE class can be used in standalone mode for
convenient parallelization of jobs across a cluster, completely independent
of the rest of the pyABC package.
The SGE class can also be combined, for instance, with the
pyabc.sampler.MappingSampler class for simple parallelization
of ABC-SMC runs across an SGE cluster.
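A minimal sketch of this combination could look as follows. It assumes that MappingSampler accepts the mapping function through a map_ keyword argument and that ABCSMC accepts a sampler argument; both should be checked against the API documentation. build_sge_sampler is a hypothetical helper name, and the model, prior, and distance in the usage comment are placeholders that the user has to supply.

```python
def build_sge_sampler(priority=-200, memory="3G"):
    """Create a MappingSampler whose work is distributed by SGE.map.

    Hypothetical helper for illustration. The imports are local so that
    defining the function does not require pyABC or a cluster.
    """
    from pyabc.sampler import MappingSampler
    from pyabc.sge import SGE

    sge = SGE(priority=priority, memory=memory)
    # Assumption: the mapping function is passed via the map_ keyword.
    return MappingSampler(map_=sge.map)


# Usage on a cluster node (model, prior, and distance are placeholders):
# import pyabc
# abc = pyabc.ABCSMC(model, prior, distance, sampler=build_sge_sampler())
```

Keeping the sampler construction in one function makes it easy to swap in a different sampler when no cluster is available.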