Vagga may supervise multiple processes with a single command. This is very useful for running multi-component and/or networked systems.
By supervision we mean running multiple processes and watching until all of them
exit. Each process is run in its own container. Even if two processes share
the key named “container”, which means they share the same root filesystem,
they run in different namespaces, so they don’t share /proc and so on.
The basic modes of operation are:
stop-on-failure: stops all processes as soon as any single one is dead (default)
wait-all-successful: keeps waiting while processes finish successfully, and stops all processes as soon as one of them fails
In any mode of operation the supervisor itself never exits until all the
children are dead. Even when you kill the supervisor with kill -9 or
kill -KILL, all children will be killed with the -KILL signal too. I.e. with
the help of namespaces and good old PR_SET_PDEATHSIG we ensure that no process
is left when the supervisor is killed, no one is reparented to init, and all
traces of running containers are cleared. Seriously. This is very often a
problem with many other ways to run things on a development machine.
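To make the problem concrete, here is a minimal sketch in plain POSIX shell (no vagga involved) of what happens without such protection: a parent that is killed with -KILL leaves its background child running. The variable names child_pid and survived are purely illustrative, and the sleep command stands in for any long-running service.

```shell
# Start a parent shell that launches a background child, prints the child's
# PID, and is then killed with SIGKILL (the same signal as `kill -9`).
child_pid=$(sh -c 'sleep 30 >/dev/null 2>&1 & echo $!; kill -KILL $$') || true

# The parent is gone, but nothing told the child to stop, so it keeps running.
if kill -0 "$child_pid" 2>/dev/null; then
  survived=yes
else
  survived=no
fi
echo "child $child_pid survived its parent: $survived"

# Clean up the orphan by hand.
kill "$child_pid"
```

With the supervisor described above this manual cleanup step should not be needed, since the children are killed together with it.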
Stop on Failure
It’s no coincidence that stop-on-failure mode is the default. It’s a very
useful mode of operation for running on a development machine.
Let me show an example:
    commands:
      run_full_app: !Supervise
        mode: stop-on-failure
        children:
          web: !Command
            container: python
            run: "python manage.py runserver"
          celery: !Command
            container: python
            run: "python manage.py celery worker"
Imagine this is a web application written in Python (the web process), with
a work queue (celery), which runs some long-running tasks in the background.
When you start both processes with vagga run_full_app, many log messages
with various levels of severity often appear, so it’s easy to miss something.
Imagine you missed that celery did not start (or died shortly after starting).
You go to the web app, do some testing, start some background task, and wait
for it to finish. After waiting for a while, you start to suspect that
something is wrong. But celery died long ago, so skimming over recent logs
doesn’t show anything. Then you look at processes: “Oh, crap, there is no
celery”. This is where stop-on-failure helps: you’ll notice immediately that
some service is down.
In this mode vagga returns the exit code of the first process that exited, or
a 128+signal code when a signal was sent to the supervisor (and propagated to
the other processes).
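As an illustration of how a wrapper script might read that return value, here is a small sketch. The helper describe_status is hypothetical (it is not part of vagga), and it only applies the 128+signal convention described above; a process that itself exits with a code above 128 would be misreported as a signal, so treat the result as a hint.

```shell
# Hypothetical helper: interpret an exit status using the 128+signal convention.
describe_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    # Codes above 128 are read as "terminated by signal (status - 128)".
    echo "signal $((status - 128))"
  else
    # Anything else is taken as a plain exit code of the first exited process.
    echo "exit code $status"
  fi
}

describe_status 1     # a child exited with code 1
describe_status 143   # 128 + 15, i.e. SIGTERM
describe_status 137   # 128 + 9, i.e. SIGKILL
```

A typical use would be to run vagga run_full_app and then call describe_status "$?" on the following line.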
Wait All Successful
In wait-all-successful mode vagga works the same as in stop-on-failure mode,
except that processes that exit with exit code 0 (known as the successful exit
code) do not trigger the failure condition, so the other processes continue to
work. If any process exits on a signal or with a non-zero exit code, “failure”
mode is switched on and vagga exits the same way as in stop-on-failure mode.
This mode is intended for running some batch processing of multiple commands in multiple containers. All processes are run in parallel, like with other modes.
In this mode vagga returns exit code zero if all processes exited successfully,
and the exit code of the first failing process (or 128+signal if it was killed
by a signal) otherwise.
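The return-code rule above can be modelled in a few lines of shell. This is only a sketch of the rule as stated here, not vagga's implementation: the function batch_status is hypothetical, and it assumes its arguments are the children's statuses in the order they exited, with signal deaths already encoded as 128+signal.

```shell
# Hypothetical model of the wait-all-successful return code.
# Arguments: child exit statuses, in the order the children exited.
batch_status() {
  for code in "$@"; do
    if [ "$code" -ne 0 ]; then
      # The first failing process decides the overall result.
      echo "$code"
      return
    fi
  done
  # No failure was seen: every process exited successfully.
  echo 0
}

batch_status 0 0 0     # all succeeded
batch_status 0 3 0     # second process failed with code 3
batch_status 0 143 1   # a process killed by SIGTERM failed first
```

In a batch script this means a plain "if vagga ...; then" check is enough to tell whether every step succeeded.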
Restarting a Subset Of Processes
Sometimes you may work on only one component and don’t want to restart the whole bunch of processes to test just one thing. You may run two supervisors in different tabs of a terminal. For example:
    # run everything, except the web process we are debugging
    $ vagga run_full_app --exclude web
    # then in another tab
    $ vagga run_full_app --only web
Then you can restart web many times, without restarting everything.