Docker apps
In this context, words like 'app' have many possible meanings. To avoid confusion we use these terms with specific meanings:
- "BOINC app" and "BOINC app version": the BOINC concepts described here.
- "Science app": a set of programs that execute a job, i.e. that process input files and produce output files.
- Science apps can evolve over time. Each instance is a "science app version".
- "Science executable": a part of a science app version in compiled form, e.g. an x64/Linux executable.
BOINC lets you use Docker to run science apps on volunteer hosts (Win, Mac and Linux). To do so:
- Develop your science app in the software environment of your choice (say, particular versions of Linux and Python, with particular libraries and packages installed).
- Write a Dockerfile that builds this environment.
Note: your Docker image must include the ps command. Most Docker Linux images do, but for some reason the Debian image does not. If you use this image you'll need to include:
RUN apt-get update && apt-get install -y procps && rm -rf /var/lib/apt/lists/*
- Create BOINC app versions that combine the Dockerfile, and your science executables, with a "Docker wrapper" program (supplied by BOINC) that interfaces between Docker and the BOINC client.
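The Dockerfile step above can be sketched as a short shell snippet that writes a minimal Dockerfile into a scratch directory. The base image tag, the /app working directory, and the main.sh entry script are assumptions chosen for illustration, not names defined by BOINC; only the procps line comes from the note above.

```shell
#!/bin/bash
# Scratch directory for the sketch; a real project would keep the
# Dockerfile alongside its science app.
build_dir=$(mktemp -d)

# Write a minimal Dockerfile. The quoted 'EOF' keeps the shell from
# expanding anything inside the heredoc.
cat > "$build_dir/Dockerfile" <<'EOF'
# Assumed base image; procps is installed because it provides ps.
FROM debian:stable-slim
RUN apt-get update && apt-get install -y procps && rm -rf /var/lib/apt/lists/*
# Assumed working directory and entry script (hypothetical names).
WORKDIR /app
CMD ./main.sh
EOF

echo "wrote $build_dir/Dockerfile"
```

Building the image locally (for example with docker build) before creating app versions is one way to confirm that the environment matches what the science app expects.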
Your science application can then run on all major platforms (Linux, Windows, Mac OS). In that sense it's similar to BOINC's support for apps that run in VirtualBox virtual machines. However, the Docker approach has several advantages:
- Docker apps can access GPUs.
- Docker apps use much less disk space (tens of MBs rather than GBs).
- Starting a Docker container takes less time than starting a virtual machine.
The remainder of this document describes BOINC's support for Docker apps. For a simple example, see the Docker app cookbook.
The Docker wrapper (docker_wrapper) interfaces BOINC to Docker.
It is the main program of Docker apps.
Usage:
docker_wrapper [options] arg1 arg2 ...
Options:
--verbose: write Docker commands and output to stderr.
--config <filename>: config file name; default job.toml.
--dockerfile <filename>: Dockerfile name; default Dockerfile.
--sporadic: the application is sporadic.
The Docker wrapper reads an optional config file, default job.toml.
This file, which is in TOML format, can contain the following items:
project_dir_mount = "/project"
Mounts the job's project directory at the given mount point (an absolute path) in the container.
use_gpu = true
Allow GPU access from the container.
checkpoint_interval = 3600
Specify a checkpoint interval, overriding the computing preferences.
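As a sketch, the three items above can be combined into a single job.toml. The snippet below writes such a file into a scratch directory. The values are illustrative, the unit of checkpoint_interval is assumed here to be seconds, and which items a given app needs depends on the app.

```shell
#!/bin/bash
# Scratch directory for the sketch; in a real app version the file
# sits next to the Dockerfile.
job_dir=$(mktemp -d)

cat > "$job_dir/job.toml" <<'EOF'
# Mount the job's project directory at /project in the container.
project_dir_mount = "/project"
# Allow GPU access from the container.
use_gpu = true
# Checkpoint interval, overriding the computing preferences
# (assumed to be in seconds).
checkpoint_interval = 3600
EOF

cat "$job_dir/job.toml"
```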
Unparsed command-line arguments to docker_wrapper are passed into the container in an environment variable ARGS.
To use this feature, include in your Dockerfile:
ENV ARGS ""
CMD <cmd> ${ARGS}
Programs in the container can then access the arguments via the environment variable ARGS.
For example, if the main program is a bash script it could do
...
./program $ARGS infile outfile
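A minimal sketch of such a main script follows. It assumes that no argument contains embedded whitespace, because the single ARGS string is split on whitespace. The run_job function is a hypothetical stand-in for the real science executable and only prints the arguments it receives.

```shell
#!/bin/bash
# Hypothetical stand-in for the science executable: prints each
# argument it receives inside a pair of brackets.
run_job () {
    printf '[%s]' "$@"
    printf '\n'
}

# ARGS comes from the container environment; default to empty so the
# script also works when no arguments were passed.
ARGS="${ARGS:-}"

# $ARGS is deliberately unquoted so the shell splits it into words.
run_job $ARGS infile outfile
```

Unquoted expansion also triggers filename globbing, so arguments containing characters such as * or ? would need different handling.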
docker_wrapper mounts the job's slot directory at WORKDIR in the container.
So there are two ways to access an input file.
Mark the file as <copy_file/> in the input template.
The BOINC client will copy the file to the slot directory (with its logical name) and the science executable can access it directly.
If your science app has large input files (100 MB+) you can avoid the space and time overhead of copying them to the slot directory by accessing the file in the project directory.
To do this, don't mark the file as <copy_file/>.
The client will create a "link file" in the slot directory.
The link file is an XML document that points to the file in the project directory; for example
<soft_link>../../projects/proj_url/infile</soft_link>
Mount the project directory in the container by adding this to job.toml:
project_dir_mount = "/project"
Your executables (in the container) must convert BOINC's link files to physical names. This is easy to do in a shell script:
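#!/bin/bash
# Convert a BOINC link file into the matching path under /project.
resolve () {
    sed 's|<soft_link>\.\./\.\./projects/[^/]*/|/project/|; s|</soft_link>||' "$1" | tr -d '\r\n'
}
# Quote the result so a path containing spaces stays one argument.
./worker "$(resolve in)" out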
Here, the resolve() function takes the name of a link file and returns the path of the file in the project directory (assuming that this directory is mounted at /project).
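If a script has to handle both copied files and link files, a variant of resolve can inspect the file's contents first. The sketch below is an illustrative extension, not part of BOINC: resolve_any is a hypothetical name, and it assumes that the project directory is mounted at /project and that a link file starts with the one-line soft_link element shown earlier.

```shell
#!/bin/bash
# Print the path to use for a logical file name: the mapped /project
# path if the file is a BOINC link file, otherwise the name itself.
resolve_any () {
    # Only the first bytes are inspected, so a large copied input
    # file is not scanned in full.
    if head -c 64 "$1" 2>/dev/null | grep -q '<soft_link>'; then
        sed 's|<soft_link>\.\./\.\./projects/[^/]*/|/project/|; s|</soft_link>||' "$1" | tr -d '\r\n'
    else
        printf '%s' "$1"
    fi
}

# Demo in a scratch directory: one link file and one ordinary file.
demo_dir=$(mktemp -d)
printf '<soft_link>../../projects/proj_url/infile</soft_link>\n' > "$demo_dir/in"
printf 'plain data\n' > "$demo_dir/copied"

resolve_any "$demo_dir/in"; echo
resolve_any "$demo_dir/copied"; echo
```

The first call prints the mapped path under /project, and the second prints the file name unchanged, so the same script works whether or not the input was marked <copy_file/>.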
Output files should not be marked <copy_file/>.
Write them (with logical names) in the WORKDIR.
A BOINC job has
- A BOINC app version.
- A workunit.
Each of these is a collection of files. The files in a BOINC app version are code-signed. This is normally done manually, preventing hackers from using your project to distribute malware even if they are able to break into your server.
The files in a BOINC app version are cached on the client.
They are deleted only when the app version has been superseded by a later version.
Workunit files are deleted after a job is finished, unless they are marked as <sticky/> in the job's input template.
The files of a Docker app can be divided between app version and workunit in two ways.
In the first model, there is one BOINC app per science app. The BOINC app version for a platform contains
- Dockerfile
- docker_wrapper (compiled for that platform)
- job.toml
- science executables
and each workunit contains
- input files
To deploy a new science application you need to create a new BOINC app, and to deploy a new science application version you need to create a new BOINC app version. These both require login access to the BOINC server.
In the second model, a single BOINC app handles multiple science apps.
This app can have app versions for different platforms.
Each app version contains docker_wrapper compiled for that platform.
Each workunit includes
- Dockerfile
- science executables
- job.toml (config file for docker_wrapper)
- input files
This model facilitates interfaces where job submitters can deploy new science app versions or new science apps without server login access. But it also has some limitations.