5. Frequently asked questions (FAQ)
- Where did the name Charliecloud come from?
- How do you spell Charliecloud?
- My app needs to write to /var/log, /run, etc.
- Tarball build fails with “No command specified”
- --uid 0 lets me read files I can’t otherwise!
- Why is /bin being added to my $PATH?
- How does setuid mode work?
- ch-run fails with “can’t re-mount image read-only”
- Which specific sudo commands are needed?
5.1. Where did the name Charliecloud come from?
Charlie — Charles F. McMillan was director of Los Alamos National Laboratory from June 2011 until December 2017, i.e., at the time Charliecloud was started in early 2014. He is universally referred to as “Charlie” here.
cloud — Charliecloud provides cloud-like flexibility for HPC systems.
5.2. How do you spell Charliecloud?
We try to be consistent with Charliecloud — one word, no camel case. That is, Charlie Cloud and CharlieCloud are both incorrect.
5.3. My app needs to write to /var/log, /run, etc.
Because the image is mounted read-only by default, log files, caches, and other runtime data cannot be written anywhere in the image. You have three options:
- Configure the application to use a different directory. /tmp is often a good choice, because it’s shared with the host and fast.
- Use RUN commands in your Dockerfile to create symlinks that point somewhere writeable, e.g. /tmp, or /mnt/0 with ch-run --bind.
- Run the image read-write with ch-run -w. Be careful that multiple containers do not try to write to the same image files.
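As a rough illustration of the second option, the shell sketch below mimics what such RUN commands would do, using throwaway directories in place of a real image. The variables img and scratch are hypothetical stand-ins for the image root and for a writeable location such as /tmp; nothing here invokes Docker or ch-run.

```shell
#!/bin/sh
set -e

# Hypothetical stand-ins: $img plays the image root, $scratch plays a
# location that is writeable at run time (e.g. /tmp in the container).
img=$(mktemp -d)
scratch=$(mktemp -d)

# The image ships with a real /var/log directory ...
mkdir -p "$img/var/log"

# ... which the build replaces with a symlink to the writeable location.
# In a Dockerfile this would be roughly:
#   RUN rm -rf /var/log && ln -s /tmp /var/log
rm -rf "$img/var/log"
ln -s "$scratch" "$img/var/log"

# Writes through the image path now land in the writeable directory.
echo "app started" > "$img/var/log/app.log"
cat "$scratch/app.log"
```

The symlink itself lives in the image and is created at build time, so it works even when the image is later mounted read-only; only the link target needs to be writeable.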
5.4. Tarball build fails with “No command specified”
The full error from ch-docker2tar or ch-build2dir is:
docker: Error response from daemon: No command specified.
You will also see it with various plain Docker commands.
This happens when there is no default command specified in the Dockerfile or
any of its ancestors. Some base images specify one (e.g., Debian) and others
don’t (e.g., Alpine). Docker requires this even for commands that don’t seem
like they should need it, such as docker create (which is what trips up Charliecloud).
The solution is to add a default command to your Dockerfile, such as CMD ["true"].
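One way to apply this fix from a script is sketched below. It works on a hypothetical Dockerfile written to a scratch directory and appends the default command only if the file has no CMD line of its own. It is an illustration rather than part of Charliecloud, and it does not inspect ancestor images, which may already define a command.

```shell
#!/bin/sh
set -e

# Hypothetical Dockerfile in a scratch directory, for illustration only.
dir=$(mktemp -d)
dockerfile="$dir/Dockerfile"
printf 'FROM alpine\nRUN apk add --no-cache bash\n' > "$dockerfile"

# Append a default command only if this file has no CMD instruction.
# (Dockerfile instructions are case-insensitive, hence grep -i.)
if ! grep -Eiq '^[[:space:]]*CMD[[:space:]]' "$dockerfile"; then
    printf 'CMD ["true"]\n' >> "$dockerfile"
fi

cat "$dockerfile"
```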
5.5. --uid 0 lets me read files I can’t otherwise!
Some permission bits can give a surprising result with a container UID of 0. For example:
$ whoami
reidpr
$ echo surprise > ~/cantreadme
$ chmod 000 ~/cantreadme
$ ls -l ~/cantreadme
---------- 1 reidpr reidpr 9 Oct 3 15:03 /home/reidpr/cantreadme
$ cat ~/cantreadme
cat: /home/reidpr/cantreadme: Permission denied
$ ch-run /var/tmp/hello cat ~/cantreadme
cat: /home/reidpr/cantreadme: Permission denied
$ ch-run --uid 0 /var/tmp/hello cat ~/cantreadme
surprise
At first glance, it seems that we’ve found an escalation – we were able to read a file inside a container that we could not read on the host! That seems bad.
However, what is really going on here is more prosaic but complicated:
- After unshare(CLONE_NEWUSER), ch-run gains all capabilities inside the namespace. (Outside, capabilities are unchanged.)
- This includes CAP_DAC_OVERRIDE, which enables a process to read/write/execute a file or directory mostly regardless of its permission bits. (This is why root isn’t limited by permissions.)
- Within the container, exec(2) capability rules are followed. Normally, this basically means that all capabilities are dropped when ch-run replaces itself with the user command. However, if EUID is 0, which it is inside the namespace given --uid 0, then the subprocess keeps all its capabilities. (This makes sense: if root creates a new process, it stays root.)
- CAP_DAC_OVERRIDE within a user namespace is honored for a file or directory only if its UID and GID are both mapped. In this case, ch-run maps reidpr to container root and group reidpr to itself.
- Thus, files and directories owned by the host EUID and EGID (here reidpr:reidpr) are available for all access with ch-run --uid 0.
This isn’t a problem. The quirk applies only to files owned by the invoking user, because ch-run is unprivileged outside the namespace, and thus that user could simply chmod the file to read it. Access inside and outside the container remains equivalent.
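The equivalence can be checked on the host without any container. The sketch below uses a scratch file created with mktemp (a hypothetical stand-in for ~/cantreadme) and shows that the owner can regain read access with chmod alone.

```shell
#!/bin/sh
set -e

# Scratch file owned by the invoking user (stand-in for ~/cantreadme).
f=$(mktemp)
echo surprise > "$f"
chmod 000 "$f"

# As an ordinary user this read is denied, as in the transcript above.
# (If the script itself runs as root, the read succeeds regardless.)
cat "$f" 2>/dev/null || echo "read denied"

# The owner can restore read permission at any time, no container needed.
chmod u+r "$f"
content=$(cat "$f")
echo "$content"
rm -f "$f"
```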
5.6. Why is /bin being added to my $PATH?
Newer Linux distributions replace some root-level directories, such as /bin, with symlinks to their counterparts in /usr.
Some of these distributions (e.g., Fedora 24) have also dropped /bin from the default $PATH. This is a problem when the guest OS does not have a merged /usr (e.g., Debian 8 “Jessie”).
While Charliecloud’s general philosophy is not to manipulate environment variables, in this case, guests can be severely broken if /bin is not in $PATH. Thus, we add it if it’s not there.
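The "add it if it’s not there" logic can be sketched in shell as follows. This is not ch-run’s actual implementation: the function name ensure_bin_in_path is hypothetical, and appending /bin at the end rather than the front is an assumption made for the example.

```shell
#!/bin/sh

# Print the given PATH-style string, appending /bin if it is not already
# one of the colon-separated entries. Appending (rather than prepending)
# is an assumption made for this sketch.
ensure_bin_in_path() {
    case ":$1:" in
        *:/bin:*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "${1:+$1:}/bin" ;;
    esac
}

ensure_bin_in_path "/usr/local/bin:/usr/bin"   # /bin is missing, so it is added
ensure_bin_in_path "/usr/bin:/bin"             # /bin already present, unchanged
```

Wrapping the string in colons before matching avoids false positives from entries that merely end in /bin, such as /usr/bin.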
5.7. How does setuid mode work?
As noted above, ch-run has a transition mode that uses setuid-root
privileges instead of user namespaces. The goal of this mode is to let sites
evaluate Charliecloud even on systems that do not have a Linux kernel that
supports user namespaces. We plan to remove this code once user namespaces are
more widely available, and we encourage sites to use the unprivileged,
non-setuid mode in production.
We have taken care to (1) drop privileges temporarily upon program start and only re-acquire them when needed and (2) drop privileges permanently before executing user code. In order to reliably verify the latter, ch-run in setuid mode will refuse to run if invoked directly by root.
It may be better to use capabilities and setcap rather than setuid. However, this also relies on newer features, which would hamper the goal of broadly available testing. For example, NFSv3 does not support extended attributes, which are required for setcap files.
Dropping privileges safely requires care. We follow the recommendations in “Setuid demystified” as well as the system call ordering and privilege drop verification recommendations of the SEI CERT C Coding Standard.
We do not worry about the Linux-specific fsuid and fsgid, which track euid/egid unless specifically changed, which we don’t do. Kernel bugs have existed that violate this invariant, but none are recent.
5.8. ch-run fails with “can’t re-mount image read-only”
Normally, ch-run re-mounts the image directory read-only within the container. This fails if the image resides on certain filesystems, such as NFS (see issue #9). There are two solutions:
- Unpack the image into a different filesystem, such as tmpfs or local disk. Consult your local admins for a recommendation. Note that tmpfs is a lot faster than Lustre.
- Use the -w switch to leave the image mounted read-write. Note that this may have an impact on reproducibility (because the application can change the image between runs) and/or stability (if there are multiple application processes and one writes a file in the image that another is reading or writing).
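Before choosing between these, it can help to confirm which filesystem the image directory actually sits on. The sketch below is one way to check, assuming GNU coreutils stat. The helper name fs_type and the IMG_DIR variable are hypothetical, and the default of /tmp is only a placeholder for your unpacked image directory.

```shell
#!/bin/sh

# Print the filesystem type of the given directory. Relies on GNU
# coreutils stat; the %T format is not portable to BSD/macOS stat.
fs_type() {
    stat -f -c %T "$1"
}

# Hypothetical image location; point IMG_DIR at your unpacked image.
img_dir=${IMG_DIR:-/tmp}

case "$(fs_type "$img_dir")" in
    nfs*) echo "$img_dir is on NFS: unpack elsewhere or use ch-run -w" ;;
    *)    echo "$img_dir is on $(fs_type "$img_dir")" ;;
esac
```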
5.9. Which specific sudo commands are needed?
For running images, sudo is not needed at all.
For building images, it depends on what you would like to support. For example, do you want to let users build images with Docker? Do you want to let them run the build tests?
We do not maintain specific lists, but you can search the source code and documentation for uses of sudo and $DOCKER and evaluate them on a case-by-case basis. (The latter includes sudo if needed to invoke docker in your environment.) For example:
$ find . \( -type f -executable \
            -o -name Makefile \
            -o -name '*.bats' \
            -o -name '*.rst' \
            -o -name '*.sh' \) \
         -exec grep -EH '(sudo|\$DOCKER)' {} \;