Looking through the many Dockerfiles on Docker Hub and at how Docker is generally used, it's easy to see that it's commonly misused or gives a false sense of security. What we think is a contained, secure environment might be anything but. Here are some common mistakes to avoid and good practices to follow.
Beware the "docker" group
Be very careful when adding host users to the `docker` group. The Docker daemon runs with root privileges, so adding a user to the `docker` group is akin to giving that user sudo access without requiring a password. For example, you can run:
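A minimal sketch of such a command, assuming the public `alpine` image and an arbitrary `/host` mount point (both are illustrative choices, and `host_root_shell` is a hypothetical helper name): bind-mounting the host's root filesystem gives any `docker` group member a root shell over every host file, with no password prompt.

```shell
# Illustrative only: shows why docker group membership is root-equivalent.
# Assumes the public `alpine` image; /host is an arbitrary mount point.
host_root_shell() {
    # Mount the host's / at /host inside the container, then chroot into it.
    # The shell that opens runs as root over the host filesystem.
    docker run --rm -it -v /:/host alpine chroot /host /bin/sh
}
```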
It's a huge security risk, even for trusted users (should you leave your keyboard unlocked, or get exploited). Don't be lazy: stick to requiring `sudo` for Docker, or customise your `/etc/sudoers` file to allow `NOPASSWD` for safe (non-destructive) Docker commands while still requiring a password for destructive ones.
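As a rough sketch of the sudoers approach, the snippet below writes a draft drop-in to a scratch file for review. The docker path `/usr/bin/docker`, the group name `dockerusers`, and the choice of which commands count as "safe" are all assumptions for illustration; adjust them to your own system.

```shell
# Sketch of a sudoers drop-in, written to a scratch file for review.
# Assumptions: docker is at /usr/bin/docker and the users belong to a
# hypothetical group named "dockerusers".
sudoers_draft="${TMPDIR:-/tmp}/docker-sudoers.draft"
cat > "$sudoers_draft" <<'EOF'
# Read-only commands that may run without a password.
Cmnd_Alias DOCKER_SAFE = /usr/bin/docker ps, /usr/bin/docker ps *, /usr/bin/docker images, /usr/bin/docker logs *
# Every other docker command still prompts. sudo applies the last
# matching rule, so the NOPASSWD line must come second.
%dockerusers ALL = (root) /usr/bin/docker
%dockerusers ALL = (root) NOPASSWD: DOCKER_SAFE
EOF
# Check the syntax before copying the file into /etc/sudoers.d/:
#   visudo -cf "$sudoers_draft"
```

The rule order matters: with the general rule first and the `NOPASSWD` rule last, listing containers needs no password while anything else falls back to the normal prompt.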
Similarly to the previous point, don't run your containers as the root user. While you may think your container is isolated, letting it run with root privileges is a huge risk if it gets exploited. Would you run your own apps (or other people's code) on the host as root? I hope not... you shouldn't treat your containers any differently.
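One way to do this at run time is Docker's `--user` flag. The sketch below wraps it in a hypothetical helper, `run_as_nonroot`; the UID/GID of 4999 is a placeholder, and a `USER` instruction in the image's Dockerfile is an equally valid route.

```shell
# Run a container as an unprivileged UID:GID instead of root.
# 4999:4999 is a placeholder; the image and command come from the caller.
run_as_nonroot() {
    image="$1"; shift
    docker run --rm --user 4999:4999 "$image" "$@"
}

# Usage (prints the identity the container process runs as):
#   run_as_nonroot alpine id
```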
Restarting containers
If you're restarting containers, you're doing it wrong. Always `run` a container, stop it, remove it, then run a fresh one if you need to start it again. This ensures your containers can be torn down and rebuilt quickly and easily, and that they aren't storing state.
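The stop, remove, run cycle can be sketched as a small shell function. `recreate_container` is a hypothetical helper name, and the container name and image in the usage line are placeholders.

```shell
# Replace a container with a fresh one instead of restarting it.
# Usage: recreate_container NAME IMAGE [extra docker run options...]
recreate_container() {
    name="$1"; image="$2"; shift 2
    # Ignore failures here: the container may not exist yet.
    docker stop "$name" >/dev/null 2>&1 || true
    docker rm "$name" >/dev/null 2>&1 || true
    # Start a brand new container from the image.
    docker run -d --name "$name" "$@" "$image"
}

# Usage:
#   recreate_container web nginx:1.25.3 -p 8080:80
```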
Avoid `latest`
While `latest` is great for getting the most recent changes, it's a risk in production: a pull simply fetches whatever `latest` points to at that moment, so the version you deploy can change without warning. It's far better to specify the version you want in the `run` command that you execute as part of the deploy.
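A deploy script can enforce this with a small guard. The sketch below is one possible check, not a Docker feature: `is_pinned` is a hypothetical helper that accepts an image reference only if it carries an explicit tag other than `latest`, or a digest.

```shell
# Succeeds only for image references pinned to a version or digest.
is_pinned() {
    # Drop the registry/namespace part so a registry port such as
    # localhost:5000 is not mistaken for a tag.
    ref="${1##*/}"
    case "$ref" in
        *@sha256:*) return 0 ;;  # pinned by digest
        *:latest)   return 1 ;;  # explicit latest
        *:*)        return 0 ;;  # explicit version tag
        *)          return 1 ;;  # no tag, which implies latest
    esac
}
```

In a deploy script this could be used as `is_pinned "$IMAGE" && docker run -d "$IMAGE"`, where `IMAGE` is whatever variable holds the reference.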
For production-standard containers you should be running them in read-only mode with `--read-only` (I consider this a must). If the app inside the container is compromised, this stops the attacker from putting any exploits on the filesystem. If you do need some writable access, there are two options:
- `--tmpfs /tmp` - use a temporary file system
- `-v /foo:/foo` - mount a volume. I use this when I need to write to a data directory, but want everything else locked down as read-only
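Put together, the two options combine with `--read-only` as in the sketch below. `run_readonly` is a hypothetical helper name, and `/foo` is the example path from the list above.

```shell
# Run a container with a read-only root filesystem, a scratch /tmp,
# and a single writable data directory mounted from the host.
run_readonly() {
    image="$1"; shift
    docker run -d --read-only --tmpfs /tmp -v /foo:/foo "$image" "$@"
}

# Usage:
#   run_readonly myapp:1.4.2
```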
Read only mounts
Further to the above, if you do need to mount a volume, you can make it read-only too using `:ro`, for example `-v /foo:/foo:ro`. A good example is my use of MiniDLNA: I mount my music and videos as volumes in the container, but MiniDLNA doesn't need write access to them, so I mark them as `:ro`.
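A sketch of what that media server setup might look like. The host paths, container paths, and the `run_media_server` helper are assumptions for illustration, and the image name is left to the caller.

```shell
# Mount media directories read-only: the server can serve the files
# but cannot modify or delete them. All paths here are placeholders.
run_media_server() {
    docker run -d \
        -v /srv/music:/media/music:ro \
        -v /srv/videos:/media/videos:ro \
        "$1"
}

# Usage (hypothetical image name):
#   run_media_server example/minidlna:1.3
```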
Managing init (PID1)
While not needed for all situations, if you find your entrypoint (for example a Java app) is doing one of the following...
- Is not reaping spawned zombie processes
- Is not respecting or doesn't have signal handlers set up
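Both symptoms are jobs normally handled by an init process running as PID 1. One way to get one, offered here as an assumption since no specific tool is named above, is Docker's built-in `--init` flag, which runs a small init inside the container that forwards signals and reaps zombie processes. `run_with_init` is a hypothetical helper name.

```shell
# Start a container with Docker's bundled init as PID 1, so signals
# are forwarded to the app and orphaned child processes are reaped.
run_with_init() {
    image="$1"; shift
    docker run -d --init "$image" "$@"
}

# Usage:
#   run_with_init myapp:1.4.2
```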
Managing mount ownership
Handling file ownership between the host and container can be tricky. Under the hood, Linux uses user IDs for file ownership (usernames are mapped to UIDs). Ideally you want the UIDs to match between container and host, rather than having the container `chown` files and folders during start-up. There are a few simple solutions:
Matching UIDs
This is my favourite and the most secure of the options. When building an image and adding a non-root user to run as, specify an arbitrary UID (rather than defaulting to 1000), for example 4999. Push the image to your registry as normal. Now hosts can create a user with a matching UID...
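Both halves of this approach can be sketched as follows. The user name `app`, the `alpine:3.19` base image, and the `create_matching_host_user` helper are assumptions for illustration; 4999 is the example UID from the text.

```shell
# Image side: a Dockerfile that creates a non-root user with a fixed
# UID/GID and runs as that user. Written to a scratch build directory.
build_dir="$(mktemp -d)"
cat > "$build_dir/Dockerfile" <<'EOF'
FROM alpine:3.19
RUN addgroup -g 4999 app && adduser -D -H -u 4999 -G app app
USER app
EOF

# Host side: create a user and group with the same IDs, so files in
# mounted volumes have one consistent owner on both sides.
create_matching_host_user() {
    sudo groupadd --gid 4999 app
    sudo useradd --uid 4999 --gid 4999 --no-create-home app
}
```

Only the numeric IDs have to match; the names on the host and in the image can differ.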
Matching username
Alternatively, if you don't want to play around with UIDs, you simply need a user with a matching username, for example when building a container that runs as the user `git`. You can mount `/etc/passwd` into the container so the host and container users resolve to the same UID. However, this does expose your host's `/etc/passwd`.
Specifying IDs
A final option is to specify the IDs when running the container. Using `usermod` from the `shadow` package in Alpine, you can modify the user the container runs as on start-up. The biggest compromise is that you can't run in read-only mode, because you are modifying the filesystem (`/etc/passwd`) at runtime, which is why I don't use this method very often.
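A rough sketch of the start-up step, assuming the image already contains a user named `app` and that the IDs arrive in environment variables called `PUID` and `PGID` (the user name, the variable names, and the `remap_user` helper are all hypothetical):

```shell
# Entrypoint helper: remap the in-container user to the IDs passed in
# at run time, e.g. with `docker run -e PUID=1234 -e PGID=5678 ...`.
# Requires usermod/groupmod from the shadow package.
remap_user() {
    if [ -n "${PGID:-}" ]; then groupmod -g "$PGID" app; fi
    if [ -n "${PUID:-}" ]; then usermod -u "$PUID" app; fi
}

# In an entrypoint script this would typically be followed by dropping
# privileges and starting the app, for example with Alpine's su-exec:
#   remap_user && exec su-exec app "$@"
```

Because the entrypoint has to start as root to change the IDs, dropping privileges before starting the app matters here.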
Keep it small
For many of us this will mean using Alpine as a base image rather than larger images like Ubuntu or Debian, although the more hardcore may start from `scratch`. Not only does a smaller image speed up pulls and deploys, but a lighter image also means fewer default processes running and a smaller attack surface. Unless you absolutely can't (Alpine can be tricky to get running in some cases, mainly, in my experience, because it uses `musl` rather than `glibc`), you should be using Alpine.
Also be aware of all the layers being created in your image. `docker history myimage` shows you all the layers your image is made up of.