Docker from an Operations perspective

Container technology in computing is quite old. You wouldn’t know it from the hype that surrounds it nowadays, but apparently I was already “doing containerization” back around 2001, when I set up my first service-jail to host an LDAP server on FreeBSD. The fundamental principles even go back to the 60’s. The tech has come a long way since then, but progress has mostly been in the area of provisioning. I haven’t yet seen a good reason to integrate Docker or Kubernetes into my own workflow. Here’s why.

Containers, a definition

Before any kind of discussion becomes meaningful, let’s establish what I consider ‘containers’ to be, so you actually know what I’m talking about.

Containerization is a feature of computer operating systems. Many operating systems allow the owner to partition the resources of a host machine into multiple more or less independent compartments in which processes may be run, more or less isolated from other such compartments. A container represents a single such compartment out of potentially many.

You’ll notice that I used the phrase ‘more or less’ two times in the previous paragraph. That’s because containers historically come in many shapes and sizes. The defining properties of a container, to me, are:

  • A process within the container only ‘sees’ part of the host machine.
  • A process within the container isn’t necessarily aware of the above limitation.

You’ll see that the above definition is overly broad in that it covers anything from a simple but ancient chroot to a Docker-image running on Kubernetes.

Historical context

In order to understand modern container technology, it helps to have a bit of historical perspective on the matter.

My first personal experience with containers was on FreeBSD, which introduced Jails somewhere around the turn of the century in its 4.x-branch. We’re currently on 12.0, so it’s been a while. Jails were an addition to an earlier mechanism called chroot, which is older than I am.

A chroot is a construct on UNIX-systems which allows the operator to trick any process into seeing an arbitrary directory on the system’s disk as if it were the root-node. Wait, what?

Every UNIX system’s file structure starts at /, which is the ‘root’ of the system’s file storage. All files and directories (folders, if you will) live either directly under the root-node or within a directory that branches down from it. For example, /srv/jails/webserver would be a valid directory three levels down from the root-node.

Now a process chroot‘ed to /srv/jails/webserver wouldn’t know any better: what it sees as / is the contents of /srv/jails/webserver as seen from the host machine. It’s the most rudimentary form of containerization, because a multitude of processes can be chroot‘ed to a multitude of directories and, in theory, none of them would be able to tell that the actual file system is a much bigger place than it’s made out to be.
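To make that concrete, here’s a minimal sketch in Python of what a chroot boils down to (os.chroot wraps the same system call the chroot(8) utility uses). It has to run as root, and the directory is just the hypothetical example path from above:

    import os

    jail_root = "/srv/jails/webserver"  # hypothetical example directory from above

    os.chroot(jail_root)  # from here on, '/' for this process is jail_root
    os.chdir("/")         # step inside so the working directory doesn't point outside

    # To this process the listing below is simply '/', even though the host
    # sees the very same files under /srv/jails/webserver.
    print(os.listdir("/"))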

Most containerization-techniques are what I’d consider ‘benign’. Generally there is no requirement to hide the fact that containerization is taking place from the processes running within the containers, which makes it fairly simple for a process running in a chroot to find out that something funny is going on. It is, however, quite hard for such a process to ‘break out’ of the chroot and access parts of the file system outside of its ‘jail’.

FreeBSD took the concept further and extended similar partitioning to processes themselves. If you were to chroot a shell process and look at the output of commands like top, you’d see all of the processes running on the system, including those that weren’t chroot‘ed at all or belonged to a completely different chroot. And if you were running that shell as the root-user, nothing would stop you from killing any of those other processes. So while a chroot effectively isolates the file system, it really does little else. That’s where Jails came in, and that’s where things get interesting.
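As a rough illustration of how thin that isolation is, here’s a Python sketch (the target PID is made up): a chroot‘ed process running as root can still probe, and therefore kill, any process on the host, because the process table is shared.

    import os

    os.chroot("/srv/jails/webserver")  # same hypothetical jail directory as before
    os.chdir("/")

    target_pid = 1234  # made-up PID of a process entirely outside the chroot

    # Signal 0 performs only the existence/permission check: if it doesn't raise,
    # this chroot'ed process can see -- and, being root, could kill -- a process
    # that lives outside its 'jail'.
    try:
        os.kill(target_pid, 0)
        print("process", target_pid, "is visible from inside the chroot")
    except ProcessLookupError:
        print("no such process")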

Jails were the first containerization mechanism that ordinary people could use on their home PC’s running FreeBSD. Sure, Solaris got Zones and all manner of mainframes had similar constructs from before TV went to colour, but those platforms were prohibitively expensive. Jails permitted the owners of simple x86-PC’s to split their computers into near-autonomous virtual machines. A jailed process would have:

  • Its own view of the process table.
  • Its own view of the file system (like a chroot).
  • Its own set of users, groups and permissions.

Everything else would be shared. Sounds like Docker to you? That’s what I thought.. and it’s also why I never got the hype. Docker revolutionized containers because it made them dead-easy to provision, that’s all.

Why did Docker win?

Dead-easy provisioning may not seem like much of a big deal, but that’s only until you realize how hard it is to manage a containerized environment by hand. Docker automates most of that, but very little else. It’s a bit like how Microsoft won the PC desktop instead of IBM’s OS/2. If you weren’t into computers in the early 90’s and had to Google OS/2, you just proved my point. Suffice it to say: OS/2 was far ahead of Windows technologically. It just required a bit more effort to install and run than Windows 95 did.. and the rest is history.

Now when comparing FreeBSD Jails to Docker: Jails are hard because they have zero automation and permit the user complete freedom of choice. That’s all fine if you’re a BSD-greybeard, but it kills products in the commercial marketplace. Docker, on the other hand, allowed developers to finally rid themselves of those same greybeards in Operations by being extremely easy for them to use.. and, again, the rest is history.

VM’s versus Containers

Technologically there is nothing wrong with containers. If you’re a business considering their use in production, you just need to be aware of a few key risks that are often forgotten. As you saw above, containers all run inside a single shared host operating environment. Sure, they are isolated, but the layers of isolation are (in some places) quite thin indeed.

If you intend to run containers in production, you need to determine whether you sufficiently trust the host operating system in particular, but also any containers running alongside yours. That trust touches on legal areas such as data protection and privacy compliance. In development environments these are usually non-issues, but production, and especially cloud-environments such as AWS Fargate, may be a different story.

This same issue was raised and hotly debated when VM’s were the next big thing, and it is still relevant in that space as well. Look into dedicated hardware/tenancy for your production environment if you’re dealing with sensitive data and shared-anything worries you.

The new threat from containers

The fundamental new issue with containers, especially Docker-based ones, is the same thing that makes Windows dangerous: too many untrained (or worse: half-trained) developers are churning out Docker-images at an alarming rate these days.

As an old UNIX-greybeard I’m astounded by the number of images that just pull in the entire kitchen sink of some half-recent Ubuntu, install the app-of-the-day in there (as root), expose it to the network (still running as root) and leave it at that. Sure, from an availability standpoint you can simply recycle the box when it gets damaged in some way and go about your business, but security- and compliance-wise this is a huge nightmare waiting to happen.
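For what it’s worth, the old-school alternative isn’t hard. The sketch below is plain Python, nothing Docker-specific, and the ‘www’ account is a hypothetical service user you’d have to create in the image: bind the privileged port while still root, then drop privileges before serving a single request.

    import os, pwd, socket

    # Bind the privileged port while we still have root...
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("0.0.0.0", 80))
    server.listen(16)

    # ...then become the unprivileged 'www' user before touching any traffic.
    www = pwd.getpwnam("www")      # hypothetical service account in the image
    os.setgroups([])               # shed supplementary groups first
    os.setgid(www.pw_gid)          # group before user, or the user-switch locks us out
    os.setuid(www.pw_uid)

    # Everything from here on runs without root privileges.

Whether you do it in code like this or with a USER directive in the image is a matter of taste; the point is that the process facing the network should not be root.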

Now as I said there is nothing wrong with containers from a technological standpoint, as long as good Operations-practices are maintained when designing container images. This is where I see a major risk in the deployment of containers. They’re dead-easy to provision, but that’s not the same as setting up a secure and reliable operating environment on the inside. Containers can and do get hacked. Data does get compromised from them. You may treat them like cattle, but they’re still your precious snowflake servers on the inside. Respect them!