Using Docker in Production


Linux containers have been around for quite some time, and Docker has built a nice tool suite around the kernel features for process isolation (namespaces, cgroups, etc.). The isolation technology has been part of the kernel for about 8 years now, so it can probably be considered mature. Big distributions used in commercial environments, such as Red Hat and SUSE Linux, officially support Docker (their packaged versions of it) and provide their own base images (only downloadable from their subscriber portals). There are also companies running huge Docker clouds in their daily production business.

We already used Docker to set up our build environment and to create cheap test containers, but now the plan was to use it on some production machines as well.

I want to share some thoughts about Docker in production, and hopefully others will share their experience in the comments. This article applies to the scenario of a bigger, traditional company. If you are part of a startup, the process may be much smoother, because there is less scepticism towards new technologies, but perhaps also because security considerations are taken too lightly. This article also does not focus on companies that obviously gain a lot from using throwaway containers (like e.g. iron.io).

Restrictions in test or local build environments vs. production environments

Compared to local or test environments, there are many more restrictions in production environments. In test environments or on their workstations, developers often have vast freedom in tools and access, to minimize impediments to the development workflow. But as soon as the software goes into production, it has to comply with the much more restrictive production rules to be accepted by the IT security or operations departments.

Here is a short comparison:

local or test environment | production environment
fewer restrictions regarding internet access | strictly restricted access, mostly no access at all
little or no inspection of package sources | packages have to come from a trusted source and their content has to be traceable
freedom to choose arbitrary technologies | a specific, defined set of supported software/setups
no monitoring required | monitoring mandatory
local logging sufficient | log servers to consolidate logs are common
backup often not needed | backup mandatory
less strict requirements on security or performance (regarding configuration) | configuration has to be secure and deliver optimal performance
developer driven | operations driven
security updates are not enforced | security updates have to be installed ASAP
run, delete, recreate containers as you like, throw them away as you like | stopping, deleting or recreating a container must be carefully planned into maintenance windows

Problems with the default Docker installation/workflow and mitigations

Docker makes it very easy to pull prebuilt and preconfigured images from the Docker registry. These images allow you to set up software quickly and without in-depth knowledge of the software being used. If you are familiar with Docker, you can set up a PostgreSQL database or a Jenkins instance in minutes, in a quality sufficient for development or testing. In production environments, however, you have to ensure the safety of your customers' data, and you have to use the existing infrastructure and processes for monitoring, logging, backup and even setting up the system.
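As a quick illustration of that convenience (a sketch only; the image tag, password and port mapping are placeholders, not a recommendation), a throwaway database for development is essentially a one-liner with the official postgres image from Docker Hub:

    # Throwaway Postgres instance for development/testing only
    # (placeholder password and port mapping)
    docker run -d --name dev-postgres \
        -e POSTGRES_PASSWORD=devonly \
        -p 5432:5432 \
        postgres:9.4

Exactly this convenience is what does not translate directly to production, as the overview below shows.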

• Production requirement: servers must not access the internet
  Default Docker behaviour: wants to pull images from Docker Hub
  Mitigation: set up your own Docker registry (e.g. Portus, see links below)
  Consequence: you cannot pull images from Docker Hub anymore; you could import them 1:1 into your own registry, but that is not advisable either

• Production requirement: the operating system must be supported by a vendor
  Default Docker behaviour: there are Docker base images for every Linux distribution, and depending on the taste of the image creator, application images (e.g. Jenkins) are built on different distributions
  Mitigation: distributions offering commercial support (e.g. Red Hat, SUSE) provide Docker base images for their paying customers
  Consequence: you have to rebuild all Docker images on top of the officially supported base images; in most cases you will first have to adjust the Dockerfile of an application image to make it compatible with the new base image

• Production requirement: software has to be trustworthy
  Default Docker behaviour: you don't know what's inside an image
  Mitigation: get the Dockerfile, understand what it does, rebuild the image with your trusted base image
  Consequence: more or less complex, depending on the application image you have to analyze and rebuild

• Production requirement: monitoring
  Mitigation: run a monitoring agent inside the container, or use host-based monitoring
  Consequence: tailoring of the Dockerfile/monitoring is necessary

• Production requirement: logfiles
  Default Docker behaviour: STDOUT
  Mitigation: run a logging agent (e.g. rsyslogd) in the container, or use some mechanism on the host (e.g. the logspout container, https://github.com/gliderlabs/logspout; see the sketch after this list)
  Consequence: you have to find a mechanism that works for your production environment

• Production requirement: backup
  Default Docker behaviour: most of the time you don't want to store data at all, in order to have throwaway containers; but when you must (e.g. for a database), you have to use a classical backup tool
  Mitigation/consequence: tailor the existing backup process for use with Docker (not too difficult, but it has to be done)

• Production requirement: configuration
  Default Docker behaviour: ships with the default configuration made by the image maintainer
  Mitigation: adjust the configuration to your needs
  Consequence: will probably take some time, so consider it in your planning

• Production requirement: the technology has to be approved by the operations team
  Default Docker behaviour: Docker is quite new; if the operations team in your company has no experience with it, they will most definitely reject it
  Mitigation: you need to convince the operations team to use the new technology, build a small sample case and take their objections seriously
  Consequence: will probably take some time, so consider it in your planning

• Production requirement: security updates
  Mitigation: build a new, updated base image, then rebuild all application images, and also make sure updates for additional packages are picked up (normally automatic, by fetching the newest version from the package manager)
  Consequence: since in production it is advisable anyway to have only one or a few distributions and controlled base images, it is easier to keep them up to date, but that still involves rebuilding all images. With arbitrary base images from the net you will probably have a very hard time keeping them up to date. So plan the time you need for your update processes and the rollout to the machines.

• Production requirement: run, delete, recreate
  Default Docker behaviour: to change ports, volumes, environment variables, etc. of your container, you have to bring it down and recreate it. That is no problem on dev/test, but it is in production. Data may get lost through human error (accidental deletion of container volumes, unmapped container volumes, etc.)
  Mitigation: do configuration changes in maintenance windows. Use your high-availability setup (if you have one) to recreate one container at a time. Be careful not to destroy your data. Plan ahead.
  Consequence: give some thought to how you will handle such events and to possible disaster recovery in case of data loss. Optimize your setup and documentation so that human error is less likely (e.g. be aware of the different storage possibilities of Docker and the consequences of deleting a volume or an uncommitted container).
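For the logging point above, a minimal sketch of the host-based approach with logspout could look like the following (the syslog endpoint logs.example.com:514 is a placeholder for your own log server; check the logspout documentation for the options of your version):

    # Route STDOUT/STDERR of all containers on this host to a central syslog server
    docker run -d --name logspout \
        --volume=/var/run/docker.sock:/var/run/docker.sock \
        gliderlabs/logspout \
        syslog+tcp://logs.example.com:514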

Problems we ran into

Firewalls

In production you normally encounter much more restrictive firewall rules (which is a good thing 🙂 ) regarding pulling stuff from the internet or communication between servers. Consider pushing the Docker packages into your local package repository, and think about a scenario where you can't pull images from the central registry. Pulling images created by (potentially harmful) strangers into production isn't a good idea.

Docker Hub

The central Docker repository and the paid services for private repositories are part of Docker's business model, so the Docker daemon is quite entangled with Docker Hub.

So you may want to rely on some base images from Docker Hub, but only a few hand-selected ones. There is no easy way to get rid of the Docker Hub central registry. You can mirror it, but it will pass all requests through. I have a problem with people being able to pull arbitrary images onto production servers. You may want to allow images from Docker, nginx or whatever big projects, but not everyone's. Or you may want to rebuild the images on your own.

In the links at the bottom you will find some tutorials on how to run your own registry. There is also Portus, a Docker registry developed by SUSE.

The only solution to keep control over your images is to block traffic to the internet from the machines, set up your own registry, export the images you want to use from Docker Hub and import them into your local registry. Then modify your Dockerfiles to rely not on the base images from Docker Hub, but on the ones from your own registry.
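A sketch of that workflow (registry.example.com is a placeholder for your own registry; do the pull on a machine that is allowed to access the internet):

    # Hand-pick an image, re-tag it and push it into your own registry
    docker pull nginx:1.9
    docker tag nginx:1.9 registry.example.com/nginx:1.9
    docker push registry.example.com/nginx:1.9

    # For air-gapped hosts you can also transport the image as a tarball
    docker save -o nginx_1.9.tar nginx:1.9
    docker load -i nginx_1.9.tar

In the Dockerfiles, the base image reference then changes from e.g. FROM nginx:1.9 to FROM registry.example.com/nginx:1.9.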

What is inside a container?

So you have a fully automated setup of servers with Kickstart or VM image cloning, containing all your precious base configuration. You are running some enterprise Linux (e.g. Red Hat, SUSE) and pay for support to comply with business requirements. The production network doesn't have internet access, but connects to your local RPM mirror/repository (e.g. Satellite).

And here comes Docker. Suddenly you have a zoo of operating systems with unknown preconfiguration. You actually don't know what you are running anymore. Of course that can be fixed by creating your own base image and using only that, but you have to consider this as well when adopting Docker. It is probably only enforceable when you exclusively use your own local registry, with only your handcrafted base image available as a source.

That, of course, also means rewriting the prebuilt Docker images from Docker Hub if they are based on a different OS flavor than the one you are using.
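A sketch of such a rebuilt application image, assuming a SUSE-based environment (all image and package names are placeholders; the actual steps have to be adapted from the upstream project's Dockerfile):

    # Rebuild of an upstream application image on top of your own supported base image
    FROM registry.example.com/sles12-base:latest

    # Steps adapted from the upstream Dockerfile, installing from the
    # internal, trusted package mirror instead of the internet
    RUN zypper --non-interactive install nginx && zypper clean --all

    EXPOSE 80
    CMD ["nginx", "-g", "daemon off;"]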

Handling the zoo

Soon you will have a whole bunch of Docker images and will need some way of distributing the right container versions and startup commands across your infrastructure. You will also need a cleanup strategy to purge old images. Currently we use Jenkins to roll out the images, but that too quickly gets fiddly. For bigger setups I would use traditional configuration management (e.g. Salt, Puppet, Chef, Ansible) or some of the more advanced Docker cloud tools.
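A minimal sketch of such a cleanup, e.g. run periodically from cron on every Docker host (adjust the filters to your environment, and be careful not to remove images you still need for a fast rollback):

    # Remove exited containers and dangling (untagged) images
    docker ps -a -q -f status=exited | xargs -r docker rm
    docker images -q -f dangling=true | xargs -r docker rmi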

Configuration and knowledge about the software

Docker allows you to easily use software you do not know well. This is a gain for development, as most developers don't need to know how to tune a database or secure a webserver (assuming both run locally). But in production this suddenly matters a lot. So include the time to tweak the configuration in your estimates for production use.

Devops

A radical devops philosophy would be that developers prepare their software (e.g. as Docker containers) and run it in production. The admins build the tools around it and support them. Both teams work closely together, and all people involved are equally responsible for the systems.

That is a nice theory, but besides the idea of working closely together and supporting each other, I see some problems in practice. First, there is specialization. Every member of the team has some special experience. Programmers can write software better than sysadmins; sysadmins have better knowledge of the infrastructure and the necessities of running a production environment. If you simply throw these two roles together, it won't be useful. Even people who have both skill sets will always see a problem from their current role's point of view. It is simply a matter of not having enough time: I cannot think about every eventuality of system administration AND develop good software. At some point you have to concentrate on one of the two.

If you now want to use Docker to let developers create containers and run them in production, the problems mentioned won't just disappear. You will probably end up with a bunch of hard-to-manage containers that do not fit into your overall production concept.

On call duty

In the world of sysadmins, everyone is used to being on call and makes sure the infrastructure is fit enough not to interfere too much with their private life. If real devops were practiced, programmers would suddenly have to do on-call support and would have to be able to fix the problems that occur. In my opinion that is vastly unrealistic.

Suggestion for teaming up

If you want to use Docker and have decided that it is worth it, I would suggest that the sysadmin team gives the devs some basic rules: e.g. create the base image for them, support them with adding monitoring/backup/logging, and so on. Containers are built by devs during development, but are reviewed by sysadmins before going into production.

I would really split this into different registries and use the classical test-int-prod environments. Dev and int would share the same registry.

  • Test: devs have all the freedom they need, but before moving to int they have to comply with the production standards
  • Int: handover of the work (images, etc.) to the sysadmins, intense reviewing and testing
  • Prod: separate registry, logically linked to dev/int through a version control system (e.g. versioning of the Dockerfiles), but totally independent (a sketch of promoting a reviewed image into the prod registry follows below)
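A sketch of such a promotion step, assuming two separate registries (the registry hostnames and the tag are placeholders):

    # Promote a reviewed image from the dev/int registry to the prod registry
    docker pull registry-int.example.com/myapp:1.4.2
    docker tag registry-int.example.com/myapp:1.4.2 registry-prod.example.com/myapp:1.4.2
    docker push registry-prod.example.com/myapp:1.4.2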

Conclusion

During development Docker makes your life much easier, but that does not mean it can be used in the same manner in production. If you are aware of the technical and social obstacles and have the time and management backing to overcome them, you can start introducing the next level of automation to your production environment. If a company does not even have a configuration management tool in use and the necessary administrative processes established, I personally would not consider using Docker in production.

Also consider whether you really gain from using Docker. iron.io is a very interesting example where Docker pays off hugely, because their cloud service relies on locked-down throwaway environments with minimal overhead. In a more traditional company, where you run a set of servers under your control with mostly the same software all the time and already use a configuration management tool, the benefit is not that big, and the additional complexity may not be worth it and may even harm your security and availability.

Some links

Some websites I explored during my research:

Support in distributions with commercial support

Running your own docker registry

Docker in production

 

