Provisioning Docker with Puppet

Todd Jacobs

Issue #272, December 2016

Docker containers are great, but Docker hosts and instances still need to be managed. Configuration management tools like Puppet can work hand in hand with Docker, and their powerful domain-specific languages (DSLs) make light work of things that are tricky or impossible to do in Docker itself.

DevOps and containerization are currently all the rage in agile IT shops. However, there are valid differences of opinion about the appropriate demarcation between containerization and configuration management technologies in a well-designed DevOps toolchain. Both technologies have specific use cases, although there is also a great deal of overlap between them. In many cases, the technologies complement one another and are intended to work together. This article focuses on building synergy between Docker and Puppet, and shows how the two technologies can work together to provide a more robust DevOps environment than either tool can manage alone.

One way of looking at Docker is as a modern take on the IT practice of deploying “golden images” onto a server. In the simplest cases, Docker images bundle up a service and its runtime dependencies into an easily deployable, self-contained unit. A static image is a great solution for reliably deploying services to the data center, but on the other hand, the lack of configurability can lead to needlessly bloated service catalogs as new images are built for each different hard-coded configuration.

Docker supports a limited number of features for enabling configuration changes at build time and runtime, but actively maintaining complex configurations is not what Docker is designed to do. You certainly can tweak Docker images with environment variables, command-line options, layered images, Docker volumes with configuration data or custom scripting, but all of those options come at the cost of additional complexity. Without a robust domain-specific language and ongoing convergence to a defined state, using Docker features to support dynamic configurations often leads to the very problems that dedicated configuration management tools like Puppet were designed to solve.

Why focus on just Docker and Puppet, when there are other tools in that space? Although there certainly are alternatives, when one thinks of containerization, Docker is far and away the current mindshare leader. The race is a little tighter in the configuration management space, but Puppet is definitely a strong enterprise-class solution that has been widely deployed for more than a decade. I focus on Docker and Puppet for the remainder of this article in order to avoid gross generalizations that may not apply equally to every containerization or configuration management tool currently available.

Use Cases for Integrating Docker and Puppet

If you're deploying services into a Docker-based infrastructure, why add Puppet to the toolchain? After all, one of the most common uses of Docker is to deploy preconfigured services out to the cloud. If the services are preconfigured, how would using a configuration management tool like Puppet improve the process?

There are three core use cases to consider. The first is using Puppet to provision the Docker service on a host, so that it is available to manage Docker instances. The second is provisioning specific Docker instances, such as a containerized web service, onto a managed host. The third is managing complex or dynamic configurations inside Docker containers using Puppet agents baked into the Docker image. In this article, I address only the first use case, but I plan to address the others in future articles.

Provisioning Docker with Puppet

Docker is a great tool, but it isn't installed by default on most Linux distributions. When your cloud or data center has tens of thousands of nodes running Linux, how do you install the Docker dæmon on only the nodes that need it?

In a homogeneous server farm, you might use a template or image that already has Docker on it, and whatever process spins up the node will ensure that Docker is available for running containers. However, in a more heterogeneous environment, only some of the nodes will be hosting Docker containers, and without a configuration management system, you can't easily add or remove Docker from existing hosts in a well-controlled, automated way. Managing packages, files and services on existing nodes is what Puppet was made for!

Imagine that you have an Ubuntu server with no special characteristics or running services other than SSH. As a best practice, this server already should have the Puppet agent installed in order to avoid yet another bootstrap problem, but as long as you already have a Puppet master in your environment, placing this new node under configuration management is easy.

The specifics of doing this may vary slightly by Linux distribution, Puppet version and whether you are using Puppet open source or Puppet Enterprise. Although the commands may vary, the steps will be very similar.

First, you'll configure some basic items on the server. Second, you'll install the Puppet agent, register the client with the server and kick off the provisioning process.

Server-Side Settings

Automating Which Nodes Get Docker Installed

At scale, it can be very helpful to apply certain configurations automatically to nodes based on the node's hostname. To see this in action, create a Puppet environment named “example”:

sudo mkdir -p \
    /etc/puppet/code/environments/example/manifests

sudo touch \
    /etc/puppet/code/environments/example/manifests/site.pp

Then assign the docker class to any Puppet client that has “docker” in its hostname:

# /etc/puppet/code/environments/example/manifests/site.pp

# Use a regular expression to assign the docker
# class to any node that contains "docker" in its
# hostname.
node /docker/ {
  include docker
}
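
Because the pattern in the node definition is unanchored, it matches “docker” anywhere in the node name, not only at the start. As a rough illustration, the following shell sketch applies the same pattern to a few names. The hostnames and the matches_docker_node helper are hypothetical, and grep's regular expressions merely stand in for Puppet's here:

```shell
# Hypothetical helper: succeeds if the name matches the
# unanchored pattern used in "node /docker/".
matches_docker_node() {
  printf '%s\n' "$1" | grep -Eq 'docker'
}

# Hypothetical hostnames, for illustration only.
for name in docker-host-001.example.com \
            web01.docker-hosts.localdomain \
            db01.example.com; do
  if matches_docker_node "$name"; then
    echo "$name: docker class assigned"
  else
    echo "$name: no match"
  fi
done
```

If that is broader than your naming scheme needs, an anchored pattern such as /^docker-/ narrows the match.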

Autosigning Client Certificates

By default, Puppet operates in client/server mode and uses SSL certificates on both the client and server sides for authentication as well as transport security. If your Puppet master is not on the same machine as the client, you can make your life easier (but a little less secure) by allowing Puppet to sign client certificates from whitelisted hosts automatically. For example, depending on your naming conventions for subdomains and hostnames, one or more of the following entries could be adapted to whitelist Docker nodes:

# /etc/puppet/autosign.conf

docker-host-001.example.com
*.docker-hosts.localdomain
*.local
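
To get a feel for which certificate names such entries would cover, the sketch below approximates the matching with shell glob patterns. It is only an approximation, Puppet's own matcher is authoritative, and the would_autosign helper and the hostnames being tested are hypothetical:

```shell
# Hypothetical helper: approximate autosign.conf matching
# with shell globs. Succeeds if the certname is covered.
would_autosign() {
  certname=$1
  for pattern in 'docker-host-001.example.com' \
                 '*.docker-hosts.localdomain' \
                 '*.local'; do
    # An unquoted $pattern is treated as a glob by case.
    case "$certname" in
      $pattern) return 0 ;;
    esac
  done
  return 1
}

# Hypothetical certificate names, for illustration only.
for host in docker-host-001.example.com \
            node7.docker-hosts.localdomain \
            laptop.local \
            docker-host-002.example.com; do
  if would_autosign "$host"; then
    echo "$host: autosigned"
  else
    echo "$host: needs manual signing"
  fi
done
```

The broad *.local entry deserves the most caution, because it trusts every host that presents a name in that domain.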

Autosigning client certificates reduces security in exchange for a dramatic boost in convenience and productivity. Within a secure network, this often is worth the modest risk. However, you also can configure more secure policy-based autosigning with Puppet, but doing so is well outside the scope of this article.

In the short term, if you prefer not to use autosigning, you can pass the --waitforcert option (which takes a polling interval in seconds) to the Puppet agent and then manually approve unsigned client certificates on the Puppet master while the clients wait. For small numbers of servers, the manual nature of this process is offset by the fact that it has to be done only once per client. As a process, however, it scales poorly. Policy-based autosigning is definitely the right way to go for the security-conscious enterprise.
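
As a sketch of that manual workflow using the Puppet 4 command set, the agent polls for its certificate while an administrator lists and signs pending requests on the master. The run wrapper, the 60-second polling interval and the certificate name are assumptions for illustration, and the wrapper only prints each command unless DRY_RUN=0 is set. Newer Puppet releases replace puppet cert with puppetserver ca:

```shell
# Print each command instead of running it unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# On the client: request a certificate and poll every 60
# seconds (an assumed interval) until it has been signed.
request_cert() {
  run sudo puppet agent --test --waitforcert 60
}

# On the master: list pending requests, then sign one.
sign_cert() {
  run sudo puppet cert list
  run sudo puppet cert sign "$1"
}

request_cert
# Hypothetical certificate name, for illustration only.
sign_cert docker-host-001.example.com
```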

Defining a Docker Manifest

Now Puppet is ready to do its real job: using a declarative domain-specific language to place each node into a known state. Create the following manifest for Ubuntu to tell your designated Docker nodes how to install and start the Docker service:

# /etc/puppet/code/environments/example/manifests/docker.pp

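class docker {
  # The Ubuntu package is named docker.io, but the service is
  # still docker. The -> arrow makes Puppet install the package
  # before it manages the service.
  package { 'docker':
    ensure => present,
    name   => 'docker.io',
  } ->
  service { 'docker':
    ensure => running,
  }
}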

Note that on recent Ubuntu versions, the Docker package was renamed docker.io to avoid conflicts with an unrelated package that is also named docker, but the service script and process name are still docker. This is potentially confusing, but the Puppet manifest above handles this situation with ease. The manifest should be tweaked for other distributions or replaced with a suitable module from the Puppet Forge that selects the appropriate package name based on the client's distribution and OS version.
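
Before placing a manifest like this one on the master, it can be worth checking its syntax. The sketch below writes the class to a scratch file and runs puppet parser validate on it when the puppet command is available. The scratch file is an assumption, made so that the example does not touch a live environment:

```shell
# Write the class to a scratch file, not a live environment.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
class docker {
  package { 'docker':
    ensure => present,
    name   => 'docker.io',
  } ->
  service { 'docker':
    ensure => running,
  }
}
EOF

# Syntax check only; nothing is applied to the system.
if command -v puppet >/dev/null 2>&1; then
  puppet parser validate "$manifest"
else
  echo "puppet not found; skipping syntax check"
fi
```

A clean syntax check says nothing about whether the package or service names are right for a given distribution, so it complements a real agent run rather than replacing it.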

Client-Side Settings

Remember, you're managing the Puppet agent manually here because it's not yet part of your default OS installation. Once you have a configured Puppet master, and the Puppet agent is baked into your default OS installation process, the client installation (and in many cases the certificate management process) will be automatic.

Installing the Puppet Agent

You've configured the basic elements of the Puppet server to support Docker. Now you need to configure the client. First, install the Puppet agent:

sudo apt-get --assume-yes install puppet-agent

Next, you have to assign the agent to a Puppet environment. By default, all Puppet agents are assigned to production, but it's a best practice to perform these sorts of experiments within a dedicated environment that won't impact your real production systems. There are ways to set the client's environment server-side, but they are out of scope for this exercise. Instead, you'll use the agent's configuration file and command-line options to ensure that you're using manifests defined for a dedicated, non-production environment:

# Append an [agent] section that sets the agent's
# environment. Editing puppet.conf requires root.
sudo sed -i '$a\\n[agent]\nenvironment = example' \
    /etc/puppet/puppet.conf

If the command above worked properly, the puppet.conf file should now look similar to this:

# /etc/puppet/puppet.conf

[main]
ssldir = /var/lib/puppet/ssl

[master]
vardir = /var/lib/puppet
cadir  = /var/lib/puppet/ssl/ca
dns_alt_names = puppet

[agent]
environment = example
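
Note that appending with sed adds a second [agent] section if the command is run twice. A slightly more defensive sketch, built around a hypothetical set_agent_environment helper, appends the section only when the file has no [agent] header yet. It is demonstrated here against a scratch file, not the live configuration:

```shell
# Hypothetical helper: append an [agent] section that sets
# the environment, unless the file already has one.
# Usage: set_agent_environment <conf-file> <environment>
set_agent_environment() {
  conf=$1
  env_name=$2
  if grep -q '^\[agent\]' "$conf"; then
    echo "[agent] already present in $conf; edit it by hand" >&2
    return 0
  fi
  printf '\n[agent]\nenvironment = %s\n' "$env_name" >> "$conf"
}

# Demonstrate against a scratch copy with sample contents.
scratch=$(mktemp)
printf '[main]\nssldir = /var/lib/puppet/ssl\n' > "$scratch"
set_agent_environment "$scratch" example
set_agent_environment "$scratch" example  # no second section
cat "$scratch"
```

To change the real file, point the helper at /etc/puppet/puppet.conf and run it as root.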

Once the Puppet agent is installed, and assuming the existence of a Puppet master named puppet that has been configured to autosign client certificates, this system will install Docker on new nodes automagically any time the hostname matches the regular expression defined in the site.pp node list. Simply ensure that each node that should run Docker contains “docker” in its hostname and you're done!

However, it's often wise to kick off the first agent run manually, especially so that you can spot any problems with server connectivity or SSL certificates quickly. You already defined a Puppet environment named “example”, and the agent will attempt to contact a server named “puppet” unless directed otherwise by puppet.conf or on the command line. The following commands show two ways to start that run: the first relies on puppet.conf and the built-in defaults, and the second names both the server and the environment explicitly, overriding any configuration file settings or defaults:

# Running agent with puppet.conf and/or default
# values.
sudo puppet agent --test

# Overriding the server and environment values.
sudo puppet agent \
    --test \
    --server ubuntu-yakkety.localdomain \
    --environment example
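
One detail to keep in mind when scripting these runs: the --test option turns on detailed exit codes, so a non-zero status does not always mean failure. A status of 2, for instance, means the run succeeded and applied changes. The describe_agent_exit helper below is a hypothetical sketch that translates those statuses into short messages:

```shell
# Hypothetical helper: translate the exit status of
# "puppet agent --test" into a short message.
describe_agent_exit() {
  case "$1" in
    0) echo "no changes needed" ;;
    2) echo "changes applied" ;;
    4) echo "some resources failed" ;;
    6) echo "changes applied, some resources failed" ;;
    *) echo "run failed" ;;
  esac
}

# Show the message for each documented status.
for status in 0 1 2 4 6; do
  echo "$status: $(describe_agent_exit "$status")"
done
```

In a provisioning script, you might call it right after the agent run with the value of $? so that a first run that installs Docker is not reported as an error.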

Once the Puppet agent has completed its run, you can quickly validate that the Docker service is running properly. For example:

# Show verbose status of the Docker service.
$ sudo service docker status

# Count of running Docker processes.
$ pgrep -c docker
1

Putting It All in Context

At this point, you may be thinking that all this Puppet stuff seems like a lot more work than simply running apt-get -y install docker.io on each new Linux machine in your data center. In the short term, using Puppet to install Docker certainly requires more work up front. However, this will really pay off in the long term when you have large numbers of nodes to manage and you're attempting to provision them in a fully automated way. In addition, don't overlook the value of how easily Puppet can automate critical patches related to Docker, or the assurance that Puppet will enforce (and report any changes to) the expected status of the Docker service every time the Puppet agent runs.

In my next article, I'll expand on Docker/Puppet integration further to install, start or stop Docker containers across the data center based on centrally managed roles and profiles. If you aren't already convinced that Docker and Puppet make a powerful one-two combination, you won't want to miss the next installment.

Todd A. Jacobs is a frequent contributor to Linux Journal, a Stack Exchange enthusiast, and a practicing DevOps and Automation Architect with a special focus on automated security. He currently lives in Baltimore with his beautiful wife, toddler-aged son, and two geriatric but lovable dogs.