Managing Docker Instances with Puppet

Todd A. Jacobs

Issue #274, February 2017

Docker provides a powerful tool for creating lightweight images and containerized services, while Puppet provides the means to deploy and manage those same images and containers as a standard part of the configuration management lifecycle. Whether you're working in the cloud or the data center, this one-two punch is a real knockout!

In this in-depth article, you'll learn how to use Puppet roles and profiles to assign Docker images and containers to an unlimited number of nodes based on standardized naming conventions. If you're not careful, you also might learn a few tips and tricks about Vagrant, Linux hostnames and SSH along the way.

My previous article, “Provisioning Docker with Puppet” in the December 2016 issue, covered one of the ways you can install the Docker service onto a new system with Puppet. By contrast, this article focuses on how to manage Docker images and containers with Puppet.

Container management with Puppet allows you to do a number of things that become ever more important as an organization scales up its systems, including the following:

  1. Leveraging the organization's existing configuration management framework, rather than using a completely separate process just to manage Docker containers.

  2. Treating Docker containers as “just another resource” to converge in the configuration management package/file/service lifecycle.

  3. Installing Docker containers automatically based on hostname, node classification or node-specific facts.

  4. Orchestrating commands inside Docker containers on multiple hosts.

Although there certainly are other ways to achieve those goals (see the Picking a Toolchain sidebar), it takes very little work to extend your existing Puppet infrastructure to handle containers as part of a node's role or profile. That's the focus for this article.

Creating a Test Environment

To follow along with the code listings and examples in the remainder of this article, ensure that Vagrant and VirtualBox are already installed. Next, you'll prepare a set of provisioning scripts to configure a test environment on an Ubuntu virtual machine.

Preparing Your Provisioning Scripts

Create a directory to work in, such as ~/Documents/puppet-docker. Place the Vagrantfile and docker.pp manifest within this directory (see Listings 1 and 2).

The Vagrantfile is a Ruby-based configuration file that Vagrant uses to drive one or more “providers”. Vagrant supports VirtualBox, Hyper-V and Docker by default, but it also supports many other providers, such as VMware Fusion, DigitalOcean, Amazon AWS and more. Because the goal is to simulate the management of the Docker dæmon, images and containers on a full-fledged OS, let's focus on the cross-platform VirtualBox provider.

Note that this particular Vagrantfile (Listing 1) installs Puppet 3.8.5, which is the currently supported version for Ubuntu 16.04.1 LTS. Different versions are available as Puppet Enterprise packages or Ruby gems, but this article focuses on the version provided by Ubuntu for its current long-term support release.
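
Listing 1 is not reproduced here, so the following is only a minimal sketch of what such a Vagrantfile might contain, written out with a shell here-document so it can be pasted straight into the working directory. The box name (ubuntu/xenial64), the forwarded port pair and the apt-get provisioning line are all assumptions, chosen to be consistent with the hostname, user name and port that appear later in this article; adjust them to suit your own setup.

```shell
# Run from inside your working directory (e.g., ~/Documents/puppet-docker).
# ASSUMPTIONS: the box name, port numbers and inline provisioner are
# illustrative guesses, not a copy of Listing 1.
cat > Vagrantfile <<'EOF'
# -*- mode: ruby -*-
Vagrant.configure("2") do |config|
  # Ubuntu 16.04 LTS guest.
  config.vm.box = "ubuntu/xenial64"

  # Expose the guest's port 80 as port 8080 on the host.
  config.vm.network "forwarded_port", guest: 80, host: 8080

  # Install the distribution's Puppet package inside the guest.
  config.vm.provision "shell",
    inline: "apt-get update && apt-get install --yes puppet"
end
EOF
```

Installing Puppet from the distribution's own repository is what ties the guest to the Puppet version that Ubuntu ships, rather than to a newer upstream release.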

The docker.pp file is a Puppet manifest that uses a declarative syntax to bring a node into a defined state. This manifest (Listing 2) makes use of an officially supported Puppet Forge module that takes care of a great deal of low-level work for you, making the installation and management of the Docker dæmon, images and containers easier than rolling your own.

On some systems, a Puppet manifest can enable a basic Docker setup with only a simple include 'docker' statement. However, for this article, you will override specific settings, such as your user name on the guest OS, so that the right user is added to the group that can communicate with the Docker dæmon. You also will override the name of the Docker package to install, as you want to use the Ubuntu-specific “docker.io” package rather than the upstream “docker-engine” package the Puppet module uses by default.
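
As a rough illustration of those overrides, the sketch below writes a small docker.pp into the current directory. It assumes the Forge module exposes package_name and docker_users parameters on its docker class, along with the docker::image and docker::run resources named in this article's conclusion. The httpd image name, the “apache” container title, the “ubuntu” user and the 80:80 port mapping are hypothetical choices, not values taken from Listing 2.

```shell
# Write a sketch of docker.pp next to the Vagrantfile.
# ASSUMPTIONS: parameter names, image name, user name and ports are
# illustrative; check them against the module version you install.
cat > docker.pp <<'EOF'
# Install Ubuntu's own Docker package and let the guest's login user
# talk to the Docker daemon.
class { 'docker':
  package_name => 'docker.io',
  docker_users => ['ubuntu'],
}

# Pull an Apache image.
docker::image { 'httpd':
  image_tag => 'latest',
}

# Run Apache in a container, publishing container port 80 on the guest.
docker::run { 'apache':
  image   => 'httpd',
  ports   => ['80:80'],
  require => Docker::Image['httpd'],
}
EOF
```

Publishing on guest port 80 only reaches your desktop if the virtual machine forwards that port to the host, so keep the two port settings in step with each other.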

When you place docker.pp in the same directory as the Vagrantfile, Vagrant makes the Puppet manifest available inside the virtual machine automagically through its synced folder feature, which mounts the project directory at /vagrant in the guest. As you will see shortly, this seemingly minor step can pay automation-friendly dividends when provisioning the guest OS.

Provisioning with Puppet Apply

With the Vagrantfile and docker.pp stored in your working directory, you're ready to launch and configure the test environment. Since this article is all about automation, let's go ahead and script those activities too.

Create the shell script shown in Listing 3 in the same directory as the Vagrantfile. You can name it anything you like, but a sensible name, such as vagrant_provisioning.sh, makes it clear what the script does.
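
Since Listing 3 is not reproduced here, the following is a hedged sketch of what such a provisioning script could look like. The module name garethr-docker is an assumption about which Forge module is meant; the ~/.puppet/modules path and the /vagrant mount point match the commands used later in this article. The run helper and the DRY_RUN variable are hypothetical conveniences that let you preview the plan without starting a virtual machine.

```shell
# vagrant_provisioning.sh: sketch only, not a copy of Listing 3.

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
# (The preview joins arguments with spaces, so inner quoting is lost.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# provision: boot the guest, install the Docker module as the login
# user, then apply the manifest from the synced folder.
# ASSUMPTION: "garethr-docker" is the Forge module being used.
provision() {
    run vagrant up &&
    run vagrant ssh -c 'puppet module install garethr-docker' &&
    run vagrant ssh -c 'sudo puppet apply --modulepath ~/.puppet/modules /vagrant/docker.pp'
}

# Fall back to a preview when Vagrant is not installed on this machine.
if command -v vagrant >/dev/null 2>&1; then
    provision
else
    echo "vagrant not found; showing the plan instead:" >&2
    DRY_RUN=1 provision
fi
```

Chaining the steps with && stops the script at the first failure, so a failed boot never leads to an attempt to apply the manifest.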

Make the script executable with chmod 755 vagrant_provisioning.sh, then run it with ./vagrant_provisioning.sh. This will start and configure the virtual machine, but it may take several minutes (and a great deal of screen output) before you're returned to the command prompt. Depending on the horsepower of your computer and the speed of your internet connection, you may want to go make yourself a cup of coffee at this point.

When you're back with coffee in hand, you may see a number of deprecation warnings caused by the Puppet Forge module, but those can be safely ignored for your purposes here. As long as the docker.pp manifest applies with warnings and not errors, you're ready to validate the configuration of both the guest OS and the Docker container you just provisioned.

Believe it or not, with the docker.pp manifest applied by puppet apply, you're already done! You now have a Docker container running Apache and serving up the default “It works!” document. You can test this easily from your top-level host with curl localhost:8080 or by visiting http://localhost:8080/ in your desktop browser.

Don't be fooled. Although you haven't really done anything yet that couldn't be done with a few lines at the command prompt, you've automated it in a consistent and repeatable way. Consistency and repeatability are the bedrock of automation and really can make magic once you extend the process with roles and profiles.

Controlling Docker with Puppet Roles and Profiles

It may seem like a lot of work to automate the configuration of a single machine. However, even when dealing with only a single machine, the consistency and repeatability of a managed configuration are a big win. In addition, this work lays the foundation for automating an unlimited number of machines, which is essential for scaling configuration management to hundreds or thousands of servers. Puppet makes this possible through the “roles and profiles” workflow.

In the Puppet world, roles and profiles are just special cases of Puppet manifests. They're a way to express the desired configuration through composition, where profiles are composed of component modules and then one or more profiles make up a role. Roles are then assigned to nodes dynamically or statically, often through a site.pp file or an External Node Classifier (ENC).

Let's walk through a simplified example of what a roles-and-profiles workflow looks like. First, you'll create a new manifest in the same directory as your Vagrantfile named roles_and_profiles.pp. Listing 4 shows a useful example.

Note that all the profiles, roles and nodes are placed into a single Puppet manifest. On a production system, those should all be separate manifests located in appropriate locations on the Puppet Master. Although this example is illustrative and extremely useful for working with masterless Puppet, be aware that a few rules are broken here for the sake of convenience.

Let me briefly discuss each section of the manifest. Profiles are the reusable building blocks of a well-organized Puppet environment. Each profile should have exactly one responsibility, although you can allow the profile to take optional arguments that make it more flexible. In this case, the Alpine profiles allow you to add or remove a given Docker image depending on the value of the $status variable you pass in as an argument.

A role is the “reason for being” that you're assigning to a node. A node can have more than one role at a time, but each role should describe a singular purpose regardless of how many component parts are needed to implement that purpose. In the wild, some common roles assigned to a node might include:

  • role::ruby_on_rails

  • role::jenkins_ci

  • role::monitored_host

  • role::bastion_host

Each role is composed of one or more profiles, which together describe the purpose or function of the node as a whole. For this example, you define the alpine34 role as the presence of the Docker dæmon with Alpine 3.4 and the absence of an Alpine 3.3 image, but you could just as easily have described a more complex role composed of profiles for NTP, SSH, Ruby on Rails, Java and a Splunk forwarder.

This separation of concerns is borrowed from object-oriented programming: you define nodes through composition in order to isolate the implementation details from the user-visible behavior. A less programmatic way to think of this is that profiles generally describe the features of a node, such as its packages, files or services, while roles describe the node's function within your data center.
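
To make the profile and role structure described above concrete, here is a hedged sketch of a roles_and_profiles.pp manifest, written out with a here-document. It is not Listing 4. The resource titles, the “ubuntu” user and the parameter names on the docker class and docker::image resource are assumptions about the Forge module, and the node definitions at the bottom anticipate the hostname matching explained in the paragraphs that follow.

```shell
# Write a sketch of roles_and_profiles.pp next to the Vagrantfile.
# ASSUMPTIONS: titles, user name and module parameters are illustrative.
# Resource ordering between profiles is left implicit for brevity.
cat > roles_and_profiles.pp <<'EOF'
# --- Profiles: one responsibility each ---------------------------
class profile::docker {
  class { 'docker':
    package_name => 'docker.io',
    docker_users => ['ubuntu'],
  }
}

class profile::alpine33 ($status = 'present') {
  docker::image { 'alpine_33':
    ensure    => $status,
    image     => 'alpine',
    image_tag => '3.3',
  }
}

class profile::alpine34 ($status = 'present') {
  docker::image { 'alpine_34':
    ensure    => $status,
    image     => 'alpine',
    image_tag => '3.4',
  }
}

# --- Roles: compose profiles into a purpose ----------------------
class role::alpine33 {
  include profile::docker
  class { 'profile::alpine33': status => 'present' }
  class { 'profile::alpine34': status => 'absent' }
}

class role::alpine34 {
  include profile::docker
  class { 'profile::alpine33': status => 'absent' }
  class { 'profile::alpine34': status => 'present' }
}

# --- Nodes: assign roles by hostname pattern ---------------------
node /alpine33/ {
  include role::alpine33
}

node /alpine34/ {
  include role::alpine34
}
EOF
```

Because each role declares both Alpine profiles, one as present and one as absent, switching a node from one role to the other removes the old image as well as installing the new one.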

Node definitions, which generally live in a Puppet Master's site.pp file or an external node classifier, are where roles are statically or dynamically assigned to each node. This is where the real scaling power of Puppet becomes obvious. In this example, you define two different types of nodes. Each node definition uses a string or regular expression that is matched against the hostname (or certname in a client/server configuration) to determine what roles should be applied to that node.

In the node section of the example manifest, you tell Puppet to assign role::alpine33 to any node that includes “alpine33” as part of its hostname. Likewise, any node that includes “alpine34” in the hostname gets role::alpine34 instead. Using pattern-matching in this way means that you could have any number of hosts in your data center, and each will pick up the correct configuration based on the hostname that it's been assigned. For example, say you have five hosts with the following names:

  1. foo-alpine33

  2. bar-alpine33

  3. baz-alpine33

  4. abc-alpine34

  5. xyz-alpine34

Then, the first three will pick up the Alpine 3.3 role when they contact the Puppet Master, and the last two will pick up the Alpine 3.4 role instead. This is almost magical in its simplicity. Let's see how this type of dynamic role assignment works in practice.
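
The matching itself is done by Puppet, but the idea can be mimicked in a few lines of shell. The sketch below is only a simulation: role_for_host is a hypothetical helper that uses glob patterns in place of Puppet's regular expressions, and it reports an error when no pattern matches rather than falling back to a default.

```shell
# role_for_host HOSTNAME: print the role that the two pattern-based
# node definitions would select, or fail if neither pattern matches.
role_for_host() {
    case "$1" in
        *alpine33*) echo "role::alpine33" ;;
        *alpine34*) echo "role::alpine34" ;;
        *)
            echo "no matching node definition for '$1'" >&2
            return 1
            ;;
    esac
}

# Classify the five example hosts listed above.
for host in foo-alpine33 bar-alpine33 baz-alpine33 \
            abc-alpine34 xyz-alpine34; do
    printf '%s -> %s\n' "$host" "$(role_for_host "$host")"
done
```

One difference worth keeping in mind: a case statement always takes the first matching branch, so it is safest to choose hostname patterns that cannot both match the same node.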

Dynamic Role Assignments

Assuming that you've already placed roles_and_profiles.pp into the directory containing your Vagrantfile, you're able to access the manifest within the Ubuntu virtual machine. Let's log in to the VM and test it out (Listing 5).

Next, run the roles_and_profiles.pp Puppet manifest to see what happens. Hint: it's going to fail, and then you're going to explore why that's a good thing. Here's what happens:

ubuntu@ubuntu-xenial:~$ sudo puppet apply \
    --modulepath ~/.puppet/modules \
    /vagrant/roles_and_profiles.pp
Error: Could not find default node or by name with 'ubuntu-xenial.localdomain, ubuntu-xenial' on node ubuntu-xenial.localdomain
Error: Could not find default node or by name with 'ubuntu-xenial.localdomain, ubuntu-xenial' on node ubuntu-xenial.localdomain

Why did the manifest fail to apply? There are two reasons. The first is that you did not define any nodes that matched the current hostname of “ubuntu-xenial”. The second is that you did not define a default node to be applied when no other match is found. Puppet allows you to define a default, but in many cases, it's better to raise an error than to get a configuration you weren't expecting.
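
The two names quoted in the error message hint at the order in which Puppet looks for a node definition. As I understand Puppet's documented behavior, it tries the full node name first, then progressively shorter names with the last dot-separated segment removed, and finally the default node. The hypothetical candidate_names helper below reproduces that search order in shell; it is a simulation for illustration and does not call Puppet.

```shell
# candidate_names NODE_NAME: print each name that would be tried,
# one per line, ending with the "default" fallback.
candidate_names() {
    name="$1"
    while :; do
        echo "$name"
        case "$name" in
            *.*) name="${name%.*}" ;;  # drop the last dotted segment
            *)   break ;;              # nothing left to strip
        esac
    done
    echo "default"
}

candidate_names "ubuntu-xenial.localdomain"
```

For the guest in this article, that prints ubuntu-xenial.localdomain, ubuntu-xenial and default, which lines up with the names reported in the error.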

In this test environment, you want to show that Puppet is able to assign roles dynamically based on the hostname of the node where the Puppet agent is running. With that in mind, let's modify the hostname of the Ubuntu guest to see how a site manifest can be used to configure large clusters of machines appropriately based solely on each machine's hostname.

In order to avoid errors with the sudo command, you actually need to change the hostname of your virtual machine in several places. In addition, the hostname reported by the PS1 prompt will not be updated until you start a new shell. The following commands, when run inside the Ubuntu guest, will make the necessary changes:

# Must be exported to use in sudo's environment.
export new_hostname="foo-alpine33"

# Preserve the environment or sudo will lose the
# exported variable. Also, we must explicitly
# execute on localhost rather than relying on
# whatever sudo thinks the current hostname is to
# avoid "sudo: unable to resolve host" errors.
sudo \
    --preserve-env \
    --host=localhost \
    -- \
    sed --in-place \
        "s/${HOSTNAME}/${new_hostname}/g" \
        /etc/hostname /etc/hosts
sudo \
    --preserve-env \
    --host=localhost \
    -- \
    hostname "$new_hostname"

# Replace the current shell in order to pick up the
# new hostname in the PS1 prompt.
exec "$SHELL"
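
If you would rather see what the sed substitution does before letting it edit system files in place, you can rehearse it on a scratch file. The sketch below uses a hypothetical preview_rename helper and a made-up two-line hosts file; the real /etc/hosts on your guest may contain additional entries.

```shell
# preview_rename OLD NEW FILE: print FILE with every OLD replaced by
# NEW, leaving the file itself untouched (no --in-place).
preview_rename() {
    sed "s/$1/$2/g" "$3"
}

# Rehearse on a throwaway file that resembles the guest's /etc/hosts.
sample="$(mktemp)"
printf '127.0.0.1 localhost\n127.0.1.1 ubuntu-xenial\n' > "$sample"
preview_rename "ubuntu-xenial" "foo-alpine33" "$sample"
rm -f "$sample"
```

Because the old hostname is used as a regular expression, a name containing characters such as dots would match more loosely than a literal comparison; that is harmless for the hostnames used in this article.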

Your prompt now should show that the hostname has changed. When you re-run the Puppet manifest, it will match the node list because you've defined a rule for hosts that include “alpine33” in the hostname. Puppet then will apply role::alpine33 for you, simply because the hostname matches the node definition! For example:

# Apply the manifest from inside the Ubuntu guest.
sudo puppet apply \
    --modulepath ~/.puppet/modules \
    /vagrant/roles_and_profiles.pp

# Verify that the role has been correctly applied.
docker images alpine

REPOSITORY   TAG       IMAGE ID        CREATED         SIZE
alpine       3.3       6c2aa2137d97    7 weeks ago     4.805MB

To apply this role to an entire cluster of machines, all you need to do is ensure they have hostnames that match your defined criteria. For example, say you have five hosts with the following names:

  1. foo-alpine33

  2. bar-alpine33

  3. baz-alpine33

  4. abc-alpine33

  5. xyz-alpine33

Then, the single node definition for /alpine33/ would apply to all of them, because the regular expression matches each of their hostnames. By assigning roles to patterns of hostnames, you can configure large segments of your data center simply by setting the proper hostnames! What could be easier?

Reassigning Roles at Runtime

Well, now you have a way to assign a role to thousands of boxes at a time. That's impressive all by itself, but the magic doesn't stop there. What if you need to reassign a system to a different role?

Imagine that you have a box with the Alpine 3.3 image installed, and you want to upgrade that box so it hosts the Alpine 3.4 image instead. In reality, hosting multiple images isn't a problem, and these images aren't mutually exclusive. However, it's illustrative to show how you can use Puppet to add, remove, update and replace images and containers.

Given the existing node definitions, all you need to do is update the hostname to include “alpine34” and let Puppet pick up the new role:

# Define a new hostname that includes "alpine34"
# instead of "alpine33".
export new_hostname="foo-alpine34"

sudo \
    --preserve-env \
    --host=localhost \
    -- \
    sed --in-place \
        "s/${HOSTNAME}/${new_hostname}/g" \
        /etc/hostname /etc/hosts
sudo \
    --preserve-env \
    --host=localhost \
    -- \
    hostname "$new_hostname"
exec "$SHELL"

# Rerun the manifest using the new node name.
sudo puppet apply \
    --modulepath ~/.puppet/modules \
    /vagrant/roles_and_profiles.pp

# Show the Alpine images installed.
docker images alpine

REPOSITORY   TAG       IMAGE ID        CREATED         SIZE
alpine       3.4       baa5d63471ea    7 weeks ago     4.803MB

As you can see from the output, Puppet has removed the Alpine 3.3 image, and installed Alpine 3.4 instead! How did this happen? Let's break it down into steps:

  1. You renamed the host to include the substring “alpine34” in the hostname.

  2. Puppet matched the substring using a regular expression in its node definition list.

  3. Puppet applied the Alpine 3.4 role (role::alpine34) assigned to nodes that matched the “alpine34” substring.

  4. The Alpine 3.4 role called its component profiles (which are actually parameterized classes) using “present” and “absent” arguments to declare the intended state of each image.

  5. Puppet applied the image management declarations inside the Alpine 3.3 and Alpine 3.4 profiles (profile::alpine33 and profile::alpine34, respectively) to install or remove each image.
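
The end result of those steps can also be checked mechanically rather than by eye. The sketch below parses the tabular output of docker images alpine; alpine_tags and only_tag are hypothetical helper names, and the sample input is copied from the output shown earlier so that the example does not need a running Docker dæmon.

```shell
# alpine_tags: read `docker images alpine` output on stdin and print
# the TAG column, skipping the header row.
alpine_tags() {
    awk 'NR > 1 { print $2 }'
}

# only_tag TAG: succeed if TAG is the one and only tag listed on stdin.
only_tag() {
    [ "$(alpine_tags)" = "$1" ]
}

# On the guest you would pipe in `docker images alpine` instead.
only_tag "3.4" <<'EOF' && echo "Alpine 3.4 is the only Alpine image"
REPOSITORY   TAG       IMAGE ID        CREATED         SIZE
alpine       3.4       baa5d63471ea    7 weeks ago     4.803MB
EOF
```

A check like this fails if the old image is still present alongside the new one, which makes it a reasonable smoke test after reassigning a role.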

Although hostname-based role assignment is just one of the many ways to manage the configuration of multiple systems, it's a very powerful one, and certainly one of the easiest to demonstrate. Puppet supports a large number of ways to specify what configurations should apply to a given host. The ability to configure systems dynamically based on discoverable criteria makes Puppet a wonderful complement to Docker's versioned images and containerization.

Conclusion

In this article, I took a close look at managing Docker images and containers with docker::image and docker::run, but the Puppet Docker module supports a lot more features that I didn't have room to cover this time around. Some of those additional features include:

  • Building images from a Dockerfile with the docker::image defined type.

  • Managing Docker networks with the docker::networks class.

  • Using Docker Compose with the docker::compose class.

  • Authenticating against private image registries using the docker::registry defined type.

  • Running arbitrary commands inside containers with the docker::exec defined type.

Taken together, this collection of features allows you to compose extremely powerful roles and profiles for managing Docker instances across infrastructure of almost any scale. In addition, by leveraging Puppet's declarative syntax and its ability to automate role assignment, it's possible to add, remove and modify your Docker instances on multiple hosts without having to manage each instance directly, which is typically a huge win in enterprise automation. And finally, the standardization and repeatability of Puppet-driven container management make systems more reliable when compared to hand-tuned, hand-crafted nodes that can “drift” from the ideal state over time.

In short, Docker provides a powerful tool for creating lightweight golden images and containerized services, while Puppet provides the means to orchestrate those images and containers in the cloud or data center. Like strawberries and chocolate, neither is “better” than the other; combine them though, and you get something greater than the sum of its parts.

Todd A. Jacobs is a frequent contributor to Linux Journal, a Stack Exchange enthusiast, and an industry leader in DevOps transformations that incorporate automated security and IT governance. He currently lives in Baltimore with his beautiful wife, toddler-aged son and two geriatric but lovable dogs.