Here is why Puppet won’t let us have nice things

So you’re building up your DevOps kung-fu like there’s no tomorrow using Puppet. More power to you, it’s what I spend plenty of my days doing as well. Sadly, though, for all the good that is Puppet it just won’t let me have my cake and eat it too. Here’s why.

Package management is a major disaster on just about any infrastructure. Now before you go off about how tools like Yum automate the heck out of that, let me stop you there for a bit. We’re talking Puppet-managed infrastructure here. The kind of place where you ensure => ‘latest’ your packages and then forget about them. Now while this works just admirably in simple cases, it’s not so pretty when you’re dealing with servers that get composed from a large number of classes. How so?

Let’s say you have something like Apache and PHP running on your FreeBSD server. Both of these will, at some point, want to pull in something like Perl 5. You can’t declare the Perl 5 package in both modules, because Puppet refuses to compile a catalog that declares the same resource twice. Enter ensure_packages() from stdlib. By creating all your package resources through this function, you avoid that duplicate declaration. Sadly, for this to work properly it’s imperative that *all* Puppet code uses this function to ensure its packages or things will still go boom.
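To make that concrete, here is a minimal sketch of two classes that both want Perl 5. The class names profile::apache and profile::php are made up for illustration, and the package names other than apache24 are placeholders, so adjust them to whatever your package repository actually calls them.

```puppet
# Hypothetical profile classes; 'php' and 'perl5' are illustrative package names.
#
# Declaring package { 'perl5': ensure => 'latest' } as a plain resource in
# both classes would fail compilation with a duplicate declaration error.
# Going through ensure_packages() with identical attributes avoids that.
class profile::apache {
  ensure_packages(['apache24', 'perl5'], { 'ensure' => 'latest' })
}

class profile::php {
  ensure_packages(['php', 'perl5'], { 'ensure' => 'latest' })
}

include profile::apache
include profile::php
```

This only works because both calls pass exactly the same attributes for perl5. As soon as one class asks for something different, the trouble starts.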

But then there’s dependencies. So Apache pulled in Perl 5. While you dutifully ensured the apache24 package ‘latest’, the Perl 5 package will be left to bit-rot in its sad little corner of the world. Puppet simply doesn’t care because it doesn’t even know about it. Perl 5 was the OS package manager’s idea, not Puppet’s. So the indifference even makes sense. Want to work around this? Manage *all* your dependencies using ensure_packages and be done with it. This will make sure that no package gets left behind with the next Puppet run.

So far, so good: we have practically automatic package management for our entire stack. Where this all breaks down, however, is upon removal of a class which has packages in common with another class that you actually do want on your server. How so? For the sake of argument let’s assume you have a standard server profile that includes Apache, PHP and some squeaky monitoring package written in Perl.

Each of them gets added by its own Puppet class. So far, so good: ensure_packages makes certain that only one instance of Perl 5 gets installed. However, imagine an exceptional case in which the monitoring package must *NOT* be present for whatever reason. Puppet, by default, completely forgets about the existence of any resource that simply gets removed from its catalog. This would leave the monitoring package installed but orphaned and unmanaged on the target server: less than optimal. The sensible thing to do would be to ensure its class as ‘absent’ in some Hiera-based exception, with the class itself taking care of the removal of all its resources. Sadly, that is where stuff breaks.

When you tell ensure_packages to absent a package, it will dutifully remove it, even when some other class uses ensure_packages to keep that same package at ‘latest’. Now this behaviour is largely dependent on the order in which resources end up in the catalog. The end result usually is a set of ‘flapping’ Puppet runs. One run removes the package, leaving a trail of breakage in its wake, which the next Puppet run attempts to recover. Rinse, repeat, deal with loads of breakage.
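Here is a rough sketch of how such contradictory requests come about. The class name profile::monitoring, its $ensure parameter and the package name monitoring-agent are all hypothetical; the point is only that one class asks for perl5 to be absent while others still ask for ‘latest’.

```puppet
# Hypothetical monitoring class with an on/off switch.
# A Hiera exception such as
#   profile::monitoring::ensure: absent
# would flip it to removal for a single node.
class profile::monitoring (
  Enum['present', 'absent'] $ensure = 'present',
) {
  # Map the class-level switch onto a package state.
  $package_ensure = $ensure ? {
    'absent' => 'absent',
    default  => 'latest',
  }

  # perl5 is listed here because the (made-up) agent is written in Perl.
  # With $ensure set to 'absent', this asks for perl5 to go away as well,
  # even though other classes on the same node still request 'latest'.
  ensure_packages(['monitoring-agent', 'perl5'], { 'ensure' => $package_ensure })
}
```

How exactly the contradiction surfaces may well depend on the stdlib version you run, so treat this as an illustration of the conflicting requests rather than a reproduction of one specific failure.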

Personally, I consider this entire issue an omission in Puppet’s model of a target configuration. The existence of the ensure_packages function in stdlib somewhat masks this, and a quick fix may be possible.

It would be a huge help if Puppet allowed its owner to specify the behaviour of the ensure_packages function in the face of conflicting instructions. By simply allowing the admin to specify that ‘present’ and ‘latest’ overrule ‘absent’, the behaviour would make a lot more sense (at least to me). We would have Apache and PHP requesting Perl 5 to be installed, and the monitoring package requesting it to be removed. Since there are still packages left that need Perl 5, their ‘vote’ should overrule the monitoring package’s class and leave the package in place. Only when all classes unanimously request that Perl 5 be absent should it actually be removed.
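The voting rule itself is easy enough to write down. Below is a sketch of it as a Puppet language function; the name profile::resolve_ensure and the idea of handing it a list of votes are my own invention, not anything stdlib offers. What Puppet lacks is the hard part: a way to collect those votes from every class that mentions the package.

```puppet
# Hypothetical helper, e.g. saved as profile/functions/resolve_ensure.pp.
# Takes one requested state per class and returns the state that wins.
function profile::resolve_ensure(
  Array[Enum['present', 'latest', 'absent'], 1] $votes,
) >> String {
  if 'latest' in $votes {
    # Any request for 'latest' wins outright.
    'latest'
  } elsif 'present' in $votes {
    # Otherwise a single 'present' still keeps the package installed.
    'present'
  } else {
    # Only a unanimous 'absent' removes the package.
    'absent'
  }
}

# Usage, in a manifest: Apache and PHP vote 'latest', monitoring votes 'absent'.
$perl_ensure = profile::resolve_ensure(['latest', 'latest', 'absent'])

package { 'perl5':
  ensure => $perl_ensure,  # resolves to 'latest', so Perl 5 stays put
}
```

In this sketch the votes are passed in by hand, which is exactly the bookkeeping you would want Puppet to do for you.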

However, at its core, Puppet should be expanded to allow users to model the actual dependencies that are present in their underlying OS anyway. As it currently stands, Puppet only partly solves the problem of package management and ensure => ‘latest’ creates more grief than it’s worth.
