Reliably host your own PKG repository for FreeBSD

As promised, this is part 2 in a short series centered around using FreeBSD on Raspberry Pi and similar small systems. The first part explained how to go about building your own set of up-to-date, customized binary packages on your big x86 PC. In this part I’ll show you one possible option for hosting these precious packages online, so your Pi can easily access them. There are other options, some cheaper than what I’ll be showing here, but my setup is hard to beat on reliability.

So, what are we dealing with here? I used Poudriere, FreeBSD's own build framework for binary packages, to create a bunch of packages for use by my fleet of Raspberry Pi computers. While it's possible to scp the packages to each Pi and install them from there, that quickly becomes tedious when dealing with many Pis, and you lose the automatic updates that come with running a real online PKG repository. Fortunately, all you need for one of those is a bog-standard web server serving the collection of files Poudriere produces.

Enter Amazon Web Services S3

I chose Amazon's S3 storage platform to host my PKG mirror. I have a bunch of reasons for choosing this over self-hosting:

  • I won’t be using terabytes of storage, making S3 an affordable proposition.
  • I’m an AWS-certified solutions architect during the day, so I know this particular neck of the woods better than I do other public clouds.
  • S3 is *massively* more reliable than anything I could hope to build myself at home.
  • S3 buckets can easily be exposed as HTTP services, precisely what we'll need to host flat files like a PKG repo.
  • Copying a directory tree over to S3 is a problem easily solved using off-the-shelf tools from FreeBSD's ports collection.

So while I'd be perfectly capable of popping an Nginx-based web server into existence on something like my NAS, it's simply not worth the effort. If I ever decide to publish my repo mirror to the world, I'll also feel much more comfortable security-wise having it "up there" in the cloud on its own, instead of inside my home network. What you'll need to do is create an S3 bucket in your account and enable website hosting on it. Most of that is very much self-explanatory and I won't waste this blog on walking you through the steps.
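For the record, both steps can also be done from the AWS CLI (which we'll install from ports in a moment). A minimal sketch, with your-bucket-name and the eu-west-1 region as placeholder values:

# Create the bucket (LocationConstraint is required outside us-east-1)
aws s3api create-bucket --bucket your-bucket-name --region eu-west-1 \
    --create-bucket-configuration LocationConstraint=eu-west-1
# Enable static website hosting on it
aws s3 website s3://your-bucket-name/ --index-document index.html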

Keeping your packages in sync

You'll need to upload the packages directory to S3. In my case, that's /usr/local/poudriere/data/packages, to be exact. Uploading it initially is simple: there won't be anything in S3 yet, so using the AWS CLI from devel/awscli we can upload everything from our shell prompt (and from cron jobs later on). For this specific purpose I created a separate user in AWS IAM with its permissions restricted so that it can only manage the contents of our S3 bucket. Once again, AWS has plenty of excellent docs on how to do this, so I'm skipping over the details. The relevant bit is that this user will have an AWS API key and accompanying secret. You'll need to enter these bits of information into a text file located at /root/.aws/credentials. Don't forget to chmod this file 0400 and make sure it's owned by root.

The contents of the file look something like this:

[default]
aws_access_key_id=XXXXXXXXXX
aws_secret_access_key=YYYYYYYY
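As for the IAM user's permissions: for illustration, a minimal sketch of a suitably restricted policy could look like the one below. The bucket name is a placeholder, and s3:GetObject is included mainly for convenience; trim to taste.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::your-bucket-name"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    }
  ]
}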

Enter the relevant bits for your setup at the X'es and Y's in the credentials file. If you did everything correctly, your root user will now be able to send files to S3. To upload your packages directory, execute the following command:

/usr/local/bin/aws s3 sync /usr/local/poudriere/data/packages/ s3://your-bucket-name
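If you'd rather see what would be transferred before actually committing to it, aws s3 sync also accepts a --dryrun flag:

/usr/local/bin/aws s3 sync /usr/local/poudriere/data/packages/ s3://your-bucket-name --dryrun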

The sync command should pick up the credentials file and start uploading your stuff to S3. Guess what: that's essentially all there is to it. Sadly, though, life isn't this simple if you care about actually keeping things up-to-date. Unless you're looking forward to performing the above chore by hand every day, you'll want a way to automate it. I devised a few scripts for that which run from cron. The overall process works like this:

  1. Have Poudriere build a fresh set of packages.
  2. Drop a file named SYNCME into the package directory.
  3. Have a cron task periodically (every ten minutes, in my case) check for the existence of the SYNCME (and SYNCING) files. If SYNCME isn't there, exit without doing anything. If it is there, start the repository-sync script (see the sketch after this list).
  4. Have the repository-sync script check for SYNCME again, along with a file named SYNCING. If SYNCME is there but SYNCING isn't, swap them around so that we're left with only SYNCING.
  5. Perform the S3 sync task as described above, but with the --delete and --quiet flags added, so that old files get cleaned up from S3 and the cron logs won't be spammed too much.
  6. Delete the SYNCING file to indicate the process is done.
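To give you an idea before grabbing the real thing, here's a minimal sketch of that logic condensed into a single script. The paths and bucket name are assumptions, and my actual scripts (linked below) split this across a cron wrapper and the sync script itself:

#!/bin/sh
# Sketch of the SYNCME/SYNCING hand-off described above.
PKGDIR="/usr/local/poudriere/data/packages"
BUCKET="s3://your-bucket-name"

# A sync is already in progress: leave it alone.
[ -f "${PKGDIR}/SYNCING" ] && exit 0
# No sync was requested: exit without doing anything.
[ -f "${PKGDIR}/SYNCME" ] || exit 0

# Swap SYNCME for SYNCING so that only SYNCING remains.
mv "${PKGDIR}/SYNCME" "${PKGDIR}/SYNCING"

# Mirror the package tree to S3, quietly cleaning up old files.
/usr/local/bin/aws s3 sync "${PKGDIR}/" "${BUCKET}" --delete --quiet

# Signal that the process is done.
rm "${PKGDIR}/SYNCING"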

The actual set of scripts is available from here. I'm in the annoying habit of dropping admin scripts in /mgt/bin, so either change the paths as appropriate or follow my habit. I'm running /mgt/bin/sync_cron.sh as root every ten minutes, with a line like this in root's crontab:

*/10 * * * * /mgt/bin/sync_cron.sh

If you'd like more extensive logging, you can use /usr/bin/logger to write to syslog at any point in the scripts.
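For instance, a line like this anywhere in the scripts (the pkgsync tag is an arbitrary pick) will show up in /var/log/messages:

/usr/bin/logger -t pkgsync "package repository sync completed"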

Using your fresh new mirror

On any FreeBSD system that will be using your new mirror, create a file such as the following at /usr/local/etc/pkg/repos/yourrepo.conf:

MyRepo: {
  url: "http://your-bucket-url/jailname/"
  mirror_type: "http"
  enabled: yes
  priority: 10
}

This enables your repository in the most basic sense imaginable, and it should actually function. Because the repository's priority is set higher than the default FreeBSD repository's, packages from your repository will be preferred over those from upstream. Go ahead and test it, but don't use it like this in production! As it stands, there is no protection against manipulation of your packages in transit. Sadly, S3's website endpoints don't support HTTPS. While that would be nice to have, we can instead use the PKG framework itself to cryptographically guard against manipulation. This entails generating an RSA key pair and using it to sign the repository metadata. We'll use OpenSSL on our build system to generate the necessary keys.

openssl genrsa 2048 > /usr/local/etc/poudriere.key
chmod 0400 /usr/local/etc/poudriere.key
chown root:wheel /usr/local/etc/poudriere.key

These commands generate a private key and reasonably protect it from access by unauthorized users. The private key is a highly confidential artifact: anyone who can read this file can forge your packages' signatures, essentially rendering the whole mechanism void. Make sure you protect your key properly. Now generate the public counterpart for use on your target systems:

openssl rsa -in /usr/local/etc/poudriere.key -pubout -out /usr/local/etc/poudriere.pub

The resulting file is a so-called public key. This part doesn’t need to be treated as confidential. In order to actually have Poudriere sign your repository, add the following line to your poudriere.conf:

PKG_REPO_SIGNING_KEY=/usr/local/etc/poudriere.key

The next Poudriere build after this change will produce repository metadata that is cryptographically signed and thus protected from undetected tampering. You just need to update the repository definitions on your consuming systems so they use the public key to verify the signatures. This is done by uploading the public key to each of these systems and changing the repository configuration file accordingly. In our case, /usr/local/etc/pkg/repos/yourrepo.conf turns into this:

MyRepo: {
  url: "http://your-bucket-url/jailname/"
  mirror_type: "http"
  signature_type: "pubkey"
  pubkey: "/usr/local/etc/poudriere.pub"
  enabled: yes
  priority: 10
}
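With the public key uploaded and the configuration updated, a quick test from one of the Pis could look like this; somepackage is a stand-in for any package you actually built:

pkg update -r MyRepo
pkg install -r MyRepo somepackage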

From now on, whenever you attempt to install a package from your repository, PKG will complain loudly if a signature doesn't match. The cryptography involved is of comparable strength to what protects HTTPS from eavesdropping, so I don't feel all that bad about omitting HTTPS on this particular service. If you do want it, have a look at deploying AWS CloudFront in front of your S3 bucket.
