homelab automation with pyinfra

During the Christmas break I crawled into the storage / datacenter / Harry Potter-esque bedroom under my stairs and retrieved an Intel NUC that had been powered off for a year++, added an extra SSD, and installed Arch Linux to set up a small lab for learning new things and brushing up on forgotten skills.

As I do love breaking my lab equipment, I also needed to automate most of my configuration to make it easier to return to a known good state. But with which tool?

Ansible? No. I have written enough playbooks in my career.
Chef? No. Not a fan of the master-client architecture, or their Ruby DSL.
Puppet? No. Again, not a fan of the master-client architecture, and too complicated for what I need.
SaltStack? Never been a fan, and I'm not going to touch anything owned by Broadcom unless there is (a lot of) money involved.
CFEngine? It's still alive? Wow! But no. I do not want to re-learn that thing.

So what did I end up using?

After browsing half the internet for various alternatives I decided to go with pyinfra. It's open source, easy and uncomplicated, agentless, and I like Python. After playing around with it a bit and writing my first deployments I understand their tagline:

Think ansible but Python instead of YAML

And that is a statement I can get behind.

Writing pyinfra deployments

pyinfra is essentially just a Python package, which means we write our configuration as code in Python and get easy access to all of the useful libraries Python has available.

After spending many years with various domain-specific languages and working around their quirks I have grown to appreciate it when I can use a "real" programming language. If we need to query information from a REST API we can do that with requests, if we want to massage a lot of data we can use pandas, and when we need control structures like loops or conditionals there are (almost) no weird limitations or constraints we need to be aware of.
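
To make the "it's just Python" point concrete, here is a minimal sketch of a deploy that uses an ordinary loop and pathlib to build arguments for pyinfra's files.directory operation. The service names, the /srv/lab path, and the helper functions directory_specs and apply_directories are all made up for illustration; they are not taken from my repository.

```python
from pathlib import PurePosixPath

# Hypothetical values for illustration only.
LAB_ROOT = PurePosixPath("/srv/lab")
SERVICES = ["grafana", "prometheus", "gitea"]


def directory_specs(root, services):
    """Build one dict of files.directory keyword arguments per service."""
    return [
        {
            "name": f"Ensure data directory for {service}",
            "path": str(root / service),
            "mode": "750",
            "present": True,
        }
        for service in services
    ]


def apply_directories(specs):
    """Register one files.directory operation per spec."""
    # Imported lazily so the planning logic above can be exercised
    # without pyinfra installed.
    from pyinfra.operations import files

    for spec in specs:
        files.directory(**spec)


# In a deploy file run by the pyinfra CLI, finish with:
#   apply_directories(directory_specs(LAB_ROOT, SERVICES))
```

Keeping the plain-Python part (building the list) separate from the part that calls pyinfra is a design choice, not a requirement. It just makes the logic easy to check in a normal Python shell before pointing it at a real host.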

The docs have a nice little "Getting Started" that shows both how to run ad-hoc commands and more stateful deployments.

Frustrations

Of course, no tool is perfect, and in my very first steps I have had a few frustrations:

The ecosystem isn't very mature. The examples are pretty bare, and the number of re-usable modules/packages made by others is minimal. So I end up writing everything myself the way I want it.
Supporting multiple Linux distributions is tedious. For example, Ansible has ansible.builtin.package as a generic overlay, but in pyinfra you need to use pacman.packages for Arch Linux, apt.packages for Debian/Ubuntu, brew.packages for Homebrew on macOS, and so on. Writing conditional logic to first get OS facts and then pick the correct package manager isn't very fun.
Order of operations when using conditionals can give you some surprises. Reading "How pyinfra Works" is highly recommended.
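
The package manager dance from the second point can be sketched roughly like this. It assumes the LinuxName fact returns a string containing "Arch", "Debian" or "Ubuntu" on the respective hosts, which is worth verifying against your own machines. The PACKAGE_MODULES mapping and the helpers package_module_for and install are hypothetical names for this example.

```python
from importlib import import_module

# Assumed mapping: substring of the distribution name -> name of the
# pyinfra operations module that manages packages on that distribution.
PACKAGE_MODULES = {
    "arch": "pacman",
    "debian": "apt",
    "ubuntu": "apt",
}


def package_module_for(distro_name):
    """Pick the operations module name for a distribution name."""
    lowered = distro_name.lower()
    for needle, module_name in PACKAGE_MODULES.items():
        if needle in lowered:
            return module_name
    raise ValueError(f"No package manager mapping for {distro_name!r}")


def install(packages):
    """Install packages with the operation matching the target host."""
    # Imported lazily: these only resolve when pyinfra is installed
    # and the function runs inside a deploy.
    from pyinfra import host
    from pyinfra.facts.server import LinuxName

    module_name = package_module_for(host.get_fact(LinuxName) or "")
    operations = import_module(f"pyinfra.operations.{module_name}")
    operations.packages(
        name=f"Install {', '.join(packages)} via {module_name}",
        packages=packages,
    )
```

A deploy would then call something like install(["git", "tmux"]) and let the fact decide. Package names still differ between distributions, so this only hides the choice of package manager, not the naming differences.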

I have published my (very tiny) collection of examples at github.com/sjovang/pyinfra-homelab-linux-config and will update the repository with more examples (and, if I ever get around to it or need it, support for more Linux distros too). Feel free to use them as inspiration or in your own projects =)