How LEGO Taught Me About Buying Platform Software

By Max Baehr
June 2, 2023

This is a story about the most fundamental decision you make when solving problems with software: build vs. buy. It’s also a story about LEGO. It’s also a story about managing temporary elevated access. But mostly, it’s a story about buying software. Or building it. Or really, both.

About me

I’m a 90s kid with two older siblings, and none of us are close in age. What these two facts mean – among other, less relevant things – is that I grew up 1) during the first peak of LEGO rapidly expanding into themed sets, and 2) at the bottom of a cascade of two generations of accumulated bricks.

I built spaceships, medieval castles, underwater labs – it was all very exciting.

Later down the road (we’re skipping some steps here) my parents moved away, I left home, and all the bricks returned to stasis: boxes inside of boxes, thousands of tiny specimens (roughly) sorted by form, function, and color, stowed in an attic.

I kept two sets:

  • The LEGO City Street Sweeper
  • A little transforming gorilla/land-speeder thing that I made nearly 30 years ago

Build vs. buy

When solving problems with software, most decisions eventually reduce to some version of build vs. buy. Of course, that “choice” is usually more of a spectrum: you buy things, you configure what you can, you build to cover the gaps, and you integrate into the rest of your systems.

In the context of a platform team, this is [checks notes] the whole job. Or at least, a lot of it: discovery, requirements gathering, vendor search, and ultimately, the hunt for a sane, maintainable suite of solutions that set your org up for success both today and tomorrow.

Our co-founder Adam covered this topic a couple of months ago in his post on platform engineering as a startup.

LEGO captures this puzzle pretty neatly. You’re not likely to cast your own plastic bricks any more than you’re likely to write a language and a compiler (although in either case, if you are, um, congrats). Assuming the existence of both common and use-case-specific block types, your common software and LEGO puzzles both reduce to:

  • What am I trying to solve?
  • How special do the blocks need to be?
  • What comes packaged vs. a la carte?
  • How much work (and/or cost) is it for me to put together?

The street sweeper

Sometimes the only thing you need to do is sweep a street. Street sweepers are really good at that. They’re also good at precisely zero other things. I suppose you could ride one for transportation if you really wanted to, but I wouldn’t recommend it.

Going back to our LEGO City, you could also absolutely build other stuff out of the Street Sweeper kit. But you’re either going to end up using the special sweeper pieces in weird, unintended ways, or skip them altogether. In either case, the result is suboptimal, and you probably should’ve bought a different kit.

The gorilla/land-speeder thing

Sometimes, you need to do multiple things, like traverse both the rocky and flat terrain of an imaginary bedroom landscape. For that specific problem, there wasn’t a single kit that had everything I needed – but it wasn’t hard to cobble together from a combination of generic and purpose-built parts. My 10-year-old imagination decided that a mix of hinges, engines, and a little cockpit would do the trick.

This thing still sits on my desk as a reminder that there aren’t very many problems narrow enough for a street sweeper, and some of the more interesting problems don’t come with ready-made kits – you might have to dig through a few boxes to find just the right pieces.

Temporary elevated access

Temporary access is not a street sweeper problem. It’s easy enough to communicate the intent and goal (“temporary access for ____”), but the specifics are in constant flux. Teams grow and shrink and change, tools are added and subtracted and configured, and in the end, your ideal supporting toolchain probably looks a lot like molding your own bricks and building from scratch.

But building from scratch is very costly. In fact, everywhere I’ve worked prior to Sym has done exactly that: built a unique in-house tool that exactly solves the problem as defined at the moment the project spec was completed.

No points for guessing how that approach tends to age.

What about buying off-the-shelf software?

The safety, speed, and compliance you can get from folks having access to just the right things at just the right times is not a mix delivered gracefully by – as one of our customers once put it – “yet another f’ing UI appliance.” Not much more to say there.

What about no- or low-code tools?

Tools like Zapier and Workato attempt to solve major pieces of the build/buy puzzle by providing semi-configurable glue. Personally, I love Zapier. It’s great for automating whatever you can do by shunting around webhook payloads, and will shortcut a ton of toil. You can also use code blocks and run your own Lambdas, which can turn into powerful, if extremely fragile, solutions to extremely specific problems – not unlike trying to turn your sedan into a street sweeper. It might work for a bit, but it comes with (potentially complicated) debt.

So what do?

Genuinely, access is an engineering problem that wants an engineering solution, and for a best-fit solution for audited, temporary access, you’d be best off rolling your own… if it weren’t for two things:

  1. As discussed above, it’s costly – not just the raw time and maintenance, but the opportunity cost of what else you’re not solving on what is (in my experience) an already-overloaded platform roadmap.
  2. Even if you do roll your own, 80%+ of your development work will be spent building out primitive building blocks: third-party integrations, Slack UI, things of that nature.

That leaves only a fraction of your precious, fleeting engineering bandwidth to encode the rules, automations, and specifics for how your org wants to manage access – in other words, to actually build the bits for which you spent all that time gathering requirements.

Our goal as a vendor, then, is to help your platform team build something that’s just as specific as something you’d roll from scratch, but as fast and easy to change and mold as off-the-shelf software.

What does a great solution look like?

In the same way that my childhood land-speeder benefited from a couple of purpose-built parts, a great technical solution for temporary access will benefit from some purpose-built components. For example:

  • A basic state machine for escalating, timing, and de-escalating privilege
  • Integrations that can leverage the state machine for a variety of systems
  • Some kind of UI for requesting and approving
  • Some kind of output to satisfy audit and compliance requirements
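To make the first and last of those bricks a little more concrete, here is a minimal sketch in Python of what a privilege state machine with an audit trail might look like. Everything in it is hypothetical and made up for illustration (the `AccessRequest` class, the state names, the "no self-approval" rule, the `alice` and `bob` users); it is not Sym's SDK or anyone's production design. A real version would also call out to the target system whenever access is granted or removed, which is exactly where the integrations come in.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class State(Enum):
    REQUESTED = "requested"
    APPROVED = "approved"  # access is active (escalated)
    DENIED = "denied"
    EXPIRED = "expired"    # access removed when the time limit ran out
    REVOKED = "revoked"    # access removed early by a person


# Allowed transitions. Anything not listed here is rejected.
TRANSITIONS = {
    State.REQUESTED: {State.APPROVED, State.DENIED},
    State.APPROVED: {State.EXPIRED, State.REVOKED},
    State.DENIED: set(),
    State.EXPIRED: set(),
    State.REVOKED: set(),
}


@dataclass
class AccessRequest:
    requester: str
    target: str
    duration: timedelta
    state: State = State.REQUESTED
    expires_at: Optional[datetime] = None
    audit_log: list = field(default_factory=list)

    def _move(self, new_state: State, actor: str, now: datetime) -> None:
        """Apply one transition and record it for audit purposes."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.audit_log.append({
            "at": now.isoformat(),
            "actor": actor,
            "target": self.target,
            "from": self.state.value,
            "to": new_state.value,
        })
        self.state = new_state

    def approve(self, approver: str, now: datetime) -> None:
        # Example org-specific rule (an assumption): no self-approval.
        if approver == self.requester:
            raise PermissionError("requesters cannot approve their own requests")
        self._move(State.APPROVED, approver, now)
        self.expires_at = now + self.duration

    def deny(self, approver: str, now: datetime) -> None:
        self._move(State.DENIED, approver, now)

    def revoke(self, actor: str, now: datetime) -> None:
        self._move(State.REVOKED, actor, now)

    def tick(self, now: datetime) -> None:
        """De-escalate automatically once the approved window is over."""
        if self.state is State.APPROVED and now >= self.expires_at:
            self._move(State.EXPIRED, "system", now)


# Usage: one hour of access, approved by someone else, then timed out.
start = datetime(2023, 6, 2, 9, 0, tzinfo=timezone.utc)
request = AccessRequest("alice", "prod-db", timedelta(hours=1))
request.approve("bob", now=start)
request.tick(now=start + timedelta(minutes=30))  # still inside the window
request.tick(now=start + timedelta(hours=1))     # window is over
print(request.state.value)
print([entry["to"] for entry in request.audit_log])
```

Note how little of this is specific to any one org. The transition table and the timer are generic bricks; the self-approval check inside `approve` is the kind of last-mile rule a platform team actually wants to spend its time on.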

From there, the quality of a solution comes down to its configurability, maintainability, and overall operability. In other words, where it falls on our quadrant graph, and how well it sets you up for the future. Of course, it’s impossible to precisely predict tomorrow’s problems. The thing is, tomorrow’s problems – whatever they are – will happen.

If you’ve gone 100% build, then your ability to react to the future will come down to a huge number of unknowns:

  • How maintainable is your solution?
  • Is the (probably one-person) team who built it on PTO?
  • What else is on the platform roadmap, anyway?
  • Wait, Billy left the company? I thought he was just on PTO?

And so forth.

If you’ve bought something off the shelf (maybe it has a fancy UI!), you’re looking at the opposite problem: while the new features will come for “free,” you’re entirely at the mercy of someone else’s roadmap. Plus (and this is just the backend-happy PM in me coming out), the fancier the UI, the longer you’re going to have to wait for that feature.

Build vs. buy?

For temporary access – and for platform enablement in general – the answer has to be “both.” Find vendors that help you shoot the middle by giving you enough building blocks that you’re always in the best position to quickly build the last mile of the solution to tomorrow’s problem.

This is the puzzle we obsess over. Are we providing the right primitives to support temporary access workflows? Does our SDK have the right hooks to facilitate common (and not-so-common) business logic? What’s the just-right mix of static and dynamic components to balance safety, speed, effort, and efficiency?

To me, the best sign that we’re doing something right is that our customers are constantly surprising us by solving problems we didn’t design for – in other words, solving their own tomorrow-problems with the tools we’re building today.
