The Authorization Game

By
Jon Bass
February 8, 2023

I’m going to argue that access management is one of the most pressing issues facing developers today, and that the hodgepodge of UIs and Jira tickets that we’re using for it opens us up to serious vulnerabilities. At the risk of some major hubris, I will pattern my argument after The Imitation Game. This is the game from the seminal “Turing Test” paper where Alan Turing proposes a way to test for artificial intelligence.

So many words… I just want to try this thing out

Fine, fine. Our team has created an amazing guided onboarding. You’ll need a Slack workspace where you can install our app, Terraform, and a dream. Let us know how it goes!

1. The Authorization Game

Alan Turing, father of DevSecOps

I propose to answer the question, “Can I get access to this thing?” This should begin with definitions of “I” and “this thing”. “I” am a human or machine that is trying to get work done at a cloud-friendly organization. A cloud-friendly organization is most likely where you, reader, are working today. Your business mostly runs on AWS, GCP, or Azure, and you depend on tons of SaaS services and internal tools to make things happen. “This thing” that you want to access is one of the resources in this web of cloud, SaaS, and internal endpoints.

Let’s replace the question “Can I get access to this thing?” with another. The new form of the problem can be described in terms of a game. It is played by three entities: a requester (A), a resource (B), and an owner (C). In the simplest form of the game, A and C are humans sitting in the same room together, and B is some digital resource that C both has domain expertise about and holds admin privileges to.

In order for C to “win”, they need to decide whether A should get access to B. In the simple form of the game, C can simply ask A why they need B. Since C has all the domain knowledge required, they can decide if A’s reason is sufficient. Since C also has admin privileges to B, they can grant A access. Since A told C why they need access, C knows when they can remove A’s access as well.
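To make the simple form of the game concrete, here is a minimal sketch in Python. Everything in it (the `Resource` and `Owner` classes, the `accepted_reasons` set, the example names) is made up for illustration: the owner's domain knowledge is reduced to a set of reasons they would accept, and their admin privileges to the ability to edit the resource's access list.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """B: a digital resource and the set of requesters who can currently use it."""
    name: str
    allowed: set = field(default_factory=set)


@dataclass
class Owner:
    """C: holds the domain knowledge (accepted_reasons) and admin rights on B."""
    name: str
    accepted_reasons: set

    def decide(self, requester: str, resource: Resource, reason: str) -> bool:
        # C asks A why they need B and grants access only for a sufficient reason.
        approved = reason in self.accepted_reasons
        if approved:
            resource.allowed.add(requester)
        return approved

    def revoke(self, requester: str, resource: Resource) -> None:
        # C knows why access was granted, so C also knows when to remove it.
        resource.allowed.discard(requester)


# Hypothetical players: alice (A), billing-db (B), carol (C).
billing_db = Resource("billing-db")
carol = Owner("carol", accepted_reasons={"debug failed invoice"})

print(carol.decide("alice", billing_db, "debug failed invoice"))  # True
print(carol.decide("bob", billing_db, "just curious"))            # False
```

The sketch only works because one object holds both the knowledge and the privileges, which is exactly the assumption that stops holding once A and C are no longer in the same room.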

We now ask the question, “What will happen when a machine takes the part of C in this game?” Will this machine still be able to make the right decision about whether to give A access to B? Will this machine know how to actually grant A the access that is required? These questions replace our original, “Can I get access to this thing?”

2. Critique of the Question

Why not just hire a captive workforce and allow no new access grants at all?

Is “Can I have access to this thing?” really that important a question? When Turing asked “Can machines think?”, that was and remains obviously worth thinking about. OK, granted. Access management friction is perhaps not an existential problem… but it is a fundamental issue for teams that are trying to get work done within the constraints of modern business.

We’re losing the authorization game, and it is putting our businesses at risk from both a security and efficiency perspective. People and scripts with overprovisioned access are breaking things. A script at Atlassian accidentally cascade-deleted tons of customer data. A compromised engineering account at Uber gave an attacker tons of sensitive access. If we try to manage these risks, we typically end up grinding work to a halt. Teams are stuck waiting for access, or are forced to create workarounds that in turn lead to new security and efficiency issues.

We cannot play the simple version of the authorization game and also scale our businesses. We need a machine to replace an omnipotent and omniscient resource owner (C). A properly designed machine can account for all the inputs and context we’ll need to actually make authorization decisions at scale. Our new question improves on “Can I have access to this thing?” because we’re not all in the same room, basically ever. No single person has all the context and all the access.

How did we end up in this mess?

This is less complicated than what you're doing for access management today

(1) We’re using a lot of services

The surface area of AWS alone in your organization easily consists of hundreds of combinations of discrete services and accounts. Layer in GitHub or GitLab, Okta or OneLogin, Cloudflare, Datadog or Sumo Logic, Statuspage, Sentry, Segment, Airtable, Jira, Slack, CircleCI, Notion, ngrok, PagerDuty, Terraform Cloud or Spacelift, Honeycomb… You get the idea. There are a lot of services to consider here.

https://twitter.com/tmrohan/status/1620211326000955392

(2) The ways to grant access to services vary widely

The complexity of authorizing access is both technical and organizational. From a technical perspective, there is no standard way to implement authorization controls across all the services that modern teams need to manage. Some services have complex RBAC systems, some have a few roles, some have easy-to-use APIs, some use SAML or OIDC, and some require creative workarounds to achieve your desired authorization outcomes.

(3) No one has all the domain knowledge about the services we’re using

Finding all your domain experts and getting a slice of their time ain’t easy. In a single “service” like AWS, the expertise to decide how to set up access to DynamoDB vs S3 vs SageMaker vs EKS might be wildly different. AWS is just one of many services you’re likely depending on, each with its own quirks and internal usage patterns that need to be considered when making an access management decision. There’s no central oracle with the knowledge of how authorization should be set up in each service.

(4) Authorization decisions for services should be context-based

The kicker is that even if you are lucky enough to identify all the domain experts you need to make good access management decisions, you’re still kind of out of luck. When you ask one of your domain experts “Can I get access to this thing?” they’re not going to say “Sure!”. What they’ll say is: “Well… it depends”.

That’s because access management decisions in a modern business are dynamic, context-dependent things. Your access rights depend not just upon who you are, but on the full complement of what, when, where, why, and how you’re doing something. If you’re investigating a customer issue, then maybe elevated data access makes sense… for a time. If you’re standing up a new application with new AWS service dependencies, maybe you need more sensitive AWS permissions… until the application stabilizes.
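One way to picture “it depends” is as a grant that carries its own why and when. The sketch below is an illustration only: the `Grant` class, the `MAX_DURATION` table, and the durations in it are assumptions I made up, not a recommendation for what your limits should be.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical policy: the reason for access determines how long it may last.
MAX_DURATION = {
    "customer_investigation": timedelta(hours=4),
    "new_service_rollout": timedelta(days=14),
}


@dataclass(frozen=True)
class Grant:
    who: str
    what: str
    why: str
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        # Access is valid "for a time", then lapses on its own.
        return now < self.expires_at


def grant_for(who: str, what: str, why: str, now: datetime) -> Optional[Grant]:
    limit = MAX_DURATION.get(why)
    if limit is None:
        # No codified rule for this reason: a human still has to decide.
        return None
    return Grant(who, what, why, expires_at=now + limit)


now = datetime(2023, 2, 8, 12, 0, tzinfo=timezone.utc)
grant = grant_for("alice", "customer-data", "customer_investigation", now)
print(grant.is_active(now + timedelta(hours=1)))  # True
print(grant.is_active(now + timedelta(hours=5)))  # False
```

The identity of the requester never changes in this example. Only the reason and the clock do, and that is enough to flip the answer.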

3. The Machines Concerned in the Game

The question which we put in section 1 will not be quite definite until we have specified what we mean by the word “machine”. In short, it’s a system that (for now) combines human and machine inputs to automate access management decisions. You ask the machine for access, and it decides if, when, and how you get it.

In order for Access Management Systems to win the game, they’ll need to address all the problems we highlighted in section 2. They will need to:

  1. Identify the right system(s) to solve a given business problem. Since implementations are spread across so many systems, it might not be obvious that in order to triage a customer success issue, a requester needs access to a few S3 buckets, some data in Snowflake, and an internal admin dashboard.
  2. Find the right domain experts to decide if the access request is appropriate, and request their approval. The people that understand your S3 resources vs your Snowflake layout vs your internal admin tools could easily be pretty different.
  3. Pull in the right context so that domain experts have the information they need to approve the request. We should assume that the domain experts do not know the requester or why they need access. Context might come from even more systems, like the identity provider that says who the requester is, a ticketing system, or an incident response platform like PagerDuty.
  4. Actually be able to grant and revoke access. This might seem like weird coupling… wouldn’t it be cleaner if our system just focused on decision making and not on actually trying to integrate to make all the access changes? The problem with too loose a coupling is that in order to win the game, we need to grant the requester access quickly. Our system needs to have the hooks to either grant and revoke access itself, or enable dependent systems to quickly make these changes themselves.
  5. Report on the status of access decisions. To maintain and improve our system, we need to know what decisions were made and why they were made. That data should also be exportable to other systems in real time.

And if that’s not enough, we need all of the above to be at least partially (ideally completely) automated. We won’t be able to build either a secure or efficient access management system if a human has to do each of these steps every time. We’ll overload people with requests and context switching, resulting in slower work and poor decisions.
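The five requirements above can be strung together in a few dozen lines. This is a toy sketch, not how any real product implements them: the catalogs (`SYSTEMS_FOR_TASK`, `OWNER_OF`), the `AccessMachine` class, and the `approve` callback are all hypothetical stand-ins, and the “grant” is just a set in memory rather than a call to a target system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical catalogs; a real system would keep these in code or config.
SYSTEMS_FOR_TASK: Dict[str, List[str]] = {
    "triage_customer_issue": ["s3:support-exports", "snowflake:customers", "admin-dashboard"],
}
OWNER_OF: Dict[str, str] = {
    "s3:support-exports": "infra-team",
    "snowflake:customers": "data-team",
    "admin-dashboard": "support-eng",
}

# (owner, system, evidence) -> approved? Stands in for a human or automated reviewer.
Approver = Callable[[str, str, dict], bool]


@dataclass
class AccessMachine:
    grants: Set[Tuple[str, str]] = field(default_factory=set)
    audit_log: List[dict] = field(default_factory=list)

    def request(self, requester: str, task: str, context: dict, approve: Approver) -> List[str]:
        granted = []
        for system in SYSTEMS_FOR_TASK.get(task, []):      # 1. identify the systems
            owner = OWNER_OF[system]                        # 2. find the domain expert
            evidence = {**context, "requester": requester, "task": task}  # 3. pull in context
            approved = approve(owner, system, evidence)
            if approved:
                self.grants.add((requester, system))        # 4. grant...
                granted.append(system)
            self.audit_log.append(                          # 5. report what was decided
                {**evidence, "system": system, "owner": owner, "approved": approved}
            )
        return granted

    def revoke(self, requester: str, system: str) -> None:
        self.grants.discard((requester, system))            # 4. ...and revoke
        self.audit_log.append({"requester": requester, "system": system, "revoked": True})


def requires_ticket(owner: str, system: str, evidence: dict) -> bool:
    # A stand-in approval rule: approve only when the request cites a ticket.
    return bool(evidence.get("ticket"))


machine = AccessMachine()
print(machine.request("alice", "triage_customer_issue", {"ticket": "SUP-123"}, requires_ticket))
```

Swapping `requires_ticket` for a function that pings a person is the human-in-the-loop version; swapping it for a rule is the automated one. The rest of the pipeline does not need to change.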

4. Digital Computers

Cool hackers are available to help you remove access management friction

The best way for an access management system to win the authorization game is if teams can manage the system in code. When you use code to define your access workflows, you represent not only the happy path for access, but also the edge cases and nuances that are specific to your organization. Using code to wire in our integrations means we can meet target systems where they are. You define infrastructure in code because it is cheaper to build, easier to maintain, and more resilient to change. If the tools are there to provision your access management system in code, shouldn’t you do that, too?

5. Sym: A Dynamic, Adaptive Authorization Machine

Provision in Terraform, route in Python, approve with context in Slack

Sym is an access management system that can help you win the authorization game. Sym gives teams a platform that is opinionated enough for fast development, while also extensible enough to integrate into the specific needs of your environment. The key building blocks you get with Sym are:

  • An infrastructure-as-code foundation that lets you use Terraform to provision and change-manage everything. We’ve got a Terraform provider that lets you declare flows, along with a growing family of connector modules that help you manage all our core integration dependencies right along with the rest of your infrastructure.
  • A workflow engine that is specifically designed for approval and access. All Sym Flows use the same pattern of workflow steps. By building on one workflow pattern, we can make abstractions in the UI, routing, and reporting layers that make configuration and maintenance efficient.
  • A user interface in Slack to make and approve access requests. An API and the concept of “channels” let us extend this interface to more places. (We’re working on it!)
  • An SDK that makes it easy to do the easy things, and possible to do the hard ones. We’ve got examples to get you going quickly for common patterns with AWS, Okta, and more. We also let you open the whole Swiss Army Knife when you need to, with Lambda and Custom Integrations.
  • Flexible reporting that integrates with wherever you store your logs. We send rich, structured event logs to Segment or to AWS Kinesis.

Sym’s platform starts with human-in-the-loop workflows because we want you to have an access management system that doesn’t need workarounds. By starting at the human review layer, we ensure domain experts make access decisions when necessary, and give them the right context to do so.

Customers integrate with PagerDuty to adapt their workflows to on-call status

As your usage of Sym grows, we help you codify and automate more and more of your access decisions. We know that it is not practical for humans to review every access request, and that is why we’ve built adaptivity into our flows. Our PagerDuty integration, for example, lets you adapt access requirements to the requester’s on-call status. You can use our SDK hooks to wire in more adaptive controls, and we’ll be building more in this area over the year. (Talk to Max).
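To show the shape of an adaptive control like this, here is a plain-Python sketch. It is not Sym’s SDK: `Decision`, `route_request`, and the durations are hypothetical, and the on-call lookup is passed in as a function so that a real PagerDuty query (or anything else) could stand behind it.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Callable


@dataclass(frozen=True)
class Decision:
    action: str          # "auto_approve" or "needs_review"
    duration: timedelta  # how long the access lasts if it is granted


def route_request(requester: str, is_on_call: Callable[[str], bool]) -> Decision:
    """Adapt the approval requirement to the requester's on-call status."""
    if is_on_call(requester):
        # On-call responders skip human review but get a short-lived grant.
        return Decision("auto_approve", timedelta(hours=1))
    # Everyone else goes to a human approver.
    return Decision("needs_review", timedelta(hours=8))


# Stand-in for a real on-call lookup, such as a query against an on-call schedule.
on_call_now = {"alice"}


def is_on_call(user: str) -> bool:
    return user in on_call_now


print(route_request("alice", is_on_call).action)  # auto_approve
print(route_request("bob", is_on_call).action)    # needs_review
```

Keeping the lookup behind a function argument means the routing rule can be tested without touching the paging system at all.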

6. Contrary Views on the Main Question

Benedict Cumberbatch plays Dr Strange as well as Alan Turing, and he looks kind of contrary here

Turing’s paper lays out many objections to the question of artificial intelligence, including arguments from theology, math, and extrasensory perception. I haven’t encountered theological objections to our efforts to improve access management systems. I do think it is worth calling out, however, how Sym’s take on the authorization game fits into the wider landscape.

Sym helps you win the authorization game because we help you integrate context and actions from across your tech stack. Identity management systems like Okta or Azure AD do a great job of asserting who you are, but get stuck when trying to define what you can do. Rego and Cedar let you express authorization logic in elegant languages, but don’t help you collect all the inputs and outputs you need to make your access management system function. No-code tools help you get started quickly, but fall flat when it comes to change management and covering edge cases.

7. Learning Machines

While Sym still has plenty of room to grow, we’re making it generally available because of what we’ve seen our customers already achieve with it. Orum uses Sym to manage temporary access to Databricks and AWS, as well as for conditional deployments to CircleCI. Courier and many other customers use Sym to manage access to internal admin tools and to AWS. Other customers use Sym to safeguard access to sensitive HIPAA-compliant environments in Aptible.

Try Sym and get started building your own access management machine! Teams are struggling to work safely and efficiently because they need access management systems that are dynamic and adaptive. We need tooling that identifies domain experts, gives them the right context to make access decisions, and codifies the process so your business can scale.

Sym gives you the building blocks to win the authorization game, and we're excited to see how you use them. As Turing concluded, “we can only see a short distance ahead, but we can see plenty there that needs to be done”.

It turns out that the framing of Turing’s paper, along with the section headings (and even some of the topic sentences) worked well for the Sym case. At least it was fun for me to write it this way, and if it makes you go back and read his paper, even better. Thanks to Myles Steinhauser for having breakfast with me last week and inspiring this ridiculous idea.
