If you have ever been responsible for distributing internal permissions in a secure data system, you probably felt that, at least sometimes, it was an incredible chore.
Access management can be compared to washing dishes: nobody really wants to scrub pans (update outdated access grants), but as long as tasty meals (reliable products) keep coming out of the kitchen, nobody will notice the pile of dishes. Particularly at small companies that have been funded by investors to move fast and break things, the tendency is to simply throw dirty access dishes in the garbage and replace them with new ones (do a poor job of managing access and instead build out new products with strange and exciting new permissions for attackers to exploit).
This tendency is easy to sympathize with! Adequate security provisions don’t immediately show up as revenue or market growth, so they rarely attract the same attention as a new feature, particularly in the early stages of a company. A common oversight in a fast-growing company is over-granting permissions. As new features are pushed, new employees added, and new internal tools deployed, granting access liberally becomes much easier and faster than requiring access to be requested, reviewed, and approved.
Our team at Sym has seen this pattern play out many times, and it is part of the reason we set out to create a product that would remedy too-generous permissions without limiting downstream production. However, as we worked on security for clients big and small, we started to notice that there are a number of common pitfalls that go with access control.
We wanted to share our insight on overly permissive access, how it occurs, and how to avoid it. This post is a summary of our understanding of the problem, particularly as it relates to a cloud provider like AWS, and we hope it helps you grow as a company without sacrificing security.
One of the most important concepts in modern security is the transition from building secure systems to configuring secure systems. As highly integrated cloud systems become ever more dominant, the problem of good security is less about constructing your own gates and drawbridges and more about understanding a dashboard of complex controls and settings for a pre-fab castle.
Just as provisioning servers has been transformed from an exercise in DevOps tinkering to “infrastructure as code,” security has also become “code.” Whether you are managing an AWS IAM permission scheme via JSON files or an internal rule set that routes permission requests through a ticketing system, you are likely to encounter this transition today.
But the development of “security as code” creates a new practical challenge: understanding the underlying security options of something as sprawling and complex as AWS.
Anyone who has had the unhappy task of swimming through the AWS (or other major cloud provider) docs to diagnose a vague permissions error can attest to the difficulty of understanding security settings that are generalized across hundreds of services and use cases. The configurations and documentation that govern these services are undergoing constant renovation and ultimately amount to their own complex and dynamic language.
This complexity leads to a suite of specific problems. “Sometimes I am astonished by the borderline genius behind AWS’s design,” says Ben Thomas, our CISO at Sym. “But it is simply so complex that even experienced security administrators can end up cutting corners to get a permission functioning properly.”
It can help to understand the common pitfalls of AWS implementation in order to avoid them. What follows is a brief taxonomy of common permissions-related problems we see in the AWS ecosystem.
As a SaaS company, we understand the importance of offering easy integrations that fit with existing services and launch your product forward. However, integrations with third-party tools inevitably require permissions to function, and quick fixes to enable the proper function of such integrations can cause access creep.
We see time and time again that a frustrated engineer has kludged a makeshift permissions setting for some third-party service that didn’t quite fit into existing schemas. Maybe the read/write role that you had to make for the CDP you no longer use is mothballed but not disabled, or maybe in provisioning a tool for HR, employee information is made slightly too available. These can leave exploitable gaps for bad actors and generate security tailings — little pieces of security settings that no one remembers or understands.
When constantly bringing new engineers into the company or spinning up a new project, administrators can get a sort of mental repetitive stress injury — and to alleviate the frustration, they might cut corners. Whether new workers are internal transfers or newly hired, they will need access to codebases, datasets, and infrastructure configurations to work on the project. When repeatedly delivering these to project workers, managers and admins have a tendency to simply dump all possible necessary permissions on newbies to avoid having to repeatedly enable access.
Whether it is workers upgrading their roles, contractors coming and going, or even the aforementioned third-party tools being swapped out, residual permissions cause big problems. It is a natural instinct to attach permissions to an individual agent — employee, tool, or background process — rather than to a role. It’s an even more natural instinct to forget to remove that permission when the agent changes roles or leaves the team. This leads to exploits and general sorrow.
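One way to make the residual-permission problem concrete is a small audit over an inventory of grants. The sketch below is illustrative only: the `Grant` record, the `audit_grants` function, and the sample inventory are hypothetical names invented for this example, not part of AWS or of Sym's product. It flags grants attached directly to an agent instead of a role, and grants held by agents that are no longer active.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Grant:
    principal: str           # employee, tool, or background process (hypothetical IDs)
    permission: str          # free-form label for what the grant allows
    via_role: Optional[str]  # None means the grant is attached directly to the principal


def audit_grants(
    grants: Iterable[Grant], active: Set[str]
) -> Tuple[List[Grant], List[Grant]]:
    """Return (direct, orphaned) grants.

    direct:   grants attached to an individual agent rather than a role
    orphaned: grants whose principal is no longer in the active set
    """
    direct: List[Grant] = []
    orphaned: List[Grant] = []
    for grant in grants:
        if grant.via_role is None:
            direct.append(grant)
        if grant.principal not in active:
            orphaned.append(grant)
    return direct, orphaned


# Hypothetical inventory: a contractor who left and a retired third-party tool.
inventory = [
    Grant("alice", "db:read", via_role="analyst"),
    Grant("bob", "db:write", via_role=None),
    Grant("ci-runner", "deploy:prod", via_role=None),
    Grant("old-cdp", "events:read", via_role="integration"),
]

direct, orphaned = audit_grants(inventory, active={"alice", "ci-runner"})
print("direct:", [g.principal for g in direct])
print("orphaned:", [g.principal for g in orphaned])
```

In a real environment the inventory would come from your identity provider or cloud account rather than a hard-coded list; the point is only that both failure modes are cheap to detect once grants are recorded in one place.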
Most engineers already understand these mistakes and recognize when they are making them. But when doing security right means something like “staying up to date with AWS security features,” it just makes more sense to do a shoddy job of access control and get the job done.
That’s why we recommend that our clients rethink the way they approach access management.
Sometimes the difference between a permission that is required to do a job and one that is “overly” permissive is not so obvious. Often, the discussion around overly permissive access relies on the idea of the principle of least privilege (PoLP). PoLP holds that every account and agent in a system should have precisely the permissions that allow them to function and no others. It is a good heuristic, but we find that PoLP can be limiting in the modern context of access management.
The main problem with trying to implement an absolutely sterile system of the minimum possible privileges is that a functioning production system will inevitably have some dynamic requirements that create weak points.
“In order to eliminate risk only on the basis of access,” says Ben, “I could create a totally safe system: remove all permissions from all other accounts on our AWS, generate a random password for my own account, and then delete my password manager database.”
Such a system might not have anything that could be described as “overly” permissive, but Ben notes that his hypothetical “might not be the best balance between production and safety.” The simple fact is we can’t go around throwing away the key — requirements for a system change, and you’d need a level of omnipotence to ensure that the system can serve changing functions.
Therefore, we like to use a different concept: appropriate access. “Appropriate” is also a charged and powerful word – it is a term of art for security auditors, who use it to indicate that, even if an access level is slightly more than needed, it is still a responsible implementation for the context of the access event.
By simply acknowledging that there will always be somewhat flexible access points in a system, we can start to think about different types of solutions to access control.
It’s easy to wave our hand at the ways that privileges leak data and suggest these should be avoided. It is also natural to reason about access through physical analogies, like a private key, a firewall, or an access portal. But focusing on the barriers to access in a sort of physical sense can cause us to miss a large part of the risk profile in access management. The duration of access is nearly as important as the breadth of the access itself. It isn’t a big problem for a worker to have access to a database relevant to their job; the issue occurs when that access persists past the productive window.
Controlling privilege duration can sometimes be overlooked as an element of good security. There are certain commonly understood practices — deleting permissions and accounts when a user is no longer employed or requiring password changes after a certain time, for example. However, most implementations of time-based permissions controls don’t use the full potential of temporal access controls to achieve appropriate permissions.
Some of the common issues discussed earlier can be outright solved by adding strict time conditions to access. It is impossible to have permission creep for an employee promoted to a different team if their permissions were provisioned for twenty-four-hour periods by their previous manager.
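As a rough illustration of a strict time condition, IAM policies can restrict a statement to a time window using the `aws:CurrentTime` global condition key with the `DateGreaterThan` and `DateLessThan` operators. The sketch below only builds the policy document as a Python dictionary; the helper name `time_boxed_policy`, the action, and the bucket ARN are assumptions made up for this example, and this is not a description of how Sym provisions access.

```python
import json
from datetime import datetime, timedelta, timezone


def time_boxed_policy(actions, resources, start, hours=24):
    """Build an IAM policy document that allows access only inside a time window.

    `start` must be a timezone-aware datetime. The window defaults to 24 hours.
    """
    start_utc = start.astimezone(timezone.utc)
    end_utc = start_utc + timedelta(hours=hours)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # ISO 8601 timestamp in UTC
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": list(resources),
                "Condition": {
                    "DateGreaterThan": {"aws:CurrentTime": start_utc.strftime(fmt)},
                    "DateLessThan": {"aws:CurrentTime": end_utc.strftime(fmt)},
                },
            }
        ],
    }


# Hypothetical example: read access to one bucket for a single day.
start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
policy = time_boxed_policy(
    actions=["s3:GetObject"],
    resources=["arn:aws:s3:::example-bucket/*"],
    start=start,
)
print(json.dumps(policy, indent=2))
```

A policy like this stops granting access once the window closes, but the document itself stays attached until something removes it, so it still needs cleanup alongside the time condition.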
Other risks are significantly softened when the window for access is minimized. Individual accounts might be compromised by a bad actor, but if they need to apply for a temporary escalation in privileges in a suspicious way, a manager has an opportunity to catch the intrusion.
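The review step described above can be modeled in a few lines. This is a minimal in-memory sketch under stated assumptions: the `EscalationRequest` class, its fields, and the rule that requesters cannot approve themselves are all hypothetical choices for illustration, not an existing API or Sym's implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class EscalationRequest:
    requester: str
    role: str
    reason: str
    duration: timedelta
    approved_by: Optional[str] = None
    approved_at: Optional[datetime] = None

    def approve(self, approver: str, now: datetime) -> None:
        """Record an approval; a second person must review the request."""
        if approver == self.requester:
            raise ValueError("requesters cannot approve their own escalation")
        self.approved_by = approver
        self.approved_at = now

    def is_active(self, now: datetime) -> bool:
        """Access is live only after approval and before the window closes."""
        if self.approved_at is None:
            return False
        return self.approved_at <= now < self.approved_at + self.duration


# Hypothetical usage: a one-hour escalation approved by a manager.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
req = EscalationRequest(
    requester="alice",
    role="prod-db-admin",
    reason="debugging a failed migration",
    duration=timedelta(hours=1),
)
print(req.is_active(now))                          # not approved yet
req.approve("bob", now)
print(req.is_active(now + timedelta(minutes=30)))  # inside the window
print(req.is_active(now + timedelta(hours=2)))     # window has closed
```

Because every escalation passes through `approve`, a compromised account that asks for unusual access produces a request a human has to look at, and any access that is granted lapses on its own.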
In a sense, by disposing of your privileges immediately after using them, the problem of the dirty dishes is solved by guests immediately bussing their own utensils. Any messiness is resolved quickly and automatically, and a general environment of very few permissions can persist.
At Sym, we are intimately familiar with trying to manage chaotic tangles of AWS permissions. We built our product to prevent overly permissive access and easily deliver well-founded principles of security to clients of AWS without their needing to carefully manage a thousand JSON configs.
If you are a responsible security officer interested in our Slack-integrated solution for timed, logged, and managed AWS permissions, you probably want to see some proof of our bona fides: