Getting to the right balance of security and convenience is a game of tradeoffs. So it is with managing access to EC2 instances. Stricter security can lead to friction for engineers as they deploy and manage the applications they’re responsible for. More lenient security prioritizes speed, but correlates with a higher risk of exploits.
The way you approach access management is as important for your administrators as it is for your developers. Rolling out a complex access management solution increases the cognitive load on administrators. Complexity causes people to make mistakes and overlook vulnerabilities. An access management system must be easy for developers to quickly navigate, hard for hackers to exploit, and reasonable for administrators to administer.
The goal of this article is to explore EC2 access management strategies that offer different tradeoffs on the security and convenience curve.
If you’ve ever set up an EC2 instance, you were presented with the option to create an SSH key pair. It is the default way to gain access to a Linux EC2 instance and many people follow the happy path to meet their needs.
A key pair can be generated by AWS or on your local machine and then attached to an instance.
Using an EC2 key pair may be easy, but it immediately violates one of the AWS Foundational Security Best Practices. To support direct SSH access from the internet, your instance must have a public IP address. This conflicts with a high-severity control in AWS Security Hub that flags instances with public IPs.
A second security risk is created as soon as you save the private key. If a developer’s workstation is compromised, SSH private keys that grant access to EC2 could let an attacker move laterally from the developer’s workstation to the organization’s cloud infrastructure. This scenario will be familiar to anyone who reads the widely available security incident write-ups.
There are tools and techniques available to mitigate these risks. EC2 Security Groups allow you to limit access to your instance to a defined safelist of IP addresses. A better option would be to use a VPC. Put your EC2 instances in a private subnet and enable access via a bastion host in a public subnet. This way you only have a public IP address on one host to worry about.
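As a minimal sketch of the safelist idea above, the following Python builds an SSH-only ingress rule in the `IpPermissions` shape that the EC2 security group APIs accept. The helper name `build_ssh_ingress`, the example CIDR, and the description text are assumptions made for illustration; the sketch covers IPv4 only and does not call AWS.

```python
import ipaddress


def build_ssh_ingress(safelist):
    """Return an IpPermissions list allowing TCP/22 only from the given IPv4 CIDRs.

    `safelist` is a list of (cidr, description) tuples. Hypothetical helper.
    """
    ranges = []
    for cidr, description in safelist:
        network = ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
        if network.version != 4:
            raise ValueError(f"{cidr} is not IPv4; this sketch handles IPv4 only")
        if network.prefixlen == 0:
            raise ValueError(f"{cidr} would open SSH to the whole internet")
        ranges.append({"CidrIp": str(network), "Description": description})
    return [{"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "IpRanges": ranges}]


# Example CIDR from the documentation range; replace with your own egress addresses.
permissions = build_ssh_ingress([("203.0.113.0/24", "office VPN egress")])
```

The resulting list could be passed to whatever tooling manages your security groups. Rejecting `0.0.0.0/0` up front keeps an accidental wide-open rule from slipping into a safelist.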
This still leaves private keys as a vulnerability. Privileged Access Management (PAM) solutions from vendors like OneLogin or Duo Security can help by enforcing a username and password (and possibly multi-factor authentication via text message or an app like Authy) on top of the security provided by SSH keys. PAM solutions also allow you to centrally manage access levels for your users rather than maintaining permissions on each instance.
If you want to get out of the key and user management business for your EC2 instance access, take a look at offerings from Teleport, StrongDM, or Tailscale. These all offer ways to SSH into boxes with various controls and guardrails built-in. Tailscale’s SSH product was just announced in July and provides a low-friction way to give users SSH access. Tailscale’s SSH is built on their mesh VPN technology which lets you connect from one device to any other, including your production servers, directly. Plus, Tailscale integrates with Sym!
If you’re not ready to bring in a 3rd party vendor, AWS has its own solution for SSH access. You are probably managing access to other AWS components using AWS IAM… so why not manage access to EC2 instances via IAM? Fortunately AWS Systems Manager Session Manager (SSM) allows you to do that.
With SSM, you can create IAM policies that allow users to SSH to a scoped set of EC2 instances in your AWS accounts. You can constrain access by EC2 instance tag or to specific instance IDs.
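To make the tag-based scoping concrete, here is a hedged sketch that builds an IAM policy document allowing `ssm:StartSession` only on instances carrying a given tag. The function name and the `Team`/`payments` tag are hypothetical, and a real policy may need further statements (for example, for the session document your clients use), so treat this as a starting shape rather than a complete policy.

```python
import json


def session_policy_for_tag(tag_key, tag_value):
    """Build a policy document scoping Session Manager access by instance tag.

    Hypothetical helper; the tag key and value are supplied by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow starting sessions only on instances with the matching tag.
                "Effect": "Allow",
                "Action": "ssm:StartSession",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringEquals": {f"ssm:resourceTag/{tag_key}": tag_value}
                },
            },
            {
                # Let users resume or end only the sessions they started themselves.
                "Effect": "Allow",
                "Action": ["ssm:TerminateSession", "ssm:ResumeSession"],
                "Resource": "arn:aws:ssm:*:*:session/${aws:username}-*",
            },
        ],
    }


policy = session_policy_for_tag("Team", "payments")
print(json.dumps(policy, indent=2))
```

To constrain by instance ID instead, the first statement's `Resource` would list specific instance ARNs and the `Condition` block would be dropped.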
SSM removes the need to maintain a public bastion host in your infrastructure. SSM relies on an agent that runs on your EC2 instances to connect to the SSM API. Your instances only need to have outbound access to these APIs (via a NAT or a VPC endpoint) in order to be reachable by SSH. The instances can be in a private VPC with no inbound reachability from the public internet!
Another feature afforded by SSM is its integration with AWS CloudTrail. CloudTrail records session activity in one central, searchable place. SSM + CloudTrail is an excellent way to review historical access to your infrastructure.
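As an illustration of the kind of review this enables, the sketch below filters a list of CloudTrail-style records for Session Manager session starts. The sample records are hand-written for the example, and the assumption that the instance ID appears under `requestParameters.target` should be checked against your own trail before relying on it.

```python
def session_starts(records):
    """Return (time, caller ARN, target) for each Session Manager StartSession event."""
    results = []
    for record in records:
        if record.get("eventSource") != "ssm.amazonaws.com":
            continue
        if record.get("eventName") != "StartSession":
            continue
        caller = record.get("userIdentity", {}).get("arn", "unknown")
        # Assumption: the instance ID is logged as requestParameters.target.
        target = (record.get("requestParameters") or {}).get("target", "unknown")
        results.append((record.get("eventTime"), caller, target))
    return results


# Illustrative records only; the account ID, user, and instance ID are made up.
sample_records = [
    {
        "eventTime": "2022-08-01T12:00:00Z",
        "eventSource": "ssm.amazonaws.com",
        "eventName": "StartSession",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
        "requestParameters": {"target": "i-0123456789abcdef0"},
    },
    {
        "eventTime": "2022-08-01T12:05:00Z",
        "eventSource": "ec2.amazonaws.com",
        "eventName": "DescribeInstances",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"},
        "requestParameters": None,
    },
]

starts = session_starts(sample_records)
```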
So far we’ve looked at ways to enable SSH access… but what if you could engineer your way out of needing to provide this access at all? Today, there are various techniques to minimize SSH access. This category is sometimes referred to as GitOps: automating routine tasks on servers, triggered automatically or via webhooks. Under this paradigm, only GitOps-focused engineers have access to alter the GitOps processes.
You can use SSM’s Run Command feature to carry out custom tasks without requiring an engineer to have full shell access. If an attacker compromises an engineer’s credentials, the damage they can do is limited to what a restrictive IAM policy and a set of tightly defined SSM commands allow.
Even with a strong automation practice in place, it is hard to avoid needing SSH access at all. Stuff happens: new services get stood up, new technical hurdles emerge, you inherit management of a legacy product, and all of a sudden you need to get on an EC2 box again.
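One way to picture "tightly defined" commands is an allowlist that maps task names to fixed shell commands, so callers choose a task rather than typing arbitrary input. The sketch below builds request parameters in the shape of the Systems Manager `SendCommand` API using the `AWS-RunShellScript` document; the task names, commands, and the `Role`/`web` tag are hypothetical. A stricter setup would likely put the commands in a custom SSM document and scope IAM to that document.

```python
# Hypothetical allowlist: engineers pick a task name, never a raw command.
ALLOWED_TASKS = {
    "restart-web": ["sudo systemctl restart nginx"],
    "disk-usage": ["df -h"],
}


def build_run_command_request(task, tag_key, tag_value):
    """Build SendCommand parameters for an allowlisted task on tagged instances."""
    if task not in ALLOWED_TASKS:
        raise KeyError(f"task {task!r} is not in the allowlist")
    return {
        "DocumentName": "AWS-RunShellScript",
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {"commands": list(ALLOWED_TASKS[task])},
        "Comment": f"automated task: {task}",
    }


request = build_run_command_request("disk-usage", "Role", "web")
```

Keeping the allowlist in version control means a change to what can run on production goes through the same review as any other code change.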
A safe approach to managing ongoing SSH access requirements is to roll out a just-in-time (JIT) access policy. Making it easy for developers to get the access they need allows them to do their job; but if you make it too easy, you risk giving an attacker the ability to move laterally in your organization if a developer’s workstation is compromised.
With a JIT approach, you remove default access for risky actions and enable them on an as-needed basis. To implement JIT with SSM, you can organize your SSM-related IAM policies and Roles based on risk levels. Then allow designated IAM Groups to assume the riskier Roles, and add or remove users from those Groups as needed.
Setting up the Roles, Groups and processes to support JIT reveals a major problem with modern access management systems, including AWS IAM: they were designed for a world of static access policies. Moving people in and out of Groups is toilsome ClickOps, not to mention error-prone. It is no wonder many organizations hold out with overly permissive policies for as long as they can.
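The bookkeeping behind a grant-and-revoke cycle can be sketched as time-bound grants that a periodic job sweeps for expiry. Everything here (the `Grant` type, the function names, the group name) is hypothetical and only models the data; actually adding or removing a user from an IAM Group would be a separate call to AWS that this sketch leaves out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    """A time-bound membership of `user` in the IAM group `group`."""
    user: str
    group: str
    expires_at: datetime


def grant_access(user, group, minutes, now):
    """Record a grant that expires `minutes` after `now`."""
    return Grant(user, group, now + timedelta(minutes=minutes))


def split_expired(grants, now):
    """Return (active, expired) grants; expired ones should be revoked in IAM."""
    active = [g for g in grants if g.expires_at > now]
    expired = [g for g in grants if g.expires_at <= now]
    return active, expired


start = datetime(2022, 8, 1, 12, 0, tzinfo=timezone.utc)
grants = [
    grant_access("alice", "ssm-prod-shell", 30, start),
    grant_access("bob", "ssm-prod-shell", 120, start),
]
# One hour later, alice's 30-minute grant has lapsed and bob's is still active.
active, expired = split_expired(grants, start + timedelta(hours=1))
```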
To get around the toil of manual JIT processes, companies often decide to roll their own solution. They build a UI or Slack bot for making requests alongside support for auditing everything. Before they know it, they end up building a complex system with more maintenance overhead than they bargained for.
If you want to avoid the maintenance burden of a home-grown JIT solution, you should consider a platform like Sym. Sym provides a managed service for orchestrating just-in-time access along with approvals, auditing, and lots of integrations!
Sometimes your team may purchase a tool that has programmatic access to your AWS account. Unfortunately, that means that if your vendor suffers an exploit, they may pose a risk to your own security.
To minimize this threat, use AWS’s IAM Access Analyzer to scrutinize third-party access rights. Ideally, a third-party platform should have access only to the actions it needs to carry out its service.
EC2 instance security exists along a sliding scale. SSH is available by default, but we recommend implementing an IAM-based solution from the start. You can progressively add guardrails as your organization scales. AWS IAM allows you to centralize roles and policies across all AWS services. Pretty early in your maturity journey, change management will come up and can create toilsome processes for security teams and developers. Therefore, we also recommend investing in automation (via a managed product like Sym) early in your journey. This will allow you to progressively add checks and balances as your team and risks grow.
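As a rough complement to that review (not a substitute for IAM Access Analyzer), a small local check can flag statements in a vendor's policy that allow wildcard actions or all resources. The function name and the sample policy below are made up for illustration, and the check deliberately ignores conditions and `NotAction`.

```python
def overly_broad_statements(policy):
    """Return (index, wildcard_actions, all_resources) for broad Allow statements."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    findings = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        resources = statement.get("Resource", [])
        if isinstance(resources, str):
            resources = [resources]
        wildcard_actions = [a for a in actions if a == "*" or a.endswith(":*")]
        all_resources = "*" in resources
        if wildcard_actions or all_resources:
            findings.append((index, wildcard_actions, all_resources))
    return findings


# Made-up vendor policy: the first statement is narrow, the second is too broad.
vendor_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::reports-bucket/*",
        },
        {"Effect": "Allow", "Action": ["ec2:*", "iam:PassRole"], "Resource": "*"},
    ],
}

findings = overly_broad_statements(vendor_policy)
```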