Zero Trust in AWS

Why Zero Trust Matters in AWS

Amazon Web Services operates on a shared responsibility model where AWS secures the infrastructure, but customers own the security of everything they deploy on top of it. This distinction is critical because traditional perimeter-based security falls apart in a cloud environment where resources are ephemeral, APIs are the primary access vector, and network boundaries are fluid. Zero Trust in AWS means assuming that no principal, whether a human user, an EC2 instance, or a Lambda function, is inherently trusted, regardless of where the request originates.

AWS provides a rich set of primitives that align naturally with Zero Trust principles, but assembling them into a coherent architecture requires deliberate design. Although IAM denies by default, many AWS accounts drift into a far too permissive posture in practice: IAM policies with wildcard actions, security groups allowing broad ingress, and S3 buckets with overly generous access. Moving to Zero Trust requires systematically tightening every layer.

IAM as the Foundation of Zero Trust

AWS Identity and Access Management is the backbone of any Zero Trust implementation on the platform. Every API call in AWS is authenticated and authorized through IAM, making it the single most important control plane to harden. The principle of least privilege must be enforced ruthlessly, starting with the elimination of long-lived access keys in favor of IAM Roles and temporary credentials via AWS Security Token Service (STS).

IAM policies should be scoped using condition keys that go far beyond simple action and resource constraints. For example, the aws:SourceVpc condition restricts API calls to those arriving through a VPC endpoint in a specific VPC (the key is absent from requests that do not pass through an endpoint), aws:PrincipalOrgID ensures only principals within your AWS Organization can access resources, and aws:MultiFactorAuthPresent enforces MFA for sensitive operations. Combining these conditions creates policies that enforce contextual access decisions at the API level.
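As a minimal sketch of how these condition keys combine, the following Python builds an identity-based policy document as a plain dictionary. The bucket name, VPC ID, and organization ID are hypothetical placeholders, and the choice of s3:GetObject is only an illustration. Condition operators and keys listed in one statement are evaluated with a logical AND, so the request must satisfy all three.

```python
import json


def build_contextual_policy(bucket: str, vpc_id: str, org_id: str) -> dict:
    """Allow S3 object reads only when every context condition holds."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ContextualRead",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    # Both keys in this block must match (logical AND).
                    "StringEquals": {
                        "aws:SourceVpc": vpc_id,  # only set via a VPC endpoint
                        "aws:PrincipalOrgID": org_id,
                    },
                    # The session must have been established with MFA.
                    "Bool": {"aws:MultiFactorAuthPresent": "true"},
                },
            }
        ],
    }


# Hypothetical identifiers, used for illustration only.
policy = build_contextual_policy(
    "example-bucket", "vpc-0123456789abcdef0", "o-exampleorgid"
)
print(json.dumps(policy, indent=2))
```

In a real policy you would pick the keys that fit the caller: MFA is relevant to human sessions, while aws:SourceVpc suits workloads that reach the service through an endpoint.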

Service Control Policies (SCPs) at the AWS Organizations level provide guardrails that cannot be overridden by administrators of member accounts. They do not apply to the management account, which is one reason to keep workloads out of it. A Zero Trust SCP strategy typically includes restricting root user actions in member accounts, restricting regions where resources can be deployed, preventing the disabling of CloudTrail or GuardDuty, and requiring encryption on all storage services. These policies act as a non-negotiable security baseline across all accounts.

  • Eliminate long-lived IAM access keys; use IAM Roles with STS AssumeRole for all programmatic access
  • Implement permission boundaries to cap the maximum privileges any role can possess
  • Use aws:RequestedRegion condition keys to prevent resource creation in unauthorized regions
  • Enable IAM Access Analyzer to continuously identify resources shared with external entities
  • Deploy SCPs to deny iam:CreateUser and iam:CreateAccessKey in workload accounts
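The guardrails listed above could be expressed in an SCP along the following lines. This is a sketch under stated assumptions rather than a complete baseline: the list of global services exempted from the region restriction is illustrative and would need to be adapted, and the allowed regions are hypothetical.

```python
import json


def build_baseline_scp(allowed_regions: list[str]) -> dict:
    """Build an SCP document containing three deny-only guardrails."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                # Assumption: these global services are exempted so that the
                # region restriction does not break them; extend as needed.
                "NotAction": ["iam:*", "organizations:*", "sts:*", "support:*"],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": allowed_regions}
                },
            },
            {
                "Sid": "ProtectAuditTooling",
                "Effect": "Deny",
                "Action": [
                    "cloudtrail:StopLogging",
                    "cloudtrail:DeleteTrail",
                    "guardduty:DeleteDetector",
                ],
                "Resource": "*",
            },
            {
                "Sid": "DenyLongLivedCredentials",
                "Effect": "Deny",
                "Action": ["iam:CreateUser", "iam:CreateAccessKey"],
                "Resource": "*",
            },
        ],
    }


# Hypothetical region list for illustration.
scp = build_baseline_scp(["eu-west-1", "eu-central-1"])
print(json.dumps(scp, indent=2))
```

SCPs only filter permissions; they never grant them, which is why every statement in the sketch is a Deny.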

Network Micro-Segmentation with VPC Design

In a Zero Trust AWS architecture, the VPC is not a trust boundary but rather a segmentation tool. Each workload tier should reside in its own subnet with security groups and Network ACLs enforcing strict ingress and egress rules. Security groups should reference other security groups rather than CIDR blocks wherever possible, creating identity-aware network policies that follow the workload rather than depending on IP addresses.
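To illustrate a security group rule that references another security group instead of a CIDR block, the sketch below builds the request parameters in the shape accepted by the EC2 AuthorizeSecurityGroupIngress API. The group IDs and port are hypothetical, and the helper name is an assumption of this example. The resulting dictionary could be passed to boto3 as ec2.authorize_security_group_ingress(**params).

```python
def sg_to_sg_ingress(target_sg: str, source_sg: str, port: int) -> dict:
    """Build ingress parameters that admit traffic from one security group."""
    return {
        "GroupId": target_sg,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                # Reference the source group rather than an IP range, so the
                # rule follows the workload when its addresses change.
                "UserIdGroupPairs": [
                    {"GroupId": source_sg, "Description": "app tier to db tier"}
                ],
            }
        ],
    }


# Hypothetical group IDs: the database tier accepts PostgreSQL traffic
# only from members of the application tier's security group.
params = sg_to_sg_ingress("sg-0aaa1111bbbb2222c", "sg-0ddd3333eeee4444f", 5432)
print(params)
```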

VPC endpoints remove the need for traffic to traverse the public internet when accessing AWS services, and AWS PrivateLink does the same for exposing your own services to other VPCs. Instead of routing through a NAT Gateway to reach S3, DynamoDB, or SQS, VPC endpoints bring the service into your VPC. Gateway endpoints, available for S3 and DynamoDB, are route-table entries that incur no additional charge and should be deployed in every VPC. Interface endpoints, which are built on PrivateLink, use Elastic Network Interfaces with private IP addresses and can be protected with security groups. Both types support endpoint policies that restrict which principals and actions are allowed.
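An endpoint policy that narrows what can pass through an S3 endpoint might look like the following sketch. The bucket name and organization ID are hypothetical, and the two permitted actions are only an example of scoping.

```python
import json


def build_endpoint_policy(bucket: str, org_id: str) -> dict:
    """Allow only organization principals to reach one approved bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "OrgPrincipalsToApprovedBucket",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Principals outside the organization get no match here and
                # are therefore denied at the endpoint.
                "Condition": {"StringEquals": {"aws:PrincipalOrgID": org_id}},
            }
        ],
    }


# Hypothetical identifiers for illustration.
endpoint_policy = build_endpoint_policy("example-data-bucket", "o-exampleorgid")
print(json.dumps(endpoint_policy, indent=2))
```

An endpoint policy does not grant access on its own; the caller's IAM policy and the bucket policy must still allow the request.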

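For east-west traffic between microservices, a service mesh such as AWS App Mesh with Envoy proxies provides mutual TLS authentication and fine-grained traffic routing. Every service-to-service communication channel is encrypted and authenticated, with the mesh controlling which services can communicate with each other. This eliminates implicit trust between services running in the same VPC or even the same subnet. Note that AWS has announced end of support for App Mesh on September 30, 2026, so new designs should evaluate alternatives such as Amazon VPC Lattice or Amazon ECS Service Connect.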

Continuous Verification with AWS Native Services

Zero Trust demands continuous monitoring and verification, not just point-in-time authentication checks. AWS CloudTrail records API activity across your organization, creating an audit trail whose integrity can be verified when log file validation is enabled. CloudTrail should be configured as an organization trail that logs management events plus data events for critical services like S3, Lambda, and DynamoDB; data events are not captured by default and must be enabled explicitly. Trail logs should be delivered to a centralized security account with an S3 bucket policy that prevents deletion or modification.
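A bucket policy statement that protects the log archive from tampering could be sketched as follows. The bucket name and the break-glass role ARN are hypothetical, and the sketch deliberately omits the Allow statements that let the CloudTrail service write to the bucket.

```python
import json


def build_log_bucket_policy(bucket: str, break_glass_role_arn: str) -> dict:
    """Deny log deletion and policy changes to everyone but one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyLogTampering",
                "Effect": "Deny",
                "Principal": "*",
                "Action": [
                    "s3:DeleteObject",
                    "s3:DeleteObjectVersion",
                    "s3:PutBucketPolicy",
                    "s3:DeleteBucketPolicy",
                ],
                # Bucket-level and object-level actions need both ARNs.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # Assumption: a single, tightly controlled role is exempt.
                "Condition": {
                    "ArnNotEquals": {"aws:PrincipalArn": break_glass_role_arn}
                },
            }
        ],
    }


# Hypothetical names for illustration.
log_policy = build_log_bucket_policy(
    "example-org-cloudtrail-logs",
    "arn:aws:iam::111122223333:role/SecurityBreakGlass",
)
print(json.dumps(log_policy, indent=2))
```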

Amazon GuardDuty uses machine learning to detect anomalous behavior across CloudTrail logs, VPC Flow Logs, and DNS query logs. It identifies compromised credentials by detecting API calls from unusual locations, reconnaissance activities like port scanning, and data exfiltration patterns. GuardDuty findings should trigger automated remediation through EventBridge rules that invoke Step Functions workflows to isolate compromised resources, rotate credentials, and notify security teams.
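The EventBridge rule that starts such a remediation workflow is driven by an event pattern. The sketch below shows one possible pattern for high-severity findings, together with a small local function that approximates the same match so the routing logic can be unit tested. The severity threshold of 7 and the function name are assumptions of this example, and the function is not EventBridge's own matching engine.

```python
import json

# Event pattern for an EventBridge rule: GuardDuty findings with severity >= 7.
HIGH_SEVERITY_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
    "detail": {"severity": [{"numeric": [">=", 7]}]},
}


def matches_high_severity(event: dict, threshold: float = 7) -> bool:
    """Local approximation of HIGH_SEVERITY_PATTERN for testing purposes."""
    return (
        event.get("source") == "aws.guardduty"
        and event.get("detail-type") == "GuardDuty Finding"
        and event.get("detail", {}).get("severity", 0) >= threshold
    )


# Hypothetical, heavily trimmed finding event.
sample_event = {
    "source": "aws.guardduty",
    "detail-type": "GuardDuty Finding",
    "detail": {"severity": 8, "type": "UnauthorizedAccess:IAMUser/ExampleFinding"},
}
print(json.dumps(HIGH_SEVERITY_PATTERN))
print(matches_high_severity(sample_event))
```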

AWS Config rules continuously evaluate resource configurations against your desired state. Managed and custom Config rules can check Zero Trust policies such as ensuring all EBS volumes are encrypted with customer-managed KMS keys, verifying that no security group allows 0.0.0.0/0 ingress, and confirming that all S3 buckets have public access blocked. Config itself only detects drift; non-compliant resources are corrected by remediation actions that run Systems Manager Automation documents.
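The core of a custom rule that flags open ingress can be sketched as a pure function. The field names used here (ipPermissions, ipv4Ranges, cidrIp, ipv6Ranges, cidrIpv6) are an assumption about the shape of the security group configuration item and should be checked against a real item before use. The Lambda handler and the call that reports results back to Config are omitted.

```python
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def evaluate_security_group(configuration: dict) -> str:
    """Return a compliance type based on whether any rule is open to the world."""
    for permission in configuration.get("ipPermissions", []):
        # Collect every IPv4 and IPv6 range attached to this rule.
        ranges = [r.get("cidrIp") for r in permission.get("ipv4Ranges", [])]
        ranges += [r.get("cidrIpv6") for r in permission.get("ipv6Ranges", [])]
        if OPEN_CIDRS.intersection(ranges):
            return "NON_COMPLIANT"
    return "COMPLIANT"


# Hypothetical configuration items for illustration.
open_group = {
    "ipPermissions": [
        {"fromPort": 22, "toPort": 22, "ipv4Ranges": [{"cidrIp": "0.0.0.0/0"}]}
    ]
}
closed_group = {
    "ipPermissions": [
        {"fromPort": 443, "toPort": 443, "ipv4Ranges": [{"cidrIp": "10.0.0.0/16"}]}
    ]
}
print(evaluate_security_group(open_group))
print(evaluate_security_group(closed_group))
```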

Data Protection and Encryption Strategy

In a Zero Trust model, data protection extends beyond encryption at rest and in transit. AWS Key Management Service (KMS) should be the sole mechanism for managing encryption keys, with key policies that enforce separation of duties: the key administrator can manage key lifecycle but cannot use the key for encryption, and the key user can encrypt and decrypt but cannot modify the key policy. This prevents any single principal from both controlling and using encryption keys.
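A key policy that separates the two duties might be sketched as below. The role ARNs are hypothetical. The sketch also omits the statement that delegates permissions to the account itself, which a production key policy normally needs so the key remains manageable. Note that an administrator who holds kms:PutKeyPolicy could still rewrite the policy, so the separation also relies on reviewing changes to the key policy.

```python
import json

# Lifecycle management only: no cryptographic operations.
ADMIN_ACTIONS = [
    "kms:Create*", "kms:Describe*", "kms:Enable*", "kms:List*",
    "kms:Put*", "kms:Update*", "kms:Revoke*", "kms:Disable*",
    "kms:Get*", "kms:Delete*",
    "kms:ScheduleKeyDeletion", "kms:CancelKeyDeletion",
]

# Cryptographic use only: no policy or lifecycle changes.
USAGE_ACTIONS = [
    "kms:Encrypt", "kms:Decrypt", "kms:ReEncrypt*",
    "kms:GenerateDataKey*", "kms:DescribeKey",
]


def build_key_policy(admin_role_arn: str, user_role_arn: str) -> dict:
    """Give one role key administration and another role key usage."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "KeyAdministration",
                "Effect": "Allow",
                "Principal": {"AWS": admin_role_arn},
                "Action": ADMIN_ACTIONS,
                "Resource": "*",
            },
            {
                "Sid": "KeyUsage",
                "Effect": "Allow",
                "Principal": {"AWS": user_role_arn},
                "Action": USAGE_ACTIONS,
                "Resource": "*",
            },
        ],
    }


# Hypothetical role ARNs for illustration.
key_policy = build_key_policy(
    "arn:aws:iam::111122223333:role/KmsKeyAdmin",
    "arn:aws:iam::111122223333:role/AppEncryptDecrypt",
)
print(json.dumps(key_policy, indent=2))
```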

S3 bucket policies should enforce server-side encryption with aws:kms by denying any PutObject request whose x-amz-server-side-encryption header is missing or set to another value. S3 Object Lock prevents object versions from being deleted or overwritten for a specified retention period, protecting against ransomware and insider threats; compliance mode cannot be bypassed by any user, while governance mode can be overridden by principals holding the s3:BypassGovernanceRetention permission. Amazon Macie should be enabled to continuously scan S3 buckets for sensitive data like PII, credentials, and financial information that may have been stored without proper classification.

  • Use KMS key policies with grants rather than broad IAM policies for encryption key access
  • Enable S3 Block Public Access at the account level, not just the bucket level
  • Require TLS with aws:SecureTransport conditions, and enforce a TLS 1.2 minimum where the service exposes a version key such as s3:TlsVersion
  • Rotate KMS keys annually and use separate keys per environment (dev, staging, production)
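The bucket-level controls described in this section can be combined into one bucket policy, sketched below with a hypothetical bucket name. aws:SecureTransport only reports whether TLS was used at all, so the minimum protocol version is checked separately through the S3-specific s3:TlsVersion key.

```python
import json


def build_data_bucket_policy(bucket: str) -> dict:
    """Deny unencrypted uploads, plaintext transport, and old TLS versions."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    everything = [bucket_arn, f"{bucket_arn}/*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUploadsWithoutKmsEncryption",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                # Also matches when the header is absent from the request.
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {
                "Sid": "DenyPlaintextTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": everything,
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                "Sid": "DenyTlsBelow12",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": everything,
                "Condition": {"NumericLessThan": {"s3:TlsVersion": "1.2"}},
            },
        ],
    }


# Hypothetical bucket name for illustration.
data_bucket_policy = build_data_bucket_policy("example-sensitive-data")
print(json.dumps(data_bucket_policy, indent=2))
```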

Implementing Workload Identity with IRSA and EKS

For organizations running Kubernetes on Amazon EKS, IAM Roles for Service Accounts (IRSA) is the Zero Trust mechanism for granting AWS API access to pods. IRSA uses OpenID Connect federation to map Kubernetes service accounts to IAM roles, eliminating the need for node-level IAM roles that grant identical permissions to every pod on the node. Each pod receives temporary credentials scoped to its specific service account and namespace.

The trust policy on the IAM role should include conditions for both the OIDC provider and the specific service account name, formatted as system:serviceaccount:namespace:service-account-name. This ensures that even if an attacker compromises one pod, they cannot assume roles assigned to other service accounts. Combined with Kubernetes NetworkPolicies that restrict pod-to-pod communication, IRSA creates a defense-in-depth model where both the network and identity layers enforce segmentation.
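A trust policy with both conditions can be sketched as follows. The account ID, OIDC provider path, namespace, and service account name are hypothetical placeholders. The audience condition is an addition of this example that also pins the token to STS.

```python
import json


def build_irsa_trust_policy(
    account_id: str, oidc_provider: str, namespace: str, service_account: str
) -> dict:
    """Trust exactly one Kubernetes service account via OIDC federation."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": (
                        f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                    )
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Subject claim: one namespace, one service account.
                        f"{oidc_provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                        # Audience claim: the token must be intended for STS.
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


# Hypothetical identifiers for illustration.
provider = "oidc.eks.eu-west-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
trust_policy = build_irsa_trust_policy(
    "111122223333", provider, "payments", "invoice-worker"
)
print(json.dumps(trust_policy, indent=2))
```

Using StringEquals rather than StringLike with a wildcard keeps the role bound to a single service account.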

EKS Pod Identity, the newer alternative to IRSA, simplifies the setup by eliminating the need to manage OIDC providers and annotate service accounts. It provides the same security guarantees with less operational overhead. Regardless of which mechanism you choose, the critical principle remains the same: every workload receives only the permissions it needs, authenticated through cryptographic identity rather than network position.
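For comparison, a role used with EKS Pod Identity trusts the EKS Pod Identity service principal instead of an OIDC provider. The following is a minimal sketch of that trust policy. The mapping from the role to a namespace and service account lives in a separate Pod Identity association on the cluster and is not shown here.

```python
import json

# Trust policy for a role assumed through EKS Pod Identity.
POD_IDENTITY_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowEksPodIdentity",
            "Effect": "Allow",
            "Principal": {"Service": "pods.eks.amazonaws.com"},
            # TagSession lets EKS attach cluster, namespace, and service
            # account details to the session as tags.
            "Action": ["sts:AssumeRole", "sts:TagSession"],
        }
    ],
}

print(json.dumps(POD_IDENTITY_TRUST_POLICY, indent=2))
```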

Operationalizing Zero Trust in AWS at Scale

Implementing Zero Trust across a multi-account AWS environment requires organizational scaffolding. AWS Control Tower provides a landing zone with preventive and detective controls, historically called guardrails, that enforce a security baseline. The recommended account structure separates workloads into organizational units: a Security OU for centralized logging and security tooling, a Sandbox OU for experimentation with restrictive SCPs, and Production and Non-Production OUs with environment-specific controls.

AWS Systems Manager Session Manager replaces SSH and RDP access to instances, eliminating the need for bastion hosts and open inbound ports. Sessions are authenticated through IAM, session starts are recorded in CloudTrail, and session activity can be streamed to CloudWatch Logs or S3. Access can be restricted with IAM policies that specify which instances a user can connect to and whether they can start port forwarding sessions. This transforms instance access from a network-based control to an identity-based control, fully aligned with Zero Trust principles.
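An IAM policy that limits Session Manager access to tagged instances might be sketched as follows. The tag key and value are hypothetical. The sketch covers only ssm:StartSession and grants the interactive shell document alone, with the intent that port forwarding stays unavailable because its session document is not listed. Permissions to resume or terminate sessions are omitted.

```python
import json


def build_session_policy(tag_key: str, tag_value: str) -> dict:
    """Allow shell sessions only on instances carrying a specific tag."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "StartSessionOnTaggedInstances",
                "Effect": "Allow",
                "Action": "ssm:StartSession",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringEquals": {f"ssm:resourceTag/{tag_key}": tag_value}
                },
            },
            {
                "Sid": "ShellDocumentOnly",
                "Effect": "Allow",
                "Action": "ssm:StartSession",
                # Only the interactive shell document is granted.
                "Resource": "arn:aws:ssm:*:*:document/SSM-SessionManagerRunShell",
            },
        ],
    }


# Hypothetical tag for illustration.
session_policy = build_session_policy("Team", "payments")
print(json.dumps(session_policy, indent=2))
```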

The journey to Zero Trust in AWS is iterative. Start by enabling CloudTrail and GuardDuty across all accounts, then progressively tighten IAM policies using Access Analyzer findings. Replace long-lived credentials with roles, deploy VPC endpoints for AWS service access, and instrument your applications with X-Ray for distributed tracing. Each step reduces the implicit trust in your environment and moves you closer to a posture where every request is verified, every session is authenticated, and every action is logged.