Workload Identity in Kubernetes

The Identity Challenge in Kubernetes

Kubernetes workloads present a unique identity challenge. Pods are ephemeral, IP addresses are dynamically assigned and recycled, and containers can be rescheduled across nodes at any time. Traditional identity mechanisms that rely on static IP addresses, hostnames, or long-lived credentials are fundamentally incompatible with this dynamic environment. Workload identity provides a cryptographic identity framework that assigns verifiable identities to workloads based on their Kubernetes attributes rather than their network location.

Without workload identity, Kubernetes services typically authenticate to external resources using shared secrets stored in Kubernetes Secrets objects. These secrets are often long-lived, shared across multiple pods, and difficult to rotate without downtime. If a single pod is compromised, the attacker gains access to every resource the shared secret authorizes. Workload identity eliminates this pattern by providing each workload with its own short-lived, automatically rotated credential that is cryptographically bound to the workload’s verified identity.

Kubernetes Service Accounts and Their Limitations

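Kubernetes Service Accounts are the native mechanism for workload identity within the cluster. Every pod runs under a service account. The TokenRequest API, stable since Kubernetes 1.20, issues short-lived, audience-scoped tokens for these accounts, and since Kubernetes 1.24 the cluster no longer auto-generates long-lived Secret-based tokens. By default the kubelet mounts a projected service account token at /var/run/secrets/kubernetes.io/serviceaccount/token; it is a JWT signed by the cluster’s service account issuer. A token for a different audience, such as Vault, is requested through an additional projected volume with its own audience and mount path.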

A projected service account token contains claims that identify the workload:

{
  "aud": ["https://vault.company.com"],
  "exp": 1709316000,
  "iat": 1709312400,
  "iss": "https://kubernetes.default.svc.cluster.local",
  "kubernetes.io": {
    "namespace": "production",
    "pod": {
      "name": "order-service-7b9f8c6d4-x2kfm",
      "uid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
    },
    "serviceaccount": {
      "name": "order-service",
      "uid": "f1e2d3c4-b5a6-7890-abcd-ef0987654321"
    }
  },
  "nbf": 1709312400,
  "sub": "system:serviceaccount:production:order-service"
}

This token has a one-hour lifetime (exp minus iat is 3,600 seconds), is scoped to the Vault audience, and identifies the specific pod, namespace, and service account. The receiving service (Vault) can verify the token by calling the Kubernetes TokenReview API or by validating the JWT signature against the public keys published through the cluster’s OIDC discovery endpoint.
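To make the claim structure concrete, here is a minimal Python sketch that decodes the payload of such a token and pulls out the workload attributes. It is for inspection only: it does not verify the signature, so it must not be used to make trust decisions. The function names are illustrative and not part of any Kubernetes library.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWT segments are unpadded base64url; restore padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def describe_workload(claims: dict) -> dict:
    """Extract the workload attributes carried in the 'kubernetes.io' claim."""
    k8s = claims["kubernetes.io"]
    return {
        "namespace": k8s["namespace"],
        "service_account": k8s["serviceaccount"]["name"],
        "pod": k8s["pod"]["name"],
        "lifetime_seconds": claims["exp"] - claims["iat"],
    }
```

Inside a pod, the token string would typically be read from the projected token file before being passed to decode_jwt_payload.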

However, Kubernetes service account tokens are limited to intra-cluster authentication and OIDC-compatible external services. They do not provide identity for cloud provider APIs, third-party SaaS services, or cross-cluster communication without additional bridging mechanisms.

Cloud Provider Workload Identity Federation

Cloud providers have developed workload identity federation mechanisms that bridge Kubernetes service accounts to cloud IAM identities, eliminating the need for static cloud credentials inside the cluster. Each major cloud provider implements this differently, but the underlying principle is the same: the Kubernetes service account token is exchanged for a cloud-native credential through an OIDC trust relationship.

GKE Workload Identity

Google Kubernetes Engine (GKE) Workload Identity maps Kubernetes service accounts to Google Cloud IAM service accounts. The GKE metadata server intercepts requests to the instance metadata endpoint and returns credentials for the mapped IAM service account. Configuration involves creating the mapping and annotating the Kubernetes service account:

# Create IAM service account
gcloud iam service-accounts create order-service-sa \
    --project=myproject \
    --display-name="Order Service"

# Grant necessary permissions
gcloud projects add-iam-policy-binding myproject \
    --member="serviceAccount:order-service-sa@myproject.iam.gserviceaccount.com" \
    --role="roles/cloudsql.client"

# Bind Kubernetes SA to IAM SA
gcloud iam service-accounts add-iam-policy-binding \
    order-service-sa@myproject.iam.gserviceaccount.com \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:myproject.svc.id.goog[production/order-service]"

# Annotate Kubernetes service account
kubectl annotate serviceaccount order-service \
    --namespace production \
    iam.gke.io/gcp-service-account=order-service-sa@myproject.iam.gserviceaccount.com
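The two strings that tie the mapping together, the IAM member and the annotation value, follow a fixed pattern. The Python sketch below only shows how they are composed from the project, namespace, and account names used in the commands above; the helper names are hypothetical.

```python
def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """IAM member string that identifies a Kubernetes service account
    in the project's workload identity pool (PROJECT.svc.id.goog)."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def gsa_annotation(gsa_name: str, project_id: str) -> dict:
    """Annotation that points the Kubernetes service account at its IAM service account."""
    email = f"{gsa_name}@{project_id}.iam.gserviceaccount.com"
    return {"iam.gke.io/gcp-service-account": email}
```

Generating both values from the same inputs helps avoid the common mistake of a namespace or account name that differs between the IAM binding and the annotation.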

EKS IAM Roles for Service Accounts (IRSA)

Amazon EKS uses IAM Roles for Service Accounts (IRSA), which configures the EKS cluster as an OIDC identity provider in AWS IAM. Pods annotated with the appropriate IAM role receive temporary AWS credentials through the projected service account token, which the AWS SDK exchanges for STS credentials:

# Create IAM role with trust policy for the EKS OIDC provider
cat > trust-policy.json << 'POLICY'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::111122223333:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:production:order-service",
        "oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:aud": "sts.amazonaws.com"
      }
    }
  }]
}
POLICY

aws iam create-role \
    --role-name order-service-role \
    --assume-role-policy-document file://trust-policy.json

# Annotate Kubernetes service account
kubectl annotate serviceaccount order-service \
    --namespace production \
    eks.amazonaws.com/role-arn=arn:aws:iam::111122223333:role/order-service-role

The trust policy's Condition block is critical: it restricts which Kubernetes service accounts can assume the IAM role. Without these conditions, any service account in the cluster could assume the role, defeating the purpose of workload identity.
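Because the condition keys repeat the OIDC provider path, hand-edited trust policies are easy to get subtly wrong. As a sketch, the policy can be generated from its four inputs so that the sub and aud conditions are always present. The function name is an assumption for illustration; the structure mirrors the policy document shown above.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy that lets exactly one Kubernetes service account assume the role.

    oidc_provider is the issuer host and path without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Restrict to one namespace/service account pair.
                    f"{oidc_provider}:sub":
                        f"system:serviceaccount:{namespace}:{service_account}",
                    # Restrict to tokens minted for AWS STS.
                    f"{oidc_provider}:aud": "sts.amazonaws.com",
                }
            },
        }],
    }


if __name__ == "__main__":
    provider = "oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
    print(json.dumps(
        irsa_trust_policy("111122223333", provider, "production", "order-service"),
        indent=2))
```

The printed JSON can be written to trust-policy.json and passed to aws iam create-role as before.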

SPIFFE for Platform-Agnostic Workload Identity

SPIFFE (Secure Production Identity Framework for Everyone) provides a platform-agnostic workload identity standard that works across Kubernetes, VMs, bare metal, and multi-cloud environments. SPIRE, the SPIFFE runtime, issues SVIDs (SPIFFE Verifiable Identity Documents) to workloads after performing attestation, which is the process of verifying that a workload is what it claims to be.

In Kubernetes, the SPIRE agent's Kubernetes workload attestor identifies the pod behind a calling process by querying the local kubelet. It derives selectors such as the pod's service account, namespace, node, and container image, and compares them against registration entries. Only workloads matching a registered entry receive an SVID:

# Register a workload entry in SPIRE
spire-server entry create \
    -spiffeID spiffe://company.com/production/order-service \
    -parentID spiffe://company.com/spire/agent/k8s_psat/production-cluster/node-uuid \
    -selector k8s:ns:production \
    -selector k8s:sa:order-service \
    -selector k8s:container-image:registry.company.com/order-service:v2.3.1 \
    -ttl 3600

# The workload receives an X.509 SVID with the SPIFFE ID as a URI SAN
# Certificate Subject Alternative Name:
# URI:spiffe://company.com/production/order-service

The container image selector is a powerful security feature. Even if an attacker gains the ability to run pods under the order-service service account, those pods receive no SVID unless they run the expected container image. This makes it much harder to deploy a malicious container under a legitimate service account. Because image tags are mutable, the guarantee is only as strong as the registry's integrity; referencing images by digest tightens it.
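The matching rule behind registration entries can be modelled in a few lines: an entry applies only when every one of its selectors appears among the selectors attested for the workload. The Python sketch below is a simplified model of that rule, not SPIRE's implementation, and the type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    spiffe_id: str
    selectors: frozenset  # e.g. {"k8s:ns:production", "k8s:sa:order-service"}


def matching_spiffe_ids(entries, attested_selectors):
    """Return the SPIFFE IDs of entries whose selectors are all attested."""
    attested = frozenset(attested_selectors)
    # An entry matches only if its selector set is a subset of what was attested.
    return [e.spiffe_id for e in entries if e.selectors <= attested]
```

Under this model, a pod with the right namespace and service account but a different image produces a different k8s:container-image selector, so the subset test fails and no SVID is issued for that entry.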

Integrating Workload Identity with Secret Management

Workload identity becomes most powerful when integrated with secret management systems like HashiCorp Vault. Instead of storing database passwords, API keys, and encryption keys in Kubernetes Secrets, workloads authenticate to Vault using their workload identity and receive short-lived, dynamically generated credentials.

  • Vault Kubernetes Auth: The workload presents its Kubernetes service account token to Vault, which validates it against the cluster's TokenReview API. Vault then maps the service account to a Vault policy that grants access to specific secrets.
  • Dynamic database credentials: Instead of storing a static database password, the workload requests a dynamic credential from Vault. Vault creates a temporary database user with the minimum required permissions and a short TTL (typically 1 hour). When the TTL expires, Vault automatically revokes the credential.
  • PKI certificate issuance: Workloads request TLS certificates from Vault's PKI engine using their workload identity. Vault issues certificates with short lifetimes (24 hours) and the workload's SPIFFE ID as a SAN, enabling mTLS communication with other services.
  • Transit encryption: Workloads use Vault's Transit engine for application-level encryption and decryption of sensitive data. The encryption keys never leave Vault, and access to specific keys is controlled by Vault policies mapped to workload identities.
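The first step in the list, Vault Kubernetes authentication, amounts to a single HTTP call: the workload posts its service account token and a role name to the auth method's login endpoint and receives a Vault client token in return. The Python sketch below builds that request with the standard library. It assumes the auth method is mounted at the default path "kubernetes"; the role name passed in is whatever the Vault administrator configured.

```python
import json
import urllib.request


def build_login_request(vault_addr: str, role: str, jwt: str,
                        mount: str = "kubernetes") -> urllib.request.Request:
    """Build the POST request for Vault's Kubernetes auth login endpoint."""
    url = f"{vault_addr.rstrip('/')}/v1/auth/{mount}/login"
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


def client_token(login_response: dict) -> str:
    """Extract the Vault client token from a parsed login response."""
    return login_response["auth"]["client_token"]
```

A caller would send the request with urllib.request.urlopen, parse the JSON body, and pass the resulting token in the X-Vault-Token header on later requests for dynamic credentials, certificates, or Transit operations.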

Security Hardening and Best Practices

Workload identity security depends on the integrity of the underlying platform. If an attacker can create arbitrary pods with arbitrary service accounts, they can obtain any workload identity in the cluster. Several hardening measures are essential to protect the identity issuance chain.

Kubernetes RBAC must restrict who can create pods and assign service accounts. Pod Security Admission (which replaced PodSecurityPolicy, removed in Kubernetes 1.25) should enforce baseline or restricted security profiles that prevent privilege escalation. Admission controllers such as OPA Gatekeeper or Kyverno should validate that pods reference only authorized container images and service accounts.
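The kind of rule such an admission policy enforces can be sketched in plain Python. This is a model of the check only, not Gatekeeper's Rego or Kyverno's policy syntax, and the function name and allow-list shapes are assumptions made for illustration.

```python
def admission_violations(pod: dict, allowed_registries, allowed_service_accounts) -> list:
    """Return reasons to reject a pod manifest; an empty list means admit.

    allowed_service_accounts is a set of (namespace, service_account) pairs.
    """
    violations = []
    spec = pod.get("spec", {})
    namespace = pod.get("metadata", {}).get("namespace", "default")
    sa = spec.get("serviceAccountName", "default")

    if (namespace, sa) not in allowed_service_accounts:
        violations.append(f"service account {namespace}/{sa} is not authorized")

    # Check init containers as well as regular containers.
    for c in spec.get("initContainers", []) + spec.get("containers", []):
        image = c.get("image", "")
        if not any(image.startswith(r.rstrip("/") + "/") for r in allowed_registries):
            violations.append(f"image {image} is not from an allowed registry")
    return violations
```

A production policy would also cover ephemeral containers and would usually pin images by digest rather than by registry prefix alone.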

Node-level security is equally critical. If an attacker compromises a node, they can potentially access the service account tokens of all pods running on that node. Node hardening, runtime security tools like Falco, and node-level network policies help detect and prevent node compromise. In high-security environments, consider dedicating nodes to specific workload sensitivity levels, so that a compromise of a node running low-sensitivity workloads does not expose high-sensitivity workload identities.

Token audience validation is a simple but frequently overlooked hardening measure. When a workload presents its service account token to an external service, the token should include an audience claim specific to that service. The receiving service must validate the audience claim, rejecting tokens not intended for it. This prevents a token issued for Vault authentication from being replayed against a different service that also trusts the cluster's OIDC issuer.
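A receiving service can apply the audience check described above with a few lines of code once the token's signature has been verified. The Python sketch below assumes the claims have already been extracted from a verified token; the function name is hypothetical, and the issuer and expiry checks are included because they normally accompany the audience check.

```python
import time


def validate_claims(claims: dict, expected_audience: str,
                    expected_issuer: str, now: float = None) -> None:
    """Raise ValueError unless the (already signature-verified) claims are acceptable."""
    now = time.time() if now is None else now

    # 'aud' may be a single string or a list of strings.
    aud = claims.get("aud", [])
    audiences = [aud] if isinstance(aud, str) else list(aud)
    if expected_audience not in audiences:
        raise ValueError("token was not issued for this service")

    if claims.get("iss") != expected_issuer:
        raise ValueError("unexpected issuer")
    if now >= claims.get("exp", 0):
        raise ValueError("token has expired")
    if now < claims.get("nbf", 0):
        raise ValueError("token is not yet valid")
```

With this check in place, a token minted for https://vault.company.com is rejected by any other service, even one that trusts the same cluster issuer.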

Workload identity in Kubernetes is the foundation upon which all other Zero Trust controls are built. Without reliable, verifiable workload identity, mTLS certificates cannot be trusted, authorization policies cannot be enforced, and audit logs lack the attribution necessary for forensic analysis. Investing in a robust workload identity infrastructure pays dividends across every aspect of the Zero Trust architecture.