GCP Workload Identity Federation: How Startups Kill Static Keys

Static GCP service account keys are the credential most likely to leak your project. Workload Identity Federation removes them for GKE, CI/CD, AWS, and Azure. How it works in 2026.

Avinash S
June 4, 2026
12 min read
Most guides on Google Cloud service accounts still tell you to download the JSON key, drop it in a secret manager, and rotate it every 90 days. That advice is a decade old and it is now actively wrong. In 2026 the correct number of static service account keys in a startup GCP project is zero.

This post is for founders and engineers running pre-seed and seed startups on Google Cloud who still have at least one credentials.json sitting in a CI variable, a developer laptop, or a Kubernetes Secret. It covers what a service account key actually is, why it is the single credential most likely to leak your entire project, and how Workload Identity Federation removes the need for it across GKE, CI/CD pipelines, and multicloud workloads.

What generic articles get wrong: they treat key rotation as the goal. Rotation is damage control for a credential that should not exist. The real goal is to never hold a long-lived key at all, so there is nothing to leak, rotate, or revoke under pressure at 2 AM. Google has been steering customers this way since 2023, and for new organizations the platform now blocks key creation by default.

The state of GCP service account keys in 2026

Google Cloud now disables service account key creation by default for new customers. If your organization was created on or after May 3, 2024, the organization policy constraint iam.disableServiceAccountKeyCreation is enforced from day one, and any attempt to create a key fails with FAILED_PRECONDITION: Key creation is not allowed on this service account. The behaviour is documented in the organization policy reference for restricting service accounts.

This is not a soft suggestion buried in a best-practices PDF. It is the platform default. Google's own best-practices documentation states plainly that Workload Identity Federation is the preferred way to configure identities for external workloads, because it relies on short-lived credentials instead of long-lived secrets. If you are still building around downloaded keys, you are swimming against the direction the platform is moving.

1. What a service account key actually is, and why it is dangerous

A Google Cloud service account key is an RSA private key wrapped in a JSON file. By default it does not expire. There is no second factor on it, no IP restriction by default, no session lifetime. Anyone who holds the file can authenticate as that service account from any machine on earth and act with its full set of permissions until a human notices and deletes the key.

Compare that to a user password, which at least sits behind multi-factor authentication and conditional access. A service account key has none of that. It is a bearer credential: possession equals identity.

The leak paths are mundane and constant. Keys get committed to git history, printed into CI logs, baked into container images, copied into Slack, left on a stolen laptop, or pasted into a third-party tool during a debugging session. GitHub secret scanning catches some, but only after the key is already public. The blast radius is whatever the service account can do, which at a pre-seed startup is almost always more than it should be, because nobody scoped it down when they were shipping the MVP.

Takeaway: treat any service account JSON key on disk as already compromised. The question is not whether it leaks, but when, and how much it can touch when it does.

2. How Workload Identity Federation actually works

Workload Identity Federation removes the key by removing the need to prove identity with a secret you store. Instead, it trusts an identity the workload already has from an external issuer.

The model has three parts. First, you create a workload identity pool that represents a set of external identities. Second, you add a provider to that pool that trusts a specific issuer: GitHub's OIDC endpoint, an AWS account, an Azure tenant, or any provider that speaks OpenID Connect or SAML 2.0. Third, at runtime the workload presents its native token to Google's Security Token Service, which validates the token against the provider's attribute mapping and attribute condition, then hands back a short-lived federated access token. The Workload Identity Federation documentation lists the supported sources: AWS, Azure, on-premises Active Directory, GitHub, GitLab, workloads using X.509 client certificates, and any OIDC or SAML 2.0 identity provider.

The federated credentials are short-lived. By default the access token expires one hour after it is created, which sharply limits how long a stolen token is useful. Because the trust lives in configuration rather than in a file, there is no secret to rotate or store after the initial setup.
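To make the third step concrete, here is a minimal sketch of the form an OIDC workload sends to the Security Token Service. It only builds the request and never sends it. The project number, pool ID, and provider ID are hypothetical placeholders, and the `subject_token_type` shown applies to OIDC (JWT) tokens; other sources such as AWS or SAML use different values.

```python
STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"


def provider_audience(project_number: str, pool_id: str, provider_id: str) -> str:
    """Full resource name of the pool provider, used as the STS audience."""
    return (
        f"//iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
        f"/providers/{provider_id}"
    )


def build_token_exchange_form(subject_token: str, audience: str) -> dict:
    """Form fields for an OAuth 2.0 token exchange with an OIDC subject token.

    The workload POSTs these fields to STS_TOKEN_URL. Nothing in this form
    is a stored secret: subject_token is the workload's own short-lived token.
    """
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "subject_token": subject_token,
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    audience = provider_audience("123456789012", "ci-pool", "github")
    form = build_token_exchange_form("<workload OIDC token>", audience)
    for field, value in form.items():
        print(f"{field}={value}")
```

In practice the Google auth libraries and the google-github-actions/auth action perform this exchange for you. The sketch is only meant to show that the request carries a token the workload already has, not a key you store.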

Takeaway: the security win is structural. You are not protecting a key better. You are deleting the key and proving identity with a token that expires before most attackers can act on it.

3. Workload Identity Federation for GKE: the most common startup case

If you run Google Kubernetes Engine, this is where you start, because GKE is where most startups accidentally store keys as Kubernetes Secrets. Workload Identity Federation for GKE lets each pod authenticate as its Kubernetes service account, with no JSON key ever entering the cluster.

On a Standard cluster you enable the feature on the cluster and on each node pool; Autopilot clusters have it enabled by default. The GKE metadata server, which runs as a DaemonSet on every node per the GKE Workload Identity concepts page, intercepts the pod's credential request and performs the token exchange transparently.

There are two modes. In the older impersonation mode, you annotate the Kubernetes service account with iam.gke.io/gcp-service-account pointing at a Google service account, and you grant the Kubernetes identity the roles/iam.workloadIdentityUser role on that Google service account so it is allowed to impersonate it. In the newer direct-access mode, you address the Kubernetes service account directly as an IAM principal, which removes the intermediate Google service account and its extra bindings entirely. The how-to guide walks both paths.
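The two modes differ mainly in which identifier you put in the IAM binding. The sketch below builds both strings, plus the annotation used in impersonation mode. All project, namespace, and service account names are hypothetical, and the formats follow the GKE documentation as I understand it, so check them against the how-to guide before binding roles.

```python
def direct_access_principal(
    project_number: str, project_id: str, namespace: str, ksa_name: str
) -> str:
    """IAM principal for a Kubernetes service account in direct-access mode."""
    return (
        f"principal://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{project_id}.svc.id.goog"
        f"/subject/ns/{namespace}/sa/{ksa_name}"
    )


def impersonation_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """IAM member granted roles/iam.workloadIdentityUser in impersonation mode."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def ksa_annotation(gsa_email: str) -> dict:
    """Annotation placed on the Kubernetes service account in impersonation mode."""
    return {"iam.gke.io/gcp-service-account": gsa_email}


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(direct_access_principal("123456789012", "acme-prod", "billing", "invoice-writer"))
    print(impersonation_member("acme-prod", "billing", "invoice-writer"))
    print(ksa_annotation("invoice-writer@acme-prod.iam.gserviceaccount.com"))
```

Direct access needs only the first string, bound to whatever role the pod requires. Impersonation mode needs the second string, the annotation, and a Google service account to sit in between.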

Takeaway: enable Workload Identity Federation on the cluster and node pools, map your Kubernetes service accounts to permissions, then delete every Kubernetes Secret that holds a service account key. A pod that needs BigQuery should get there through its identity, not through a mounted file.

4. Keyless CI/CD with GitHub Actions and GitLab

CI/CD is the most common place a startup leaks a key, because a deploy pipeline needs broad permissions and the path of least resistance is to paste a JSON key into a repository secret. Workload Identity Federation kills that pattern.

GitHub Actions can mint an OIDC token from the issuer https://token.actions.githubusercontent.com that uniquely identifies the repository, workflow, branch, and environment. You configure a workload identity pool provider to trust that issuer, set an attribute condition that pins access to your specific repository, and use the google-github-actions/auth action in the workflow. GitHub's own OIDC configuration guide and Google's keyless authentication announcement both cover the setup end to end.

The token lifetimes are tight: the GitHub OIDC token lives roughly five minutes, and the derived Google credential expires within the hour. The one mistake to avoid is leaving the attribute condition too loose. If you trust the issuer without pinning the repository, any GitHub repository in the world can request your identity. Pin it to your org and repo, and ideally restrict by branch or environment for production deploys.
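A pinned attribute condition is a short CEL expression over the token's claims. The helper below is a sketch that assembles one from an org, a repository, and an optional ref, using the `repository_owner`, `repository`, and `ref` claims that GitHub includes in its OIDC token. The org and repo names are hypothetical, and the helper itself is an illustration, not part of any Google or GitHub tooling.

```python
from typing import Optional


def _cel_string(value: str) -> str:
    """Wrap a value in single quotes, refusing characters that would break out."""
    if "'" in value or "\\" in value:
        raise ValueError(f"unsafe character in condition value: {value!r}")
    return f"'{value}'"


def github_attribute_condition(owner: str, repo: str, ref: Optional[str] = None) -> str:
    """Build a CEL attribute condition pinned to one GitHub repository.

    Pass ref (for example 'refs/heads/main') to restrict production
    deploys to a single branch as well.
    """
    clauses = [
        f"assertion.repository_owner == {_cel_string(owner)}",
        f"assertion.repository == {_cel_string(owner + '/' + repo)}",
    ]
    if ref is not None:
        clauses.append(f"assertion.ref == {_cel_string(ref)}")
    return " && ".join(clauses)


if __name__ == "__main__":
    # Hypothetical org and repo, for illustration only.
    print(github_attribute_condition("acme", "api"))
    print(github_attribute_condition("acme", "api", ref="refs/heads/main"))
```

The resulting string is what you would supply as the provider's attribute condition. A provider with no condition at all is the loose configuration the paragraph above warns against.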

Takeaway: delete the service account key secret from your CI configuration today. It is usually the single highest-value secret a startup stores, because it can deploy.

5. Authenticating AWS and Azure workloads to GCP

Plenty of startups are not single-cloud by choice. A Lambda function writes to BigQuery, an Azure function calls a Vertex AI endpoint, an on-premises job pushes data to Cloud Storage. The old answer was to courier a GCP service account key into the other cloud's secret store. Workload Identity Federation removes the courier.

For AWS, the workload uses its existing IAM role. The federation flow validates a signed AWS GetCallerIdentity request as proof of the role, and you restrict the pool to a specific AWS account and role ARN. For Azure, the workload presents the token from its managed identity, and you restrict by tenant and object ID. Google's guide to configuring Workload Identity Federation with AWS or Azure documents these attribute conditions.
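Restricting by account and role comes down to comparing parts of the caller's assumed-role ARN. This sketch parses such an ARN and checks it against an expected account and role, which is useful for reasoning about what a condition will accept before you deploy it. The account ID and role name are hypothetical. The `aws_role_value` helper mirrors my understanding of what Google's default `attribute.aws_role` mapping produces (the ARN without the session name); treat that as an assumption to verify.

```python
from typing import NamedTuple


class AssumedRole(NamedTuple):
    partition: str
    account_id: str
    role_name: str
    session_name: str


def parse_assumed_role_arn(arn: str) -> AssumedRole:
    """Parse arn:<partition>:sts::<account>:assumed-role/<role>/<session>."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sts":
        raise ValueError(f"not an STS ARN: {arn!r}")
    resource = parts[5].split("/")
    if len(resource) != 3 or resource[0] != "assumed-role":
        raise ValueError(f"not an assumed-role ARN: {arn!r}")
    return AssumedRole(parts[1], parts[4], resource[1], resource[2])


def aws_role_value(arn: str) -> str:
    """The ARN with the session name stripped (assumed default aws_role value)."""
    role = parse_assumed_role_arn(arn)
    return f"arn:{role.partition}:sts::{role.account_id}:assumed-role/{role.role_name}"


def is_trusted(arn: str, account_id: str, role_name: str) -> bool:
    """True only when both the account and the role match exactly."""
    try:
        role = parse_assumed_role_arn(arn)
    except ValueError:
        return False
    return role.account_id == account_id and role.role_name == role_name


if __name__ == "__main__":
    # Hypothetical ARN, for illustration only.
    arn = "arn:aws:sts::123456789012:assumed-role/etl-writer/i-0abc123"
    print(aws_role_value(arn))
    print(is_trusted(arn, "123456789012", "etl-writer"))
```

Matching on the exact role, not just the account, is what keeps every other workload in that AWS account out of your pool.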

No GCP key crosses the cloud boundary in either direction. The AWS or Azure workload keeps using the credential its own platform already manages, and GCP trusts that credential through configuration.

Takeaway: cross-cloud access does not require a key to travel between providers. Map the foreign identity into a pool and scope it tightly to the exact role or managed identity that needs access.

6. Lock the door with organization policy

Migrating your workloads is necessary but not sufficient. An engineer under deadline pressure can create a fresh key in thirty seconds and undo the whole effort. You close that door with organization policy.

Enforce iam.disableServiceAccountKeyCreation at the organization or folder level. Organizations created on or after May 3, 2024 have it enforced already; older organizations must set it explicitly. Pair it with iam.disableServiceAccountKeyUpload so nobody re-introduces an externally generated key. Google also offers a newer managed constraint, iam.managed.disableServiceAccountKeyCreation, which supports conditions and dry-run mode for a staged rollout. Both are covered in the disable and enable service account keys documentation.

Set the policy at the highest scope you can, then grant narrow exceptions on the rare project that genuinely needs a key for a legacy integration. Exceptions should be the documented anomaly, not the default.
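As a rough illustration, the two boolean constraints can be written as policy documents and applied with `gcloud org-policies set-policy <file>`, which accepts JSON or YAML. The sketch below generates those documents in the Organization Policy v2 shape as I understand it. The organization ID is a hypothetical placeholder, and you should confirm the file format against the current documentation before applying anything.

```python
import json

KEY_CONSTRAINTS = (
    "iam.disableServiceAccountKeyCreation",
    "iam.disableServiceAccountKeyUpload",
)


def enforced_boolean_policy(org_id: str, constraint: str) -> dict:
    """Organization-level policy document that enforces a boolean constraint."""
    return {
        "name": f"organizations/{org_id}/policies/{constraint}",
        "spec": {"rules": [{"enforce": True}]},
    }


def keyless_policies(org_id: str) -> list:
    """One policy document per key-related constraint named in this section."""
    return [enforced_boolean_policy(org_id, c) for c in KEY_CONSTRAINTS]


if __name__ == "__main__":
    # Hypothetical organization ID, for illustration only.
    for policy in keyless_policies("123456789012"):
        print(json.dumps(policy, indent=2))
```

Write each document to its own file and apply it at the organization level first. Project-level exceptions are then separate, visible policies rather than silent gaps.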

Takeaway: the migration is not finished until policy makes regression impossible. A keyless project that allows new keys is one rushed pull request away from being a key project again.

7. Finding and killing the keys you already have

You cannot delete what you cannot see, so the migration starts with an inventory, not a deletion. List the keys on every service account with gcloud iam service-accounts keys list, and filter for USER_MANAGED keys. Ignore the SYSTEM_MANAGED keys: those are the ones Google creates and rotates for you, and they are fine.
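If you export the listing with `--format=json`, the filter is a few lines of code. This sketch assumes each entry carries `name` and `keyType` fields, as in the IAM API's key resource. The sample data is invented for illustration.

```python
import json


def user_managed_keys(keys_json: str) -> list:
    """Return only the USER_MANAGED entries from a JSON key listing."""
    keys = json.loads(keys_json)
    return [key for key in keys if key.get("keyType") == "USER_MANAGED"]


def key_id(key: dict) -> str:
    """Extract the trailing key ID from the key's full resource name."""
    return key["name"].rsplit("/", 1)[-1]


if __name__ == "__main__":
    # Invented sample output, for illustration only.
    sample = json.dumps([
        {
            "name": "projects/acme-prod/serviceAccounts/"
                    "deployer@acme-prod.iam.gserviceaccount.com/keys/aaa111",
            "keyType": "USER_MANAGED",
        },
        {
            "name": "projects/acme-prod/serviceAccounts/"
                    "deployer@acme-prod.iam.gserviceaccount.com/keys/bbb222",
            "keyType": "SYSTEM_MANAGED",
        },
    ])
    for key in user_managed_keys(sample):
        print(key_id(key))
```

Run it per service account and collect the results in one place. That list is your migration backlog.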

Before you delete anything, check whether each key is still in use. Activity Analyzer (part of Policy Intelligence) and the service account authentication logs expose the last authentication time for a key. A key that has not authenticated in 90 days is almost certainly safe to remove. A key that authenticated an hour ago is load-bearing, and you need to find the workload first.

Then disable before you delete. Disabling a key is reversible; deletion is not. Disable the key, watch for breakage for a week, and only then delete it. Work in order of blast radius: kill CI keys first, then GKE Secrets, then human-developer keys, which you replace with gcloud auth login for the CLI and gcloud auth application-default login for Application Default Credentials, so engineers stop carrying personal copies.
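The triage rule above is simple enough to encode, which helps when the inventory runs to dozens of keys. This sketch classifies a key by its last authentication time using the 90-day threshold from this section. The action labels are my own naming, and a missing timestamp is treated as "no recorded use", which still deserves the disable-and-watch step rather than a blind delete.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STALE_AFTER = timedelta(days=90)


def triage_key(
    last_auth: Optional[datetime],
    now: datetime,
    stale_after: timedelta = STALE_AFTER,
) -> str:
    """Decide the next step for a key from its last authentication time.

    Returns 'disable-then-watch' for keys with no recorded use or none
    within the threshold, and 'find-workload-first' for keys still in use.
    """
    if last_auth is None or now - last_auth >= stale_after:
        return "disable-then-watch"
    return "find-workload-first"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(triage_key(now - timedelta(days=120), now))
    print(triage_key(now - timedelta(hours=1), now))
    print(triage_key(None, now))
```

Neither label means delete. Deletion only follows a quiet week after the key has been disabled.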

Takeaway: disable then delete, never delete blind. The goal is a clean cutover, not a self-inflicted outage that makes the security team look reckless.

8. Common failure modes and how to debug them

Almost every Workload Identity Federation failure traces back to an attribute-condition mismatch or a missing IAM binding, not to a platform bug. The error messages point at the cause if you read them in that frame.

A token exchange that returns permission denied usually means the incoming token's claims do not satisfy the provider's attribute condition: the repository, branch, role ARN, or audience does not match what you mapped. A GKE pod that cannot authenticate usually means Workload Identity Federation is not enabled on the node pool, the Kubernetes service account annotation has a typo, or the roles/iam.workloadIdentityUser binding is missing. A GitHub Actions workflow that works on the main branch but fails on pull requests usually means the attribute condition is pinned to a single branch. And the error Key creation is not allowed on this service account is not a bug at all: it is the organization policy from section 6 doing its job. Do not disable the policy to make the error go away. Fix the workload to use federation instead.
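When an exchange is rejected, looking at the claims directly is usually faster than guessing. The sketch below decodes a JWT payload without verifying the signature, for debugging only, and reports which expected claims differ. Never use unverified claims to make an access decision, and avoid printing real tokens into CI logs. The expected values are hypothetical.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying it. Debugging use only."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def claim_mismatches(claims: dict, expected: dict) -> dict:
    """Map each mismatched claim name to (expected value, actual value)."""
    return {
        name: (want, claims.get(name))
        for name, want in expected.items()
        if claims.get(name) != want
    }


if __name__ == "__main__":
    # Build an unsigned sample token resembling a pull-request run.
    body = json.dumps({"repository": "acme/api", "ref": "refs/pull/7/merge"})
    segment = base64.urlsafe_b64encode(body.encode()).rstrip(b"=").decode()
    token = f"header.{segment}.signature"

    expected = {"repository": "acme/api", "ref": "refs/heads/main"}
    print(claim_mismatches(decode_jwt_claims(token), expected))
```

In the sample, the repository matches but the ref does not, which is the pull-request failure described above: the condition was pinned to the main branch and the token came from a merge ref.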

Takeaway: when federation fails, read the rejected token's claims and compare them to your attribute conditions line by line. The mismatch is nearly always there.

Summary table

| Workload | Old (key) pattern | Keyless pattern | Key control to set |
|---|---|---|---|
| GKE pods | JSON key in a Kubernetes Secret | WIF for GKE, KSA mapped to IAM | Delete the Secret after cutover |
| GitHub Actions | Key in a repo secret | OIDC token to a scoped pool | Pin attribute condition to repo |
| GitLab CI | Key in a CI variable | OIDC token to a scoped pool | Pin to project and ref |
| AWS workload | GCP key in AWS Secrets Manager | IAM role via GetCallerIdentity | Restrict to account and role ARN |
| Azure workload | GCP key in Key Vault | Managed identity token | Restrict to tenant and object ID |
| Developer laptop | Personal JSON key | gcloud auth login plus ADC | Delete the user-managed key |
| Whole org | Keys allowed by default | Federation everywhere | Enforce key-creation org policy |

What to do at each stage

Pre-seed (under 10 engineers, one GKE cluster, a single CI pipeline): this is one focused day of work. Enable Workload Identity Federation on the cluster, move GitHub Actions to OIDC, delete the keys you find, and turn on the iam.disableServiceAccountKeyCreation org policy. You are small enough that there is no legacy integration to babysit. Do it before you have ten more services.

Seed (10 to 30 engineers, multiple environments, maybe a second cloud): add the multicloud federation for any AWS or Azure workloads, run an Activity Analyzer pass to find keys that survived the first sweep, and pin per-environment attribute conditions so a staging pipeline cannot deploy to production. This is the stage where a forgotten key in a side project becomes the breach.

Series A and beyond: move to the managed constraint with dry-run mode so you can stage policy changes across many projects without breaking a team, enforce conditions per environment, and audit federation configuration the same way you audit IAM roles. At this size the risk is not a single leaked key; it is configuration drift across dozens of projects.

The honest bottom line

Workload Identity Federation is not a nice-to-have for 2026. It is the default the platform now ships, and the keyed alternative is a credential that cannot be made safe, only watched. For a pre-seed startup the entire migration is roughly a day of work, and it removes the single most dangerous credential class you own. That is one of the best security returns on a day you will find anywhere in your cloud setup.

If you want a second set of eyes on your specific GCP setup, I run a free 20-minute cloud audit for founders. Your workloads, your CI, your IAM, and an honest read on where the keys are hiding and what it takes to remove them. Send a note.

Avinash S is the founder of MatrixGard. Fractional DevSecOps for pre-seed and seed startups across India, the GCC, the UK, and the US. Almost a decade of building, breaking, and securing cloud infrastructure on AWS and GCP.


Methodology note. Technical claims reference public Google Cloud IAM, GKE, and Organization Policy documentation, the GitHub Actions OIDC documentation, and the google-github-actions/auth project, all current as of June 2026. Default-enforcement dates and token lifetimes are taken from Google's published documentation; verify them against your own organization's policy state before you act, since defaults differ by organization creation date. Operational sequencing advice (disable before delete, order of migration) is practitioner opinion grounded in production experience.
