Build Cache Poisoning via Untrusted Pull Requests

A critical security flaw exists in self-hosted, bucket-based remote-cache systems.

How Bucket-based Remote Cache Systems Work

A typical remote-cache flow using a storage service includes:

  1. Artifact construction (via bundler, compiler, etc.)
  2. Artifact packaging (Nx or similar tool)
  3. Encryption & hashing of the packaged artifact
  4. Uploading the encrypted artifact to storage (transit)
  5. Storing artifacts until needed (at rest)
  6. Downloading from storage (transit)
  7. Decryption of the packaged artifact (Nx or similar tool)
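
The flow above can be sketched in shell. This is a minimal illustration, not how Nx or any particular tool implements it: a temporary directory stands in for the storage bucket, a plain copy stands in for the bundler, encryption is left out, and every path and variable name is an assumption made for the example.

```shell
#!/bin/sh
# Illustrative only: a temp directory plays the role of the storage bucket.
set -e

work=$(mktemp -d)
bucket="$work/bucket"
mkdir -p "$work/src" "$work/dist" "$work/restore" "$bucket"

# Source file that the cache key is derived from.
printf 'console.log("hello");\n' > "$work/src/main.js"
cache_key=$(sha256sum < "$work/src/main.js" | cut -d' ' -f1)

# Step 1: artifact construction (a copy stands in for the bundler/compiler).
cp "$work/src/main.js" "$work/dist/main.js"

# Steps 2-3: package the output and hash the package (encryption omitted).
tar -cf "$work/artifact.tar" -C "$work/dist" .
artifact_hash=$(sha256sum < "$work/artifact.tar" | cut -d' ' -f1)

# Steps 4-5: "upload" the package and its hash under the input-derived key.
cp "$work/artifact.tar" "$bucket/$cache_key.tar"
printf '%s\n' "$artifact_hash" > "$bucket/$cache_key.sha256"

# Steps 6-7: "download", verify integrity, and unpack.
downloaded_hash=$(sha256sum < "$bucket/$cache_key.tar" | cut -d' ' -f1)
stored_hash=$(cat "$bucket/$cache_key.sha256")
[ "$downloaded_hash" = "$stored_hash" ] || { echo "integrity check failed" >&2; exit 1; }
tar -xf "$bucket/$cache_key.tar" -C "$work/restore"
echo "restored artifact for key $cache_key"
```

The integrity check in steps 6-7 only confirms that the package did not change after step 3. It says nothing about whether step 1 produced the right output.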

Cache Poisoning by Construction (CPoC) Vulnerability

The vulnerability exploits a race condition between the main branch and a pull request. When both have identical source files and attempt to build an application and write to the same remote cache slot, whichever completes first becomes the source of truth, allowing untrusted code to poison the cache used by trusted environments.

This vulnerability occurs at step 1, artifact construction, before any transit or storage security measures take effect. Because poisoning happens during this phase, the malicious artifact is then hashed, encrypted, uploaded, and stored through the same protected mechanisms as a legitimate one.

How CPoC Attacks Work

  1. Step 1: Monitor and Mirror
    The attacker creates an innocent-looking branch from the main branch:
    git fetch origin main
    git checkout main
    git pull
    git checkout -b feature/innocent-looking-update
              
  2. Step 2: Inject Malicious Code
    The attacker modifies the CI script in the PR environment to run the build earlier, then modifies the build step to produce poisoned output:
    # Modified .github/workflows/build.yml
    - name: Patch build tools
      run: ./patch-webpack.sh node_modules/webpack
    - name: Build
      run: npm run build
              
    In this example, the attacker monkey-patches webpack to inject a backdoor during compilation. Other approaches exist (see below).
  3. Step 3: Race to Cache
    The attacker triggers a PR build. The simplified build pipeline finishes in about a minute, while the legitimate build is still validating its checks, so the attacker's poisoned build completes first and writes to the cache.
  4. Step 4: Automatic Deployment
    The main branch build runs. It calculates the source file hash and finds a matching artifact in cache (the attacker's poisoned one). The build system skips building and uses the cached artifact instead. The compromised build artifact gets promoted to production.
  5. Step 5: Cover Tracks
    The attacker erases the evidence and closes the PR.
    # Overwrite the remote branch with a rewritten, clean history
    git push --force origin feature/innocent-looking-update
    
    # Delete the branch entirely
    git push origin --delete feature/innocent-looking-update
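
The race in steps 3 and 4 can be simulated locally. The sketch below assumes a cache that keeps the first artifact written to a slot. `cache_put` and `cache_get` are hypothetical helpers, and a temporary directory stands in for the bucket.

```shell
#!/bin/sh
# Simulation of the race; no real CI system or bucket is involved.
set -e

bucket=$(mktemp -d)
work=$(mktemp -d)

# Hypothetical helper: write an artifact unless the slot is already filled.
cache_put() {
  if [ ! -e "$bucket/$1" ]; then
    cp "$2" "$bucket/$1"
  fi
}

# Hypothetical helper: copy the cached artifact to $2; fails on a cache miss.
cache_get() {
  [ -e "$bucket/$1" ] || return 1
  cp "$bucket/$1" "$2"
}

# Both branches contain identical source, so both compute the same key.
src='export const greet = () => "hello";'
key=$(printf '%s' "$src" | sha256sum | cut -d' ' -f1)

# PR build: finishes first and stores output from a tampered toolchain.
printf '%s\nmalicious_code()\n' "$src" > "$work/pr-build.js"
cache_put "$key" "$work/pr-build.js"

# Main build: same key, so it gets a cache hit and never runs its own build.
if cache_get "$key" "$work/deployed.js"; then
  echo "main: cache hit, build skipped"
else
  printf '%s\n' "$src" > "$work/deployed.js"
  cache_put "$key" "$work/deployed.js"
fi
```

In this simulation `deployed.js` is byte-for-byte the PR build's output, even though the main branch never contained the extra line.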
          

On Mitigation

Input file hashing doesn't prevent this attack.

Your build process looks like this:

The hash covers the inputs. It doesn't control what happens inside the tool. An attacker transforms this into:

Same inputs. Same hash. Poisoned output. Even hermetic tools rely on system binaries, which can be replaced.
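
A small sketch makes the gap concrete. It assumes the cache key is a SHA-256 of the input file. `build_clean` and `build_patched` are hypothetical stand-ins for an untouched and a tampered build tool, not real commands.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
printf 'console.log("hello");\n' > "$work/main.js"

# Cache key: derived from the input file only.
input_hash() { sha256sum < "$1" | cut -d' ' -f1; }

# Hypothetical untouched tool: emits the source as-is.
build_clean() { cat "$1"; }

# Hypothetical tampered tool: same input, extra payload in the output.
build_patched() { cat "$1"; printf 'malicious_code()\n'; }

# Trusted and untrusted environments hash the same input file.
key_trusted=$(input_hash "$work/main.js")
key_attacker=$(input_hash "$work/main.js")

# Hash what each tool actually produced.
out_trusted=$(build_clean "$work/main.js" | sha256sum | cut -d' ' -f1)
out_attacker=$(build_patched "$work/main.js" | sha256sum | cut -d' ' -f1)

[ "$key_trusted" = "$key_attacker" ] && echo "cache keys match"
[ "$out_trusted" != "$out_attacker" ] && echo "outputs differ"
```

Both environments compute the same key, so the cache cannot tell their outputs apart.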

There is another way to do it: concurrently modifying the build output folder.

Most tools write to a staging directory before creating the final artifact. The attacker can run a concurrent process that modifies files after the build but before packaging:

# Build process writes to ./dist
npm run build &

# Meanwhile, wait for the bundle to appear, then append to it before packaging
while [ ! -f ./dist/main.js ]; do sleep 0.1; done
echo "malicious_code()" >> ./dist/main.js
  

Traditional Security Measures Are Ineffective

Traditional security measures are designed to protect artifacts during storage or transmission. In this case, the attack happens earlier, during artifact creation.

The problem is the build tool itself. It's a black box, and there's no way to independently verify whether the output is correct or safe. Whatever the tool produces is implicitly trusted. Traditional security models assume a valid artifact that might get compromised later. But here, the artifact is malicious from the start.

Traditional security protections which do not address CPoC attacks:

Trust Boundary Violations

Why It’s Hard / Impossible to Trace

Tracing the origin of compromised artifacts is extremely difficult:

CPoC Attack Severity

Given the attack mechanism described, an attacker could exploit the cache poisoning vulnerability to perform various malicious actions:

  1. Code Execution: The attacker could inject malicious code that executes when the artifact is used in production environments, potentially gaining remote access to internal systems.
  2. Data Exfiltration: The poisoned artifact could contain code that quietly harvests sensitive information and transmits it to external servers controlled by the attacker.
  3. Lateral Movement: Once deployed to production, the compromised artifact could serve as a foothold to explore and penetrate deeper into the organization's infrastructure.
  4. Backdoor Installation: The attack could establish persistent access mechanisms that remain even after the initial vulnerability is discovered.
  5. Supply Chain Compromise: If the artifacts are distributed beyond the organization (like open-source packages), the attack could affect downstream consumers.
  6. Credential Theft: The poisoned artifact might contain code that harvests authentication credentials from configuration files or environment variables.
  7. Ransomware Deployment: In extreme cases, attackers could deploy ransomware through the compromised artifact, encrypting critical systems.
  8. Competitive Sabotage: A malicious actor could introduce subtle bugs or performance degradations that damage the product's reputation over time.
  9. Time-Delayed Attacks: The poisoned artifact might contain dormant code that activates only under specific conditions or after a certain time period, making detection even more difficult.

This vulnerability is particularly dangerous because it operates like a Trojan horse: the malicious payload rides inside what appears to be a legitimate build artifact, bypassing normal security controls.