Allow‑lists that restrict outbound traffic to specific domains are a common sandbox defense, but they cannot stop data from leaking through those permitted channels. André Graf explains how malicious scripts can encode secrets in DNS queries or HTTP requests to allowed endpoints, and proposes a layered DLP proxy that inspects, decodes, and filters outbound traffic to close this gap.
The Core Argument
A sandbox that enforces a domain allow‑list does stop arbitrary connections, but it does not stop an attacker from slipping secrets out through the very connections you have explicitly permitted. The blind spot is not a flaw in the sandbox itself; it is a limitation of any network‑level policy that reasons only about where traffic goes, not what it carries.
Illustrative Scenarios
- DNS‑based exfiltration – A malicious post‑install script reads ~/.aws/credentials, base‑64‑encodes the content, and issues a DNS lookup such as YWJjZGVmZ2hpamtsbW5vcHFyc3R1di5ldmlsLmV4YW1wbGUuY29t. The sandbox’s policy allows DNS, so the request passes, but the sub‑domain itself is the payload.
- Authorized HTTP endpoint abuse – A build step posts logs to https://allowed‑analytics.example.com/log. By embedding a private SSH key in the request body (often after base‑64 encoding), the script leaks the key to a domain that the policy explicitly trusts.
Both examples demonstrate that authorization of the destination is insufficient; the content of the request can be weaponized.
Recent Supply‑Chain Incidents
- Shai‑Hulud worm (Nov 2025) – Compromised npm packages executed pre‑install code that harvested GitHub, npm, AWS, and SSH credentials, then exfiltrated them to attacker‑controlled repos.
- LiteLLM vulnerabilities – A series of Python‑ecosystem bugs (SQL injection, auth bypass, SSTI) were chained to steal credentials from build environments.
These events show that attackers are already exploiting the very loophole that allow‑lists leave open: using legitimate outbound channels as covert data‑exfiltration pipes.
The Proposed Remedy: An L7 DLP Proxy
Graf’s solution is to route every outbound TCP connection from the sandbox through a local HTTPS proxy that performs deep inspection before forwarding traffic upstream. Use of the proxy is enforced by a seccomp supervisor that permits connections only to 127.0.0.1:8080, so sandboxed code cannot bypass it.
Processing Pipeline
- Policy check – Verify the target domain against the allow‑list.
- DNS entropy analysis – Compute Shannon entropy for each DNS label; flag high‑entropy sub‑domains that resemble encoded secrets.
- Header scan – Inspect all request headers for embedded tokens.
- Body scan – Buffer the request body (size‑limited); recursively decode base64, hex, percent‑encoding, JSON escapes, and HTML entities; decompress gzip/deflate/brotli/zstd; then scan the resulting plaintext for known secret patterns.
- Forward upstream – If the request passes all checks, forward it to the intended service.
- Response scan – Apply the same inspection to inbound responses, catching accidental credential leaks.
The pipeline produces a single decision (allow or block) and, when operating in monitor mode, can optionally log a warning (X‑Canister‑DLP‑Warning).
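The ordering of these checks can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not Canister's code: the names `inspect_request`, `Decision`, `ALLOWED_DOMAINS`, and `SECRET_RE` are hypothetical, a single simplified GitHub-token regex stands in for the full detector set, and the DNS, decoding, and response stages are left out. Reading monitor mode as "warn but still forward" is also an assumption.

```python
import re
from dataclasses import dataclass, field

# Hypothetical allow-list and a simplified stand-in detector.
ALLOWED_DOMAINS = {"allowed-analytics.example.com"}
SECRET_RE = re.compile(r"ghp_[A-Za-z0-9]{36}")


@dataclass
class Decision:
    allow: bool
    warnings: list = field(default_factory=list)


def inspect_request(host, headers, body, monitor=False):
    """Run the checks in order and return one allow/block decision."""
    # 1. Policy check: destinations off the allow-list are always blocked.
    if host not in ALLOWED_DOMAINS:
        return Decision(allow=False, warnings=["domain not allow-listed"])

    findings = []
    # 2. Header scan.
    for name, value in headers.items():
        if SECRET_RE.search(value):
            findings.append(f"secret pattern in header {name}")
    # 3. Body scan (decoding and decompression omitted in this sketch).
    if SECRET_RE.search(body):
        findings.append("secret pattern in body")

    if not findings:
        return Decision(allow=True)
    # Assumption: monitor mode records the finding but lets the request through.
    return Decision(allow=monitor, warnings=findings)
```

Keeping every stage behind one function that returns one `Decision` makes the allow/block outcome easy to audit, whichever stage raised the finding.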
Detection Capabilities
| Category | Example Patterns |
|---|---|
| Cloud provider keys | AKIA…, GOOG… |
| VCS tokens | GitHub ghp_…, GitLab glpat‑… |
| Registry credentials | npm //registry.npmjs.org/:_authToken=… |
| API keys | OpenAI, Stripe, Slack |
| Database URIs | postgres://user:pass@host/db |
| SSH private keys | PEM‑encoded RSA/Ed25519/ECDSA |
| High‑entropy strings | JWTs, generic bearer tokens |
| Canary tokens | Pre‑seeded fake secrets for verification |
Detectors are driven from a single registry, making additions a matter of inserting one entry rather than updating multiple lists.
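A single-registry design might look like the following Python sketch. The `Detector` type, the `REGISTRY` list, and the `scan` helper are hypothetical names, and the regular expressions are simplified approximations of the patterns in the table rather than the project's actual rules.

```python
import re
from typing import NamedTuple


class Detector(NamedTuple):
    name: str
    category: str
    pattern: re.Pattern


# The only place a detector is defined; adding one means adding one entry.
# Patterns are simplified approximations for illustration.
REGISTRY = [
    Detector("aws_access_key_id", "Cloud provider keys",
             re.compile(r"AKIA[0-9A-Z]{16}")),
    Detector("github_pat", "VCS tokens",
             re.compile(r"ghp_[A-Za-z0-9]{36}")),
    Detector("postgres_uri", "Database URIs",
             re.compile(r"postgres://[^\s:/@]+:[^\s@]+@[^\s/]+")),
    Detector("pem_private_key", "SSH private keys",
             re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")),
]


def scan(text):
    """Return (detector name, matched text) for every registry hit."""
    return [(d.name, m.group(0))
            for d in REGISTRY
            for m in d.pattern.finditer(text)]
```

Because header, body, and response scans would all call the same `scan` function, a new entry takes effect everywhere at once.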
Evasion‑Resistance Techniques
- Recursive decoding – Up to 32 layers of base64, hex, percent‑encoding, JSON \uXXXX, and HTML entities are peeled back.
- Compression handling – Automatic decompression based on Content‑Encoding or magic‑byte sniffing.
- Chunked transfer overlap – A 256‑byte sliding window bridges chunk boundaries, ensuring secrets split across chunks are still detected.
- DNS entropy budgeting – Tracks cumulative high‑entropy bytes per sandbox session; a single anomalous query is tolerated, but a flood triggers a block.
- Fragment‑aware parsing – Extracts values from JSON, XML, multipart forms, etc., before scanning.
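Recursive decoding, the first technique in the list above, can be illustrated with a short Python sketch. It handles only three of the listed encodings (percent‑encoding, hex, base64) and treats the whole payload as one encoded blob; the function names and the stand-in `SECRET_RE` detector are hypothetical. The 32-layer limit is the figure quoted above.

```python
import base64
import binascii
import re
from urllib.parse import unquote

MAX_LAYERS = 32  # depth limit quoted in the text
SECRET_RE = re.compile(r"ghp_[A-Za-z0-9]{36}")  # simplified stand-in detector
HEX_RE = re.compile(r"(?:[0-9a-fA-F]{2})+")
B64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def _try_decode(decoder, candidate):
    """Apply a bytes decoder; return UTF-8 text, or None if it does not fit."""
    try:
        return decoder(candidate).decode("utf-8")
    except (binascii.Error, ValueError):
        return None


def peel_one_layer(text):
    """Return text with one encoding layer removed, or None if none applies."""
    if "%" in text:
        decoded = unquote(text)
        if decoded != text:
            return decoded
    candidate = text.strip()
    if HEX_RE.fullmatch(candidate):
        decoded = _try_decode(bytes.fromhex, candidate)
        if decoded is not None:
            return decoded
    if len(candidate) % 4 == 0 and B64_RE.fullmatch(candidate):
        return _try_decode(lambda s: base64.b64decode(s, validate=True), candidate)
    return None


def find_secrets(payload):
    """Scan the raw payload and every decoded layer beneath it."""
    hits, text = [], payload
    for _ in range(MAX_LAYERS + 1):  # raw payload plus up to 32 decoded layers
        hits.extend(SECRET_RE.findall(text))
        text = peel_one_layer(text)
        if text is None:
            break
    return hits
```

Scanning at every layer, rather than only at the innermost one, means a secret is caught regardless of how many wrappers the attacker chose, and the depth cap bounds the work an adversarial payload can force.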
What the DLP Layer Actually Provides
- Second‑line defense – It does not replace code review or dependency vetting; it catches leaks that slip past those primary controls.
- Content‑aware enforcement – Unlike a pure allow‑list, it asks what is being sent, not just where.
- Redaction – Matched secrets are redacted in logs (ghp_•••••a3f2b7c9), preventing the proxy from becoming a new source of leakage.
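The redaction format shown in the last item can be approximated as follows. This is a sketch: keeping a 4-character prefix and an 8-character suffix is inferred from the shape of the example above and is an assumption, as are the names `redact`, `redact_log_line`, and the stand-in `SECRET_RE` pattern.

```python
import re

SECRET_RE = re.compile(r"ghp_[A-Za-z0-9]{36}")  # simplified stand-in detector
MASK = "\u2022" * 5  # five bullet characters, as in the example above


def redact(secret, keep_prefix=4, keep_suffix=8):
    """Keep a short prefix and suffix so log entries can still be correlated."""
    if len(secret) <= keep_prefix + keep_suffix:
        return MASK  # too short to reveal any part safely
    return secret[:keep_prefix] + MASK + secret[-keep_suffix:]


def redact_log_line(line):
    """Replace every detected secret in a log line with its redacted form."""
    return SECRET_RE.sub(lambda m: redact(m.group(0)), line)
```

Applying `redact_log_line` at the single point where log records are written is one way to make sure no code path logs a raw match.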
Limitations
- Pattern‑matching can never be exhaustive; novel encodings may evade detection.
- High‑entropy thresholds inevitably balance false positives against false negatives.
- The system is still under active development; the detector registry and thresholds evolve as new evasion tricks appear.
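The threshold trade-off is easier to see with numbers. The sketch below computes Shannon entropy per DNS label and charges high-entropy labels against a per-session budget, in the spirit of the entropy budgeting described earlier. The class name `EntropyBudget`, the 3.5 bits-per-character threshold, and the 128-byte budget are illustrative assumptions, not the project's values.

```python
import math
from collections import Counter


def shannon_entropy(label):
    """Shannon entropy of a string, in bits per character."""
    if not label:
        return 0.0
    total = len(label)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(label).values())


class EntropyBudget:
    """Per-session allowance of high-entropy DNS label bytes (illustrative)."""

    def __init__(self, threshold_bits=3.5, budget_bytes=128):
        self.threshold_bits = threshold_bits
        self.remaining = budget_bytes

    def allow_query(self, hostname):
        """Charge high-entropy labels to the budget; False once it is spent."""
        for label in hostname.rstrip(".").split("."):
            if shannon_entropy(label) >= self.threshold_bits:
                self.remaining -= len(label)
        return self.remaining >= 0
```

Lowering the threshold catches more encoded payloads but starts charging ordinary hostnames such as CDN hashes to the budget; raising it does the opposite. The budget absorbs occasional noisy lookups while still cutting off a sustained drain.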
Design Takeaways for Any Sandbox
- Decode before you scan, and do it recursively.
- Maintain overlap windows across packet or chunk boundaries.
- Treat DNS as a potential exfiltration vector, not merely a resolver.
- Budget entropy per session to differentiate noise from a deliberate data‑drain.
- Centralise detector definitions in a single source of truth.
- Redact secrets everywhere they appear in logs.
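The overlap-window takeaway can be sketched as a streaming scanner that carries the tail of the stream into each new chunk. This is a minimal illustration, assuming a single stand-in detector and the 256‑byte window mentioned earlier; `scan_chunks` and `SECRET_RE` are hypothetical names.

```python
import re

OVERLAP = 256  # window size quoted in the text
SECRET_RE = re.compile(rb"ghp_[A-Za-z0-9]{36}")  # simplified stand-in detector


def scan_chunks(chunks):
    """Yield secrets found in a chunk stream, including ones split across chunks."""
    tail = b""
    offset = 0          # absolute stream position where `tail` starts
    reported_upto = 0   # absolute position just past the last reported match
    for chunk in chunks:
        window = tail + chunk
        for m in SECRET_RE.finditer(window):
            # Skip matches already reported from the previous window.
            if offset + m.start() >= reported_upto:
                reported_upto = offset + m.end()
                yield m.group(0)
        # Keep only the last OVERLAP bytes for the next round.
        offset += max(len(window) - OVERLAP, 0)
        tail = window[-OVERLAP:]
```

The window only needs to be at least as long as the longest secret a detector can match; anything shorter would let a long secret straddle the boundary undetected.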
Closing Thoughts
Network allow‑lists are a valuable first barrier, but they leave a critical gap: they cannot see what traverses the allowed channels. By inserting an L7 DLP proxy that inspects, decodes, and filters outbound traffic, you add a complementary line of defense that addresses the content‑level threat. The implementation described here—Canister, a lightweight Rust sandbox with an integrated DLP proxy—demonstrates a practical path forward, while openly acknowledging that the battle against covert exfiltration is ongoing and requires continual refinement.
For the full technical details, see the Canister repository and the accompanying DLP design document.