The Shadow-IT Reality of Generative AI in Engineering Teams
Network-level blocking fails because the engineer's incentive to ship outweighs the security team's incentive to gate. This piece examines the actual data path of an LLM paste, the categories of leak it creates, and the architectural property a containment tool needs to have to make the trade survivable.
Every engineering organisation of any size has, in 2026, three categories of LLM user: the ones who use a sanctioned account, the ones who use a personal account, and the ones who do both depending on what they are pasting. The third group is the largest. The security team knows. The CISO knows. Knowing is not the same as solving.
The instinct, when faced with engineers routinely pasting production code, schema definitions, customer data, and the occasional API key into a public LLM frontend, is to block. Block the domain at the firewall. Block the extension. Issue a policy. Discipline the violators. This piece is not about whether that instinct is correct. It is about why it does not work, what the actual data path looks like, and what architectural property a containment tool needs to have to make the trade between developer velocity and data risk something other than a binary loss.
1. The developer efficiency trap
Network-level LLM blocking fails for the same structural reason corporate web filters always have: the engineer's tolerance for friction is calibrated against the cost of not shipping. A query that would take twenty minutes to answer from documentation takes ninety seconds in Claude. Multiply that saving by the number of times per day an engineer hits something they don't know, and the effort of routing around a block pays for itself almost immediately.
The route-around taxonomy is well documented at this point. Personal laptops on the corporate guest network. Mobile hotspots. SSH-tunnelled SOCKS proxies. Personal devices entirely. Browser extensions that wrap LLM access in some sanctioned-looking interface. Set the policy aside and ask any senior engineer in confidence which of these they have used in the last quarter; the answer is "at least two".
This is not a failure of policy. It is a structural property of the labour market. Engineers compete for jobs partially on output velocity. Output velocity in 2026 is partially mediated by LLM access. Refusing to provide that access at work guarantees that engineers will provide it for themselves, in ways the security team cannot observe.
The conclusion most teams arrive at — eventually — is that the policy needs to assume LLM use will happen and aim at containing where the resulting data lives, rather than preventing the use from occurring. This is the same shift that happened with cloud storage in 2014, when "block Dropbox" became "sanction Box" because the alternative was Dropbox-on-personal-devices forever.
2. The anatomy of a leak
Pretend you are reading this from the security team's perspective. An engineer pastes a function into Claude to debug a race condition. The function references three internal services by name, instantiates a database client with a connection string template, and includes a comment that names the customer the bug was reported against. The paste happens. What just left the building?
The immediate leak surface is obvious: the function body, the service names, the connection string template, the customer name. That data is now in Anthropic's storage under the engineer's account, subject to Anthropic's data retention and access policies, available to anyone who compromises that account, and — depending on the account tier — potentially eligible for use in training future model versions unless an opt-out is configured.
The less-obvious leak surface is broader:
- The system clipboard, where the paste content sits until the engineer copies something else — minutes or hours later. Any background process with clipboard read access (and there are many on a typical engineering laptop) sees it.
- The browser's autocomplete and form-history stores, where partial pastes are retained as suggestions for future form fields.
- Any installed browser extension with broad host_permissions. This includes, critically, extensions that claim to enhance LLM workflows but in fact read every page. These extensions are the highest-risk category on this list because engineers grant them permission specifically because they want them inside the LLM frontend.
- Any cloud-synced clipboard manager. AnyClip, Paste, 1Clipboard, the macOS Universal Clipboard. Once the data is on the clipboard it is in the sync mesh.
- Screenshot tools that auto-OCR. CleanShot X, Shottr, Apple's built-in. The engineer's screenshot of "here's the chat where Claude fixed it" includes the function body as searchable text.
- Terminal scrollback, if any of the paste happened through a terminal. Tmux session state. iTerm2 history. Shell history files.
- Slack, when the engineer pastes the Claude conversation summary to a teammate. Notion, when they paste it into a "things I learned this week" page. Linear, when they paste it into a ticket comment. All three of those are SaaS data processors with their own retention and breach surfaces.
- The third-party "AI productivity tool" that mirrors Claude conversations to a cloud workspace. This is the category the security team is least prepared for: a tool sold as a productivity enhancer that, by design, exfiltrates the entire conversation corpus to a vendor's database.
The compounding observation is that a single paste expands into ten to fifteen secondary copies within minutes. Each secondary copy lives in a separate vendor's storage, under a separate access-control regime, subject to a separate breach probability. The single auth token paste that became a security incident in $YEAR_AGO at $WELL_KNOWN_COMPANY was discovered not in the LLM frontend but in a Slack message a teammate pasted into a public channel. That is the normal shape of an LLM exfiltration incident.
The other compounding observation is that engineers know this. They will go to substantial lengths to avoid Slack-pasting an LLM output that contains a customer name. They will go to far less effort to avoid the cloud-synced clipboard manager and the AI productivity tool, because those happen automatically.
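One way to make the compounding concrete is a back-of-envelope calculation. It assumes, as a simplification this piece does not otherwise make, that each of the n copies is exposed independently with its own probability p_i over some fixed period:

```latex
P(\text{at least one copy exposed}) = 1 - \prod_{i=1}^{n} \left(1 - p_i\right)
```

With purely illustrative numbers (n = 12 copies, each with p_i = 0.02), this gives 1 - 0.98^{12} ≈ 0.22, against 0.02 for the single copy held by the LLM provider. The figures are invented for the arithmetic only. The point is the shape: exposure grows with the number of copies, and the copies nobody chose to make contribute as much as the ones somebody did.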
3. Local-first containment
The architectural shape that follows from the above is narrow. A containment tool for shadow-IT LLM use needs to do exactly two things and decline to do a third.
It needs to capture the conversation at the source — at the LLM frontend, before the engineer reaches for the clipboard. The capture surface is the only point in the data path where the conversation exists as a single, addressable object. Once the engineer copies a portion to the clipboard, the capture object splinters into the secondary copy surface described above.
It needs to keep the captured artefact on the engineer's own device, in a format the engineer (and the engineer's security review process) can audit, search, and version. The artefact has to be plaintext for the engineer to use it; the artefact's index can be encrypted to resist passive read of the local filesystem.
It needs to not be a vendor in the data path itself. This is the property most "AI productivity tools" silently violate. A browser extension that captures the LLM conversation and uploads it to a vendor's cloud database to make it "searchable across the team" has solved the secondary-copy problem by becoming the secondary copy. The CISO who sanctions that extension has not reduced the threat surface — they have replaced one vendor (the LLM provider) with two (LLM provider + extension vendor), with the second vendor now holding the full conversation corpus indexed for keyword search. That is not a containment tool; it is a managed exfiltration channel.
The architectural test for a containment tool is, then, a manifest-level test:
- Does the tool request host_permissions beyond the LLM frontend itself? If yes, it can read every page the engineer visits. Not a containment tool.
- Does the tool make any outbound network call other than the one to the LLM frontend the engineer was already using? If yes, the conversation is leaving the device through a second channel. Not a containment tool.
- Does the tool ship a native messaging host or other native binary that runs outside the browser sandbox? If yes, the trust surface is much larger than a browser extension and needs its own threat model. Defer.
- Does the tool's privacy policy promise things the manifest cannot deliver? If yes, the engineer's CISO should treat the policy as marketing and the manifest as ground truth.
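The manifest-readable parts of the checklist above (the first and third questions) can be sketched as a small audit function. This is a minimal sketch, not a complete reviewer: it assumes a Manifest V3 extension, it treats `claude.ai` as the sanctioned frontend purely for illustration, and the `auditManifest` and `ALLOWED_HOSTS` names are hypothetical. It only inspects `host_permissions`, `optional_host_permissions`, `content_scripts[].matches`, and the `nativeMessaging` permission. Other permissions that widen access would need their own checks.

```typescript
// Subset of a Manifest V3 manifest.json that this sketch inspects.
interface ExtensionManifest {
  permissions?: string[];
  host_permissions?: string[];
  optional_host_permissions?: string[];
  content_scripts?: { matches: string[] }[];
}

interface AuditFinding {
  question: string;
  detail: string;
}

// Assumption for illustration: the one LLM frontend the team has sanctioned.
const ALLOWED_HOSTS = ["claude.ai"];

// Pull the host out of a match pattern such as "https://*.example.com/*".
// Returns null for patterns without a host part, e.g. "<all_urls>".
function hostOfPattern(pattern: string): string | null {
  const m = /^(\*|https?|wss?|ftp|file):\/\/([^/]*)\//.exec(pattern);
  return m ? m[2] : null;
}

// A pattern passes only if its host is an allowed host or "*.<allowed host>".
function isAllowedPattern(pattern: string, allowed: string[]): boolean {
  const host = hostOfPattern(pattern);
  if (host === null) return false;
  return allowed.some((a) => host === a || host === "*." + a);
}

function auditManifest(
  manifest: ExtensionManifest,
  allowed: string[] = ALLOWED_HOSTS
): AuditFinding[] {
  const findings: AuditFinding[] = [];

  // Question 1: host access beyond the LLM frontend itself.
  const patterns: string[] = [
    ...(manifest.host_permissions ?? []),
    ...(manifest.optional_host_permissions ?? []),
  ];
  for (const script of manifest.content_scripts ?? []) {
    patterns.push(...script.matches);
  }
  for (const pattern of patterns) {
    if (!isAllowedPattern(pattern, allowed)) {
      findings.push({ question: "host access beyond the LLM frontend", detail: pattern });
    }
  }

  // Question 3: a native messaging host runs outside the browser sandbox.
  if ((manifest.permissions ?? []).includes("nativeMessaging")) {
    findings.push({ question: "native messaging host", detail: "nativeMessaging" });
  }
  return findings;
}

// Usage: a narrowly scoped manifest versus a broad one.
const narrow: ExtensionManifest = {
  permissions: ["storage"],
  host_permissions: ["https://claude.ai/*"],
};
const broad: ExtensionManifest = {
  permissions: ["storage", "nativeMessaging"],
  host_permissions: ["<all_urls>"],
};
console.log(auditManifest(narrow).length); // no findings
console.log(auditManifest(broad));         // one finding per violated question
```

An empty result is a necessary condition, not a sufficient one: the manifest says what the extension may touch, not what it sends.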
Three of those four questions (the first, third, and fourth) are answerable by reading the extension's manifest.json, the last of them side by side with the privacy policy. The second is answerable by watching the network panel of Chrome DevTools while the extension is active. This is the property the security team needs: the architectural claim is independently verifiable in less than ten minutes by anyone with Chrome and a text editor.
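The outbound-network question can be approximated by comparing two network captures of the same LLM session, one with the extension disabled and one with it enabled. The sketch below assumes both captures were exported from DevTools as HAR files; the `HarFile` type covers only the fields used here, and `hostsAddedByExtension` is a hypothetical helper name. Requests made by an extension's background service worker generally show up in that worker's own DevTools window rather than the page's, so a capture taken from the page alone may be incomplete.

```typescript
// Only the parts of the HAR format this sketch reads.
interface HarFile {
  log: { entries: { request: { url: string } }[] };
}

// Collect the distinct hostnames contacted in one capture.
function hostsIn(har: HarFile): Set<string> {
  const hosts = new Set<string>();
  for (const entry of har.log.entries) {
    let hostname: string;
    try {
      hostname = new URL(entry.request.url).hostname;
    } catch {
      continue; // unparseable URL; a stricter audit would surface these
    }
    // data: and blob: URLs have no hostname and never leave the device.
    if (hostname !== "") hosts.add(hostname);
  }
  return hosts;
}

// Hosts contacted with the extension enabled that the baseline never touched.
function hostsAddedByExtension(baseline: HarFile, withExtension: HarFile): string[] {
  const before = hostsIn(baseline);
  const after = hostsIn(withExtension);
  return Array.from(after).filter((h) => !before.has(h)).sort();
}

// Usage with made-up captures; "sync.vendor.example" is a placeholder host.
const baselineCapture: HarFile = {
  log: { entries: [{ request: { url: "https://claude.ai/chat" } }] },
};
const extensionCapture: HarFile = {
  log: {
    entries: [
      { request: { url: "https://claude.ai/chat" } },
      { request: { url: "https://sync.vendor.example/upload" } },
    ],
  },
};
console.log(hostsAddedByExtension(baselineCapture, extensionCapture));
```

A non-empty result is a second channel worth explaining. An empty result does not prove there is none: the check cannot see an upload that goes to a host the frontend already contacts, and it covers only the session that was captured.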
A containment tool whose security argument requires you to trust the vendor has not contained anything. It has rebranded the trust.
This piece is not specifically about Cairn. The constraints above describe a category of tool, not a product, and there is more than one way to build inside them. We have built Cairn this way because the constraints describe the only architecture we think makes the developer-velocity-versus-data-risk trade survivable at the team scale. If a security-team reader of this piece evaluates Cairn against the four questions and finds a divergence between the manifest and the claim, the right action is to email trace@blacktrace.co with the specific divergence and we will either fix the product or correct the disclosure. We have a published security & architecture disclosure and a published privacy policy, both of which are designed to be cross-checkable against the actual manifest.
The broader category of tool is what the industry needs. The shadow-IT problem with LLMs is not going away, and policy alone will not solve it. The architectural answer is local-first capture without a vendor in the data path. Anything else just shifts the leak.