On the latest All In episode, Chamath dropped a line that got my attention: “On-prem is the new cloud.”
His argument is straightforward. When enterprises use cloud-based AI tools, their prompts, strategy documents, and sensitive data flow through third-party infrastructure. He referenced a ruling where a judge confirmed that data shared in cloud environments can lose attorney-client privilege. The implication: if your AI queries touch public endpoints, your data isn’t really yours anymore.
The other hosts piled on. Jason shared that his AI agents run $300/day in token costs — roughly $100K/year per agent. David Sacks pointed to the tension between needing AI to stay competitive and the security risk of using it.
Their conclusion? Enterprises will be forced back to on-prem AI infrastructure. Pay more, but keep control.
I work with companies solving this exact problem every day. And I think the All In guys are half right — the problem is real, but the answer isn’t necessarily racking servers in your data center again.
The Problem Is Real
Let’s be specific about what’s at risk.
When you send a prompt to a public AI API, you’re transmitting:
- The prompt itself (which often contains proprietary context)
- Documents you attach for analysis
- The response, which may be logged or used for model improvement
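To make that concrete, here's roughly what a single call to a public endpoint transmits. This is a minimal sketch using the OpenAI Python SDK; the document, prompt, and filenames are invented for illustration:

```python
# Sketch: what a typical public-API call carries. The SDK usage is real;
# the document and prompts are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("q3_acquisition_strategy.txt") as f:  # hypothetical internal doc
    strategy_doc = f.read()

# Everything below -- the system prompt, the attached document, and the
# model's response -- transits third-party infrastructure.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are an M&A analyst."},
        {"role": "user", "content": f"Summarize the risks in:\n\n{strategy_doc}"},
    ],
)
print(response.choices[0].message.content)  # may be logged server-side
```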
For regulated industries — finance, healthcare, legal, defense — this is a non-starter. But even for tech companies, sending your product roadmap or M&A strategy through a third-party API should make your CISO uncomfortable.
Chamath’s right that this creates a paradox: you need AI to compete, but using it naively compromises your data.
“On-Prem” Isn’t the Only Answer
The All In crew framed this as a binary: public cloud AI (risky) vs. on-prem (expensive but secure). But there’s a middle path that most enterprises should consider first.
Azure OpenAI: Your Data Stays Yours
Azure OpenAI, now part of Azure AI Foundry, is fundamentally different from hitting public AI APIs directly:
- Your prompts and completions are never used for model training. Full stop. Microsoft’s data processing terms are explicit about this.
- Your data never leaves your tenant. It doesn’t get shared with model providers or other customers.
- It’s not just OpenAI. The model catalog includes GPT-4, GPT-4o, Claude (Anthropic), Kimi K2.5 (Moonshot AI), DeepSeek, Meta Llama, Cohere, and hundreds more. Pick the best model for the job — they all deploy behind the same security boundary.
- Compliance built in. SOC 2, HIPAA, FedRAMP, GDPR — the certifications and regulatory frameworks enterprises actually need.
This alone solves the core concern Chamath raised. You get the full frontier model landscape without the data leakage risk.
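And the client code barely changes. Here's a minimal sketch using the OpenAI Python SDK's `AzureOpenAI` client; the endpoint, deployment name, and API version are placeholders for your own resources:

```python
# Sketch: the same kind of call, routed through your own Azure tenant.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://aoai-prod.openai.azure.com",  # your resource
    api_key="<your-key>",  # or keyless Entra ID auth -- sketched later
    api_version="2024-06-01",
)

# Same request shape as the public API, but prompts and completions stay
# inside your tenant and are never used for model training.
response = client.chat.completions.create(
    model="gpt-4o-prod",  # your *deployment* name, not the raw model name
    messages=[{"role": "user", "content": "Summarize our Q3 strategy doc."}],
)
print(response.choices[0].message.content)
```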
Private Endpoints: Network-Level Isolation
For enterprises that need more than contractual guarantees, Azure Private Link lets you:
- Assign a private IP from your VNet to your Azure OpenAI resource
- Route all traffic over the Microsoft backbone — never touching the public internet
- Disable public access entirely
- Connect from on-prem via VPN or ExpressRoute
Your AI traffic looks like any other internal service call. Same network security model you already have for databases and storage.
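Here's a sketch of what wiring that up looks like with the Azure SDK for Python (azure-mgmt-network). All resource names and IDs are placeholders, and a real deployment also needs a privatelink.openai.azure.com private DNS zone so the resource's hostname resolves to the private IP:

```python
# Sketch: create a private endpoint for an Azure OpenAI resource.
# Subscription, resource group, VNet, and account names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

SUBSCRIPTION_ID = "<subscription-id>"
RG = "rg-ai-prod"  # hypothetical resource group
SUBNET_ID = (      # the subnet that will host the private IP
    f"/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/{RG}"
    "/providers/Microsoft.Network/virtualNetworks/vnet-ai/subnets/snet-endpoints"
)
AOAI_ID = (        # the Azure OpenAI (Cognitive Services) account
    f"/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/{RG}"
    "/providers/Microsoft.CognitiveServices/accounts/aoai-prod"
)

network = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

poller = network.private_endpoints.begin_create_or_update(
    RG,
    "pe-aoai-prod",
    {
        "location": "eastus",
        "subnet": {"id": SUBNET_ID},
        "private_link_service_connections": [
            {
                "name": "aoai-connection",
                "private_link_service_id": AOAI_ID,
                # "account" is the group ID for Cognitive Services accounts.
                "group_ids": ["account"],
            }
        ],
    },
)
print(poller.result().provisioning_state)
# Disabling public access is a separate setting on the account itself
# (publicNetworkAccess = "Disabled").
```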
Confidential Computing: Hardware-Enforced Protection
This is the piece most people don’t know about yet.
Azure confidential computing uses Trusted Execution Environments (TEEs) — hardware-level encryption that protects data while it’s being processed. Not just at rest, not just in transit. During computation.
The latest DCasv6 and ECasv6 VMs use AMD SEV-SNP for hardware-enforced memory isolation. Even Microsoft operators can’t see what’s running inside the enclave.
For AI workloads, this means:
- Model inference runs in a hardware-isolated environment
- Prompt data is encrypted in memory during processing
- Attestation proves the environment hasn’t been tampered with
This is as close to on-prem security as you can get — without buying a single server.
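For the curious, here's roughly what opting a VM into confidential computing looks like with azure-mgmt-compute. The VM size, image SKU, and resource names are my assumptions; check current regional availability before copying any of this:

```python
# Sketch: the settings that make a VM "confidential". Names and sizes are
# placeholders; the NIC is assumed to exist already.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

compute = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

poller = compute.virtual_machines.begin_create_or_update(
    "rg-ai-prod",        # hypothetical resource group
    "cvm-inference-01",  # hypothetical VM name
    {
        "location": "eastus",
        "hardware_profile": {"vm_size": "Standard_DC8as_v6"},  # AMD SEV-SNP family
        "security_profile": {
            # These two settings opt the VM into hardware-enforced isolation.
            "security_type": "ConfidentialVM",
            "uefi_settings": {"secure_boot_enabled": True, "v_tpm_enabled": True},
        },
        "storage_profile": {
            "image_reference": {  # a CVM-capable Ubuntu image
                "publisher": "Canonical",
                "offer": "0001-com-ubuntu-confidential-vm-jammy",
                "sku": "22_04-lts-cvm",
                "version": "latest",
            },
            "os_disk": {
                "create_option": "FromImage",
                "managed_disk": {
                    # Encrypts the VM guest state along with the OS disk.
                    "security_profile": {"security_encryption_type": "VMGuestStateOnly"},
                },
            },
        },
        "os_profile": {
            "computer_name": "cvm-inference-01",
            "admin_username": "azureuser",
            "linux_configuration": {
                "disable_password_authentication": True,
                "ssh": {"public_keys": [{
                    "path": "/home/azureuser/.ssh/authorized_keys",
                    "key_data": "<your-ssh-public-key>",
                }]},
            },
        },
        "network_profile": {
            "network_interfaces": [{"id": "<existing-nic-resource-id>"}],
        },
    },
)
print(poller.result().provisioning_state)
```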
The Architecture Stack
Here’s what a privacy-first enterprise AI deployment on Azure actually looks like:
- Azure OpenAI Service — managed models with tenant-level data isolation
- Private Endpoints — VNet integration, no public internet exposure
- Confidential VMs — hardware TEEs for data-in-use protection
- Azure AI Foundry — orchestration layer for RAG, agents, and custom workflows
- Microsoft Entra ID (formerly Azure AD) + RBAC — identity-based access control, same as everything else (sketched after this list)
- Content Safety — built-in filters for compliance and governance
Total setup time for a proof of concept: days, not months. And you’re running the same models you’d run on-prem, with the same security guarantees, at cloud scale.
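On the identity point: you can drop API keys entirely and authenticate with Microsoft Entra ID. A minimal sketch, assuming your principal holds the Cognitive Services OpenAI User role on the resource and the endpoint is a placeholder:

```python
# Sketch: keyless auth against Azure OpenAI via Entra ID.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),  # managed identity, CLI login, etc.
    "https://cognitiveservices.azure.com/.default",
)

client = AzureOpenAI(
    azure_endpoint="https://aoai-prod.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,               # no key to leak
    api_version="2024-06-01",
)
```

No secret to rotate, no key to end up in a config file. Access is granted and revoked the same way as for every other Azure resource.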
What About Cost?
The All In guys raised a real concern here too. Token costs are climbing. Jason’s $100K/year-per-agent number is real for heavy agentic workloads.
But on-prem isn’t cheaper. Running your own GPU cluster means:
- Capital expenditure for hardware (H100s aren’t cheap)
- Hiring ML ops engineers to maintain it
- Handling model updates, scaling, and redundancy yourself
- Losing access to the latest models until you can deploy them locally
Azure’s pricing model — pay per token, scale to zero when idle — is still more cost-effective for most enterprises than dedicated infrastructure. And Provisioned Throughput Units (PTUs) give you reserved capacity at predictable pricing for production workloads.
The real cost optimization isn’t cloud vs. on-prem. It’s designing your AI architecture to minimize unnecessary token consumption — caching, prompt engineering, model routing (use a smaller model when you can, reserve the big ones for when you need them).
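As a toy illustration, here's what a cache-plus-router wrapper might look like. The length heuristic and deployment names are stand-ins for whatever routing logic fits your workload, and `client` is the `AzureOpenAI` client from the earlier sketch:

```python
# Toy sketch: response caching plus model routing to cut token spend.
import hashlib

_cache: dict[str, str] = {}

def pick_deployment(prompt: str) -> str:
    # Crude heuristic: short, simple prompts go to a cheaper deployment.
    return "gpt-4o-mini-prod" if len(prompt) < 500 else "gpt-4o-prod"

def ask(client, prompt: str) -> str:
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key in _cache:  # repeated prompt -> zero new tokens
        return _cache[key]
    response = client.chat.completions.create(
        model=pick_deployment(prompt),
        messages=[{"role": "user", "content": prompt}],
    )
    answer = response.choices[0].message.content
    _cache[key] = answer
    return answer
```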
The Bottom Line
The data privacy problem the All In hosts identified is legitimate. Enterprises shouldn’t be sending sensitive data through public AI endpoints without understanding the implications.
But the solution isn’t a wholesale retreat to on-prem. It’s using a cloud platform that was built for enterprise security from the ground up — with tenant isolation, private networking, and hardware-level confidential computing.
The companies I work with are already doing this. The architecture exists today. The question isn’t whether you can run AI securely in the cloud. It’s whether your team knows how to set it up.
Inspired by the AI discussion on All In Podcast Episode 261. I’m a Senior Solutions Engineer at Microsoft focused on AI workloads — these opinions are my own.