19 May 20265 min readAIPricingAnthropicSelf-hosting

Cheap cloud AI was never going to last

The off-ramp wasn't built for everyone.

On June 1, GitHub Copilot moves to usage-based billing. On June 15, Anthropic caps Claude Code. The flat-fee era is ending in a two-week window. Heavy users have been consuming roughly $8 of compute per $1 of subscription revenue; the labs aren't covering that anymore. The off-ramp – self-hosting open-source models – exists, but charges a skill toll that excludes the user base the labs onboarded. Who gets left where, and was the door ever meant to stay open?

Anthropic just announced self-hosted sandboxes and MCP tunnels for Managed Agents. Enterprise stack getting new product and expanded control the same fortnight the consumer side gets re-costed is the bifurcation we all need to be talking about.

On June 1, GitHub Copilot moves to usage-based billing. On June 15, Anthropic splits programmatic Claude Code off the subscription onto a separate metered credit and the $200 Max plan gets $200 of compute on top. The flat-fee era is ending within a two-week window. Heavy users have been consuming roughly $8 of compute per $1 of subscription revenue; the labs aren't covering that anymore. I'm migrating to self-hosted open source LLM for the workflows that need it. Most people don't have that fallback, and the off ramp for the cloud AI highway isn't accessible to the average user. The questions worth asking: who gets left where, and was the door ever meant to stay open?

The skill toll on the off-ramp

The consumer AI you use has been subsidised. For heavy users, the subscriptions cost less than the compute they consume. The VCs foot the bill. This isn't a secret. It's the product strategy: buy market share now, raise prices once the switching cost is high enough that customers stay. For AI, this cost is dependency on the workflow. The more you've woven the tool into your day, the harder it is to unpick. This is textbook loss-leader pricing – see Uber, Amazon – but still worth appreciating.

The alternative to subsidised consumer AI exists. Open-source models keep getting better. Local AI on laptops and phones is now viable for most everyday tasks. Self-hosting is a real option if you have the skills, the hardware, and the time to maintain a stack.

I do, to a point. We run an open-source model locally – the kind you download once and operate on your own machine, no subscription, no per-message cost – and point it at our email and internal tools for the work that doesn't need a frontier model. It keeps our cloud bill manageable and means we're not locked to one provider. Most people don't, and don't want to. Standing up a local model (Meta's Llama, Alibaba's Qwen, or similar) and getting it useful is a weekend's work for someone with a developer background, an indefinite barrier for those without, and is somewhere in the middle if you're feeling brave.

Which means the off-ramp has a skill toll that excludes the user base the labs have been onboarding. The cheap AI phase taught our workforce that AI is the inescapable future of work. The re-costed AI phase will hurt businesses who thought they had discovered the productivity jackpot, and will leave individuals on the wrong side of a price they can no longer afford but can't circumvent.

Who the new prices price out

Two groups end up where the prices want them. Enterprise users keep paying because the labs will price the enterprise tier at exactly what corporate procurement will accept — and increasingly build it for whatever corporate security will accept. Wealthy individual users keep paying because the consumer tier will land at a number that's a footnote in their monthly ledger.

Anthropic shipped a Managed Agents update today — execution in the customer's own VPC (Virtual Private Cloud, an isolated private network inside a cloud platform), internal tools reachable without exposing them to the public internet. This is a commodification of control that the consumer camp won't see. The enterprise stack is getting product, while the consumer side is getting bills.

It's the middle that get squeezed. People for whom AI tools have become productive but who can't absorb a 5x or 10x price increase. The independent contractor whose $20 subscription runs compute worth multiples of that, who rebuilt their client workflow around a tool that will now cost them their margin. Small business owners. Students who built study workflows around the current free tier. Vibe coders running businesses on top of cloud AI wrappers, because they were told that's how you survive now. The people who, if you took AI away tomorrow, would feel it as a productivity loss they can't backfill.

That group's only options are to pay the new price, drop back to a free tier that will become more constrained than today's, if not obsolete, or learn to self-host. The first they can't afford, the second is degraded by design, and the third has a toll many can't pay. The off-ramp doesn't accommodate them because it was never built to. Access to AI that actually works is about to become a wealth marker, and the people it helped most are the first ones priced out.

What the labs are betting on

Cancelling won't be cancelling a tool, it'll be fundamentally reorganising how you work. For now, most people won't, especially when the subscription is priced just under the pain threshold.

OpenAI's Nick Turley, Vice President and Head of ChatGPT, has called the original pricing something they "stumbled into," comparing unlimited plans to "unlimited electricity." The flat-fee era was always going to end. This two-week window in June is when that starts.

The subscription structure tells you it's deliberate. The higher-capacity plans are monthly-only. No annual option, no contractual commitment, no refund obligation when limits change. If you don't like the new terms, unsubscribe. No dispute, no prorated return, no fighting with an AI customer service chatbot. It's cleaner than a terms-of-service amendment and harder to challenge. The flexibility is sold as a feature. It's an accountability release valve.

It's defensible, maybe even reasonable. What's worth acknowledging is that it's also systematic. Anyone budgeting current AI spend as a long-run line item is reading the offer wrong.

So

Those with a head start on self-hosted setups are still waiting to see what happens to equipment costs and model performance – it isn't a fix-all. The piece you're reading was proofread by Claude on a subscription that costs me $100 a month and runs on compute worth more than that. This is the use case the loss-leader is built around, and I know it.

We've been building toward a viable open-source alternative at our studio for months. Not for ideological reasons. Because any business building workflows on a single provider's AI, at prices that provider has told you are temporary, without a credible fallback, is exposed. That's procurement hygiene, not paranoia. Most businesses aren't doing it. Anyone who built on the Twitter API before 2023, or Heroku before they killed the free tier, knows how this looks: the vendor's pricing power becomes your problem the moment their incentives stop aligning with yours. The risk was obvious then too, the mitigation was available, and the adoption lagged until the first wave of pain forced it.

The productivity tools that changed how we work are about to see a correction. Will your skill set cover the toll?

If you've found a way to make the off-ramp work for non-technical users, I'd like to hear it: hello@greatworkeveryone.com.