
Investors should prioritize NVIDIA (NVDA) and hardware providers as the shift toward "Agentic AI" creates a massive, sustained demand for chips that is expected to outpace supply for the next five years. Microsoft (MSFT) is a high-conviction play as it leads the industry transition from flat-fee subscriptions to high-margin, usage-based billing through GitHub Copilot. Conversely, Anthropic currently presents a higher platform risk for businesses due to significant compute constraints and reliability issues compared to OpenAI. Enterprise users should audit their AI spending before June 1st to prepare for price hikes of up to 6x on frontier models like Claude Opus and GPT-5.3. To optimize costs, companies should implement a "Model Portfolio" strategy, substituting expensive frontier models with cheaper, efficient alternatives like Alibaba’s Qwen for routine tasks.
AI has had its own "Millennial Lifestyle Subsidy" (like the VC-funded cheap Uber/DoorDash rides), and that subsidy is now ending. Companies are moving away from flat-fee "unlimited" plans toward usage-based billing as the cost of serving high-token agentic workflows becomes unsustainable.
• Prepare for Price Hikes: Expect a transition from $20/month flat fees to consumption-based models (credits/tokens).
• Unit Economics Shift: Businesses must re-evaluate AI integration strategies as "inference costs" begin to rival human salary costs in some engineering sectors.
• Efficiency over Raw Power: The market is shifting focus from "who has the best model" to "who can deliver inference most efficiently."
GitHub (owned by Microsoft) recently announced a major shift to consumption-based fees for its Copilot platform. The previous $39/month top-tier subscription is being replaced by a credit system to account for the high compute demands of "agentic" coding.
• Significant Price Increases: New per-model multipliers amount to price hikes of up to 6x for frontier coding models.
• Model Multipliers: Under the new June 1st pricing, Claude Opus 4.7 usage jumps from a 7.5x multiplier to 27x; Gemini 1.5 Pro and GPT-5.3 Codex move from 1x to 6x.
• Sustainability: Microsoft is signaling that the era of absorbing massive inference costs to gain market share is ending.
• Enterprise Budgeting: Companies using GitHub Copilot should audit their usage before the June 1st switch to avoid "bill shock."
• Vendor Lock-in Risk: High multipliers for third-party models (like Anthropic's Claude) within the GitHub ecosystem may force users to stick to native Microsoft/OpenAI models to save costs.
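As a rough illustration of the audit math, credit consumption scales linearly with each model's multiplier. The multipliers below are the ones reported above; the request counts and the split between models are made-up assumptions for illustration, not GitHub's actual billing schedule:

```python
# Sketch of a pre-June-1st usage audit (hypothetical usage numbers).
# A "multiplier" scales how many premium credits one model request consumes.

OLD_MULTIPLIERS = {"claude-opus": 7.5, "gpt-codex": 1.0}   # reported old pricing
NEW_MULTIPLIERS = {"claude-opus": 27.0, "gpt-codex": 6.0}  # reported June 1st pricing

# Assumed monthly request volumes for an example team (not real data).
monthly_requests = {"claude-opus": 400, "gpt-codex": 2000}

def credits(multipliers: dict, usage: dict) -> float:
    """Total premium credits consumed for a month of usage."""
    return sum(multipliers[model] * count for model, count in usage.items())

old = credits(OLD_MULTIPLIERS, monthly_requests)
new = credits(NEW_MULTIPLIERS, monthly_requests)
print(f"old: {old:,.0f} credits, new: {new:,.0f} credits ({new / old:.1f}x increase)")
```

Running this kind of estimate per team, before the switch, is what turns "bill shock" into a line item that can be budgeted or optimized away.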
Anthropic is currently facing "stability issues" and "compute constraints," leading to a perceived "vibe shift" where they are losing ground to OpenAI in the developer market.
• Compute Shortage: Unlike OpenAI, Anthropic appears to have underestimated its compute needs, leading to frequent outages and the metering of supply during peak hours.
• Product Restrictions: The company has withheld its largest models and experimented with removing Claude Code from Pro subscriptions to manage load.
• Usage-Based Transition: Anthropic is aggressively moving users toward API usage (pay-per-token) rather than flat-rate subscriptions to manage "token-hungry" agents.
• Reliability Risk: For businesses, Anthropic currently presents a higher platform risk due to capacity constraints compared to OpenAI.
• Bullish Case for OpenAI: OpenAI is positioning itself as the "inference company," successfully serving high demand while competitors struggle with outages.
The "AI Bubble" narrative is shifting. While some analysts feared a drop in demand for chips, the "inference boom" (running the AI) is proving to be as compute-intensive as the "training boom."
• CAPEX vs. OPEX: Major tech firms (Meta, Microsoft) are shifting spending from "Neurons" (human headcount) to "Silicon" (chips/compute).
• Long-term Demand: Even with more compute coming online over the next five years, the demand from agentic AI (which uses billions of tokens) is expected to outpace supply.
• Sustained Demand: The transition to "Agentic AI" provides a massive tailwind for hardware providers, as agents consume orders of magnitude more tokens than simple chatbots.
• Physical Constraints: The rate of AI growth is now limited by physics (power grids, data center components) rather than just software innovation.
The shift from simple chatbots to "Agents" (AI that works autonomously for hours) is the primary driver of the current cost explosion.
• Token Consumption: Individual power users are now consuming upwards of a billion tokens per month.
• ROI Shift: Companies are finding the primary benefit of AI is "New Capabilities" rather than "Cost Savings," as high inference costs offset labor savings.
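To put a billion tokens per month in perspective, the inference bill scales linearly with token volume and per-token price. The per-million-token prices below are assumptions chosen purely for illustration, not any vendor's actual rates:

```python
# Back-of-envelope inference cost for a heavy agentic user.
# Prices are illustrative assumptions, not quoted vendor rates.

def monthly_cost(tokens: int, price_per_million: float) -> float:
    """Monthly spend given token volume and a per-million-token price."""
    return tokens / 1_000_000 * price_per_million

tokens = 1_000_000_000  # ~1B tokens/month, the power-user figure cited above

for label, price in [("frontier model (assumed $15/M tokens)", 15.0),
                     ("efficient model (assumed $0.50/M tokens)", 0.50)]:
    print(f"{label}: ${monthly_cost(tokens, price):,.0f}/month")
```

At assumed frontier rates, a single power user's inference spend lands in the same order of magnitude as a salary line, which is exactly why the unit-economics conversation has shifted.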
New specialized tools are emerging to solve the "Context" and "Cost" problems:
• Blitzy: Focuses on "infinite code context" for large enterprises.
• Zencoder (Zenflow): Orchestrates workflows across daily tools (Jira, Gmail, Slack).
• Granola: AI-powered transcription and meeting notes.
To survive the end of subsidies, companies should:
• Audit Spending: Identify where expensive models (GPT-4/Claude Opus) are "over-qualified" for simple tasks.
• Cheap Model Bake-offs: Test smaller, open-source, or older-generation models (like Alibaba's Qwen) for routine tasks.
• Appoint a "Model Sommelier": A dedicated role to track the best performance-to-cost ratio across the rapidly changing model landscape.
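In practice, a "Model Portfolio" can start as a simple routing table: a cheap model by default, with escalation to a frontier model only for tasks that warrant it. A minimal sketch, where the model names and the keyword-based classification rule are hypothetical placeholders (a real router would use a more robust task classifier):

```python
# Hypothetical model-portfolio router: cheap model by default,
# frontier model only for tasks flagged as complex.

ROUTES = {
    "routine": "qwen-small",      # cheap/open model for boilerplate work
    "complex": "frontier-large",  # expensive frontier model, used sparingly
}

COMPLEX_KEYWORDS = ("architecture", "security", "migration")  # assumed heuristic

def route(task: str) -> str:
    """Pick a model tier for a task description."""
    tier = "complex" if any(k in task.lower() for k in COMPLEX_KEYWORDS) else "routine"
    return ROUTES[tier]

print(route("rename this variable"))       # stays on the cheap model
print(route("design the migration plan"))  # escalates to the frontier model
```

The "Model Sommelier" role described above is essentially the owner of this table: re-running bake-offs and updating the routes as prices and model quality shift.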

By Nathaniel Whittemore
A daily news analysis show on all things artificial intelligence. NLW looks at AI from multiple angles: from the explosion of creativity brought on by new tools like Midjourney and ChatGPT, to the potential disruptions to work and industries as we know them, to the great philosophical, ethical, and practical questions of artificial general intelligence, alignment, and x-risk.