
The shift from AI training to inference marks a critical transition, suggesting investors should prioritize companies that operationalize AI and manage high-volume token delivery. OpenAI and Anthropic continue to lead the sector with reported revenue run rates of $30B and $45B respectively (the two companies account for revenue differently), an indication that enterprise demand for high-level reasoning models remains robust despite rising costs. For those seeking infrastructure plays, Base 10 and OpenRouter represent high-growth "middleman" opportunities that provide model-agnostic routing and deployment services. With token demand growing roughly 10x annually against roughly 3x growth in inference capacity, pricing power currently sits with infrastructure owners and efficiency providers rather than simple application layers. Investors should view the current "summer slowdown" as a cyclical entry point before anticipated Q4 breakthroughs, while remaining cautious of companies with high "agent debt" or unsustainable usage-based costs.
• Sam Altman has shifted his narrative regarding the "AI jobs apocalypse," now suggesting that human-to-human interaction is more resilient than previously thought and that entry-level white-collar jobs have not been eliminated as quickly as expected.
• OpenAI is transitioning from a "training" focus to an "inference" focus, with a reported revenue run rate of $30 billion.
• GPT-5.5 was highlighted as a top performer in the new DeepSWE benchmark, leading in cost, speed, and token efficiency compared to competitors.
• Shift to Inference: Investors should note the strategic pivot toward serving models (inference) rather than just training them. This suggests that the "marginal dollar" in the coming years will be spent on operationalizing AI.
• Model Superiority: Despite "bubble" narratives, OpenAI's latest models (GPT-5.4 and GPT-5.5) continue to dominate high-level engineering benchmarks, maintaining a competitive moat in reasoning and self-verification.
• Anthropic is seeing a massive revenue surge, with a reported run rate of $45 billion (though its accounting practices differ from OpenAI's).
• Its model, Claude, showed specific failure patterns in multi-part prompts (e.g., forgetting async requirements if sync was also requested) but remains a top-tier competitor.
• The company is facing "token rationing" from the US government for its most powerful models due to high demand and security concerns.
• Revenue Growth: The rapid surge in revenue suggests that enterprise adoption of agentic AI is real and scaling, despite rising costs.
• Niche Weaknesses: Understanding specific model failures (like Claude’s prompt adherence issues) is crucial for businesses deciding which model to deploy for specific technical tasks.
• Base 10 is a "neocloud" startup providing vertically integrated solutions for fine-tuning and deploying open-source models.
• It is currently closing a $1 billion funding round at an $11 billion valuation (doubling its value in three months).
• Annualized revenue tripled from $200 million to $600 million in Q1 2024.
• Infrastructure Play: Base 10 represents a "middleman" opportunity. It does not own GPUs but adds value by making them usable for production, a high-growth niche as companies move past experimentation.
• OpenRouter is a token routing service that allows users to access multiple AI models through a single API.
• It recently became a "unicorn" with a $1.3 billion valuation following a $113 million Series B led by CapitalG (Alphabet).
• It is serving 100 trillion tokens per month, a 5x increase in six months.
• Model Agnosticism: As the market becomes more fragmented, services like OpenRouter that provide redundancy and cost-optimization across different models (OpenAI, Anthropic, etc.) are seeing exponential growth.
• Context: The "subsidy era" of AI is ending. Companies are moving from flat monthly fees to usage-based (pay-per-token) models because agentic AI is more expensive to run than anticipated.
• Sentiment: Bullish on infrastructure and efficiency; Bearish on "vibe-coded" apps that lack a clear ROI.
• Insight: Demand for tokens is growing at 10x annually while supply (inference capacity) is only growing at 3x. This supply-demand imbalance suggests pricing power remains with the providers and infrastructure owners.
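The arithmetic behind the 10x versus 3x claim can be sketched as follows. This assumes, purely for illustration, that both growth rates stay constant and that demand and supply start out balanced; neither assumption comes from the episode, and the function name is hypothetical.

```python
def demand_to_supply_ratio(years: float,
                           demand_growth: float = 10.0,
                           supply_growth: float = 3.0,
                           initial_ratio: float = 1.0) -> float:
    """Ratio of token demand to inference supply after `years`.

    Assumes constant annual growth multipliers and a starting ratio of
    `initial_ratio` (1.0 = balanced). Both are illustrative assumptions.
    """
    return initial_ratio * (demand_growth / supply_growth) ** years


for year in range(4):
    print(year, round(demand_to_supply_ratio(year), 2))
```

Under these assumptions the gap compounds by a factor of 10/3 (about 3.3x) each year, so after two years demand would be roughly 11 times supply. That compounding is why a growth-rate mismatch translates into pricing power for capacity owners.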
• Context: Every summer, a narrative emerges that AI has "hit a wall" or is a bubble.
• Insight: The podcast argues these panics are cyclical and usually followed by major breakthroughs in Q4. Current "plateaus" in tool adoption (like VS Code) may simply be users moving to different interfaces (CLI/Terminal) rather than a drop in interest.
• Context: Traditional benchmarks are becoming "saturated" or "gamed." New benchmarks like DeepSWE (by DataCurve) are emerging to test long-horizon, real-world engineering tasks.
• Insight: Chinese models (like Kimi and DeepSeek) currently lag significantly behind US models in complex, multi-file engineering tasks, despite performing well on simpler benchmarks.
• Agent Debt: A new form of "technical debt" where hacked-together AI workflows become unmanageable and polluted over time.
• Token Shortages: High costs and limited availability of tokens could lead to "AI inequality," where only the most well-resourced companies can afford the most powerful models.
• Budget Exhaustion: Companies like Uber have reported burning through annual AI budgets in months, leading to a temporary pullback in spending as they reassess ROI.
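The budget-exhaustion risk is easy to underestimate when usage-based spend grows month over month. The sketch below shows the mechanism with made-up numbers; the function name, the budget, and the growth rate are hypothetical and are not figures reported for Uber or any other company.

```python
from typing import Optional


def months_until_exhausted(annual_budget: float,
                           first_month_spend: float,
                           monthly_growth: float,
                           max_months: int = 120) -> Optional[int]:
    """Number of months until cumulative spend reaches the budget.

    `monthly_growth` is a multiplier (1.5 = spend grows 50% per month).
    Returns None if the budget is not exhausted within `max_months`.
    """
    if first_month_spend <= 0:
        raise ValueError("first_month_spend must be positive")
    spent = 0.0
    spend = first_month_spend
    for month in range(1, max_months + 1):
        spent += spend
        if spent >= annual_budget:
            return month
        spend *= monthly_growth
    return None


# Hypothetical: a 12-unit annual budget, sized for 1 unit per month.
print(months_until_exhausted(12.0, 1.0, 1.0))  # 12 (flat usage lasts the year)
print(months_until_exhausted(12.0, 1.0, 1.5))  # 5  (50% monthly growth)
```

With flat usage the budget lasts the full year, while 50% month-over-month growth in token spend exhausts the same budget in five months, which matches the pattern of companies pausing to reassess ROI.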

By Nathaniel Whittemore
A daily news analysis show on all things artificial intelligence. NLW looks at AI from multiple angles, from the explosion of creativity brought on by new tools like Midjourney and ChatGPT to the potential disruptions to work and industries as we know them to the great philosophical, ethical and practical questions of advanced general intelligence, alignment and x-risk.