Why Hardware-Software Co-Design Is AI's Real 100x: Dylan Patel of SemiAnalysis | Sequoia Capital | Kazuha

Why Hardware-Software Co-Design Is AI's Real 100x: Dylan Patel of SemiAnalysis

6 hours ago • Sequoia Capital • @sequoiacapital

YouTube • 1 hr 10 min

Note: AI-generated summary based on third-party content. Not financial advice.

Quick Insights

Maintain a core position in NVIDIA (NVDA) as it remains the industry standard, with the upcoming Blackwell and Rubin architectures expected to deliver up to 30x performance improvements. For investors seeking value in custom silicon, Google (GOOGL) and Amazon (AMZN) offer high-conviction alternatives through their TPU and Trainium programs, which provide superior cost-efficiency for large-scale AI training. Monitor Broadcom (AVGO) as a key beneficiary of the "make vs. buy" trend, as they partner with hyperscalers to design these increasingly vital custom ASICs. High-growth opportunities exist in "NeoClouds" like CoreWeave or Nebius, which outperform traditional cloud providers by building data centers specifically optimized for AI workloads. To hedge against the looming power bottleneck, look toward energy infrastructure and companies capable of integrating high-bandwidth memory (HBM) directly onto logic chips to solve critical hardware constraints.

Detailed Analysis

NVIDIA (NVDA)

• NVIDIA is described as the "jack-of-all-trades" in the semiconductor space, maintaining a significant lead due to its general-purpose nature and massive ecosystem.
• Hardware-Software Co-design: NVIDIA is moving beyond just chips to optimize the entire stack from silicon to the model layer.
• Market Strategy: Jensen Huang is actively supporting "NeoClouds" (specialized AI cloud providers) and various AI labs to ensure a "multipolar world." This prevents hyperscalers (Google, Amazon) from having too much power and keeps demand for GPUs high across diverse customers.
• Product Roadmap: Mention of the transition from Hopper to Blackwell (30x improvement in some metrics) and future chips like Rubin and Rubin Ultra, which may reach power levels of 4,000 watts.

Takeaways

• Bullish Sentiment: NVIDIA remains the industry standard because most open-source models and Chinese AI labs co-optimize their software specifically for NVIDIA hardware.
• Competitive Moat: The "CUDA moat" (software programming) is weakening because AI can now help write code for other chips. NVIDIA’s real moat is the downstream ecosystem: most new models are designed to run optimally on NVIDIA first.
• Risk Factor: Large labs (OpenAI, Anthropic) are increasingly building their own custom chips (ASICs) to save costs, which could eventually eat into NVIDIA's market share for specific high-volume workloads.

Google (GOOGL) - TPU (Tensor Processing Unit)

• Google is expected to produce over 10 million TPUs through its supply chain in the next two years, representing a $100+ billion hardware effort.
• Specialization: TPUs are often more energy-efficient and have better networking (ICI) for certain large-scale training tasks compared to GPUs.
• Diversification: Google is running three different design programs for TPUs simultaneously (with partners like Broadcom and MediaTek) to avoid getting stuck in a "local minimum" (a technology dead-end).

Takeaways

• Efficiency Play: TPUs are "objectively amazing" for specific models (like Google’s Gemini or Anthropic’s training), but they "suck" at running models designed specifically for NVIDIA (like the Chinese DeepSeek models).
• Investment Insight: Google is a major player in the "make vs. buy" transition. Even though it has TPUs, it still rents NVIDIA GPUs from others (like xAI) when it needs general-purpose capacity, showing that the AI boom is lifting all boats.

Amazon (AMZN) - Trainium & Inferentia

• Amazon’s custom AI chips, Trainium, are becoming highly competitive.
• Performance: Anthropic (a major AI lab) uses Trainium heavily and has helped write the libraries to make the hardware useful.
• Cost Advantage: Trainium capacity comes at a lower cost (sub-$10 billion per gigawatt) than NVIDIA GPUs, making it an attractive low-cost alternative for high-volume AI training.

Takeaways

• Cloud Evolution: Amazon is moving past its "Cloud Crisis" where traditional networking (Nitro) initially hindered AI performance.
• Strategic Partnership: The success of Amazon’s silicon is deeply tied to Anthropic. As Anthropic grows, Amazon’s chip ecosystem becomes more validated and valuable.

Specialized AI Infrastructure (NeoClouds)

• Companies like CoreWeave, Crusoe, and Nebius are identified as "NeoClouds."
• Performance Edge: These specialized providers often offer better performance and reliability than big hyperscalers (AWS/Azure) because they build data centers specifically for AI, without the "baggage" of traditional cloud security and networking that can slow down GPUs.
• High Leverage: These companies are growing extremely fast, often using high levels of debt to fund massive GPU purchases.

Takeaways

• Investment Theme: The "NeoCloud" opportunity exists because big tech companies were too slow to adapt their data center designs for the massive power and networking needs of AI.
• Risk Factor: This is a "Wild West" sector. While some will become giants, many are hyper-leveraged and could fail if AI model progress plateaus or if capital markets tighten.

Emerging Investment Themes & Sectors

1. Space-Based Data Centers

• Timeline: 10–20 years.
• Context: As terrestrial power becomes a bottleneck (AI could require terawatts of power by 2040), the majority of incremental compute may move to space.
• Key Player: SpaceX is mentioned as a potential dominant force here due to its Starlink networking expertise, alongside Tesla's power-management expertise.

2. Hardware-Software Co-Design

• The "100x" Opportunity: The biggest gains in AI efficiency aren't coming from just a better chip or a better model, but from designing them together.
• Insight: Investors should look for companies that control the whole stack (e.g., Apple-style integration for AI).
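The "2x here and a 2x there" framing from the video description is multiplicative: gains made at different layers of the stack compound instead of adding. The sketch below is a minimal illustration of that arithmetic only. It assumes the gains are independent and stack cleanly, which real systems rarely guarantee, and the function names and the list of seven 2x gains are hypothetical, not figures from the talk.

```python
import math


def compound(gains):
    """Multiply per-layer speedups into one end-to-end factor.

    Assumes the gains are independent, so they combine by multiplication.
    """
    return math.prod(gains)


def doublings_needed(target):
    """Smallest number of independent 2x gains whose product reaches target."""
    return math.ceil(math.log2(target))


# Hypothetical example: seven separate 2x wins spread across the model,
# the kernels, and the silicon.
example_gains = [2, 2, 2, 2, 2, 2, 2]

print(compound(example_gains))   # 128
print(doublings_needed(100))     # 7
```

Under these assumptions, six doublings give only 64x, so seven are needed to clear 100x.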

3. The "Compute Crunch" and Energy

• Demand: AI models are expanding their capabilities (and economic value) faster than we can build data centers.
• Energy Solutions: Power is a massive bottleneck. Innovative solutions mentioned include converting diesel truck engines into on-site power generators for data centers to bypass grid delays.

4. Memory Technology

• Bottleneck: Memory bandwidth is a major constraint.
• Innovation: Look for companies working on "stacking" memory directly on top of the logic chip (HBM integration) to sharply increase bandwidth.
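One common first-order way to see why bandwidth is the constraint: during single-stream text generation, each new token requires reading the model's active weights from memory once, so the token rate cannot exceed bandwidth divided by weight size. The sketch below encodes that bound. It is a simplification that ignores compute time, batching, and KV-cache traffic, and the function name and the numbers (1,000 GB/s, 100 GB of weights) are illustrative assumptions, not figures from the talk.

```python
def max_decode_tokens_per_s(bandwidth_gb_s, active_weights_gb):
    """Upper bound on single-stream decode rate for a memory-bound model.

    Assumes every generated token reads all active weights from memory once
    and that nothing else (compute, KV cache, batching) matters.
    """
    return bandwidth_gb_s / active_weights_gb


# Hypothetical accelerator and model sizes, chosen only for round numbers.
baseline = max_decode_tokens_per_s(1000, 100)   # 10.0 tokens/s
doubled = max_decode_tokens_per_s(2000, 100)    # 20.0 tokens/s

print(baseline, doubled)
```

In this simplified model the bound scales linearly with bandwidth, which is why packaging changes that raise memory bandwidth translate directly into faster inference.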

Notable Private Companies Mentioned

DeepSeek: A Chinese AI lab noted for extreme efficiency and co-optimizing models for specific hardware.
Anthropic: Reported to be reaching profitability (net income positive excluding stock comp) due to high margins on AI tokens.
Cerebras / Groq: Companies doing "weird" (innovative) hardware designs using SRAM-based chips for ultra-fast inference.
MosaicML (acquired by Databricks): Highlighted for early innovation in the software/hardware abstraction layer.

Video Description

Dylan Patel, founder of SemiAnalysis, argues the biggest gains in AI don't come from faster chips, they come from software-hardware co-design. Optimizing the model, the kernels, and the silicon together turns a 2x here and a 2x there into 100x. He explains why DeepSeek's experts were shaped for Nvidia's Hopper (and why TPUs struggle to run it), why OpenAI's sparser models and Anthropic's denser ones pull them toward different hardware, and why the so-called CUDA moat was never really about CUDA. Dylan breaks down InferenceX, his living benchmark that runs the latest models on over $50M of donated hardware daily, tracking a roughly 60x annual drop in cost per unit of quality. He makes the case that inference will be a bigger market than oil, that the compute crunch persists because models expand the value of useful work faster than compute grows, and why Jensen Huang is bankrolling neoclouds to engineer a multipolar world.

Hosted by Shaun Maguire and Sonya Huang, Sequoia Capital

00:00 Introduction
01:58 Motel Kid Origins
03:11 Xbox Repair Spark
04:23 Internet Forums to Semis
06:42 From Quant to Founder
09:16 Homeless Research Roadtrip
14:04 InferenceX and Benchmarking
34:35 Sparse vs Dense Models
35:08 Interconnect Shapes Architecture
35:48 CUDA Moat Is Shifting
36:46 Ecosystems and Co-Design
38:46 Cerebras Speed and Limits
42:07 ROI Debates and Hot Takes
44:20 Ten Year Tech Bets
50:48 Compute Crunch and NeoClouds
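The "roughly 60x annual drop in cost per unit of quality" quoted in the description can be restated as a per-month rate, which is often easier to reason about. The sketch below assumes the decline is a constant geometric rate across the year. That is an assumption made for illustration and not something the description states, and the function name is hypothetical.

```python
def monthly_factor(annual_factor, months=12):
    """Per-month improvement factor that compounds to annual_factor.

    Assumes a constant geometric rate over the period.
    """
    return annual_factor ** (1 / months)


factor = monthly_factor(60)          # about 1.41x cheaper each month
monthly_drop = 1 - 1 / factor        # about 29% lower cost each month

print(f"{factor:.2f}x per month, {monthly_drop:.0%} cost drop per month")
```

Under that assumption, a 60x annual decline corresponds to cost falling by a little under a third every month.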

About Sequoia Capital

Sequoia Capital

By @sequoiacapital

Sequoia helps daring founders build legendary companies from idea to IPO and beyond. We aim to be the first true believers in tomorrow’s most consequential companies. We partner with a few outliers each year and go all-in, providing them with the hands-on help required at every stage of the company building journey. Our expertise comes from nearly 50 years of working with legendary founders like Steve Jobs, Elon Musk, Larry Page, Jan Koum, Brian Chesky, Tony Xu, Lin Qiao, Eric Yuan, Christina Cacioppo, and Patrick Collison. In aggregate, Sequoia-backed companies account for more than 30% of NASDAQ's total value. The vast majority of the money we invest has been on behalf of nonprofits and schools like the Ford Foundation, Mayo Clinic and MIT, which means most of the returns we generate benefit these great causes.