How François Chollet Is Building A New Path To AGI
Podcast · 57 min 23 sec
Note: AI-generated summary based on third-party content. Not financial advice.
Quick Insights

• Investors should prioritize AI-native software engineering tools like GitHub Copilot and Cursor, as coding is the first domain expected to reach full automation through verifiable reinforcement learning.
• Focus on companies building "harnesses" and self-improving loops that let AI learn without human annotators; these should scale faster than traditional data-heavy models.
• Look for exposure to State-Space Models (SSMs) and startups specializing in algorithmic efficiency and distillation, which aim to replace massive, expensive LLM clusters with smaller, "optimal" codebases.
• High-conviction opportunities lie in "verifiable" sectors such as quantitative finance, mathematics, and legal verification, where AI can independently validate its own accuracy.
• Monitor the ARC-AGI benchmark to identify leaders in "agentic AI," with a target window of 2030 for foundational shifts toward human-level fluid intelligence.

Detailed Analysis

Ndea (Research Lab)

Ndea is a new AGI (Artificial General Intelligence) research lab founded by François Chollet, focused on a symbolic alternative to current deep learning methods.
• The lab aims to move away from parametric curves (used in Large Language Models) and toward program synthesis and symbolic models.
• Key Technical Shift: Instead of using gradient descent to fit curves to data, Ndea uses "symbolic descent" to find the shortest, most concise symbolic model (code/equations) that explains the data.
• Efficiency Gains: This approach is designed to be "optimal," requiring significantly less data and compute than current LLMs while generalizing better to new, unseen tasks.
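The "shortest symbolic model" idea can be illustrated with a toy sketch (this is not Ndea's actual method, just the general shape of program search): instead of fitting a parametric curve by gradient descent, enumerate candidate symbolic programs from simplest to most complex and keep the first one that explains the data exactly.

```python
# Toy program-synthesis sketch: find the shortest symbolic expression
# that exactly reproduces the observed data points.

def candidate_programs():
    """Return (description, function) pairs, ordered by description length
    as a crude proxy for model simplicity."""
    ops = [
        ("x", lambda x: x),
        ("x+1", lambda x: x + 1),
        ("2*x", lambda x: 2 * x),
        ("x*x", lambda x: x * x),
        ("2*x+1", lambda x: 2 * x + 1),
        ("x*x-1", lambda x: x * x - 1),
    ]
    return sorted(ops, key=lambda p: len(p[0]))

def shortest_symbolic_fit(xs, ys):
    """Return the shortest candidate that reproduces ys from xs exactly."""
    for desc, fn in candidate_programs():
        if all(fn(x) == y for x, y in zip(xs, ys)):
            return desc
    return None

# Three data points suffice to pin down the rule -- far less data than
# a gradient-descent curve fit would typically need.
xs = [0, 1, 2]
ys = [1, 3, 5]          # generated by y = 2*x + 1
print(shortest_symbolic_fit(xs, ys))  # -> 2*x+1
```

A real system would search a vastly larger program space, but the selection principle (prefer the most concise model that explains the data) is the same.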

Takeaways

• Investment in "Small" AI: While the industry is pouring billions into massive LLM clusters, Ndea represents a bet on algorithmic efficiency over raw scale. Investors should watch for a shift from "more parameters" to "better logic."
• High Risk/High Reward: Chollet estimates a 10-15% chance of success, but notes that if the approach works, it would leapfrog the current AI stack entirely.
• AGI Timeline: Chollet predicts AGI by 2030, suggesting a 5-6 year window for foundational shifts in the AI market.


ARC-AGI Benchmark (ARC)

ARC-AGI is a benchmark designed to measure "fluid intelligence" (the ability to learn new things) rather than "crystallized intelligence" (memorized training data).
• ARC v1 & v2: These versions focused on static pattern matching. OpenAI's o1 and o3 models showed a step-function improvement here by using "reasoning" (Chain of Thought).
• ARC v3: The latest version measures agentic intelligence. It drops an AI into a mini video-game environment with no instructions; the AI must explore, identify goals, and solve the game efficiently.
• Human-Level Efficiency: The goal is for AI to match human "sample efficiency": learning a task in hundreds of actions rather than millions of hours of gameplay.

Takeaways

• The "Reasoning" Premium: Companies that can solve ARC v3 (such as OpenAI, Anthropic, or specialized startups) will likely lead the next wave of "agentic AI": systems that can actually do work in the real world rather than just talk about it.
• Verifiable Domains: AI progress is currently fastest in "verifiable" fields like computer code and mathematics, because the AI can check its own work. Expect these sectors to be fully automated first.


Coding Agents & Software Engineering

• The podcast highlights a "viral moment" for coding agents (e.g., G-Stack, GitHub Copilot, Cursor) because code provides a "verifiable reward signal."
• RL Loop: Models are now trained via Reinforcement Learning (RL): they write code, run unit tests, and learn from the failures automatically.
• Saturation: Benchmarks like ARC v2 have been "saturated" (solved at 97%+) by startups like Confluence Lab using custom "harnesses" that structure problems for LLMs.
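The verifiable-reward loop described above can be sketched in a few lines (a hedged illustration, not any lab's actual training code): candidate programs stand in for model samples, and a unit test acts as the automatic reward signal, so no human annotator is needed.

```python
# Sketch of RL-from-verifiable-reward for code: score each candidate
# implementation with a unit test and keep only the ones that pass.

def unit_test(fn):
    """Verifier: returns 1.0 if the candidate passes all checks, else 0.0."""
    try:
        assert fn(2, 3) == 5
        assert fn(-1, 1) == 0
        return 1.0
    except Exception:
        return 0.0

# Hard-coded stand-ins for programs sampled from a code model.
candidates = {
    "def add(a, b): return a - b": lambda a, b: a - b,
    "def add(a, b): return a + b": lambda a, b: a + b,
}

# The loop: run the verifier on every sample; passing candidates become
# reward-bearing training data, failures become negative signal.
accepted = [src for src, fn in candidates.items() if unit_test(fn) == 1.0]
print(accepted)  # -> ['def add(a, b): return a + b']
```

Because the reward comes from executing the code, the loop scales with compute rather than with human labeling effort, which is why coding is such a natural first domain for this approach.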

Takeaways

• Bullish on AI Software Engineering: Coding is the "first domain to fall" to full automation. Investment in AI-native dev tools remains a high-conviction theme.
• The "Harness" Opportunity: There is a massive near-term opportunity for startups to build "harnesses": software layers that translate messy real-world problems into verifiable formats that current LLMs can solve.
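A "harness" in this sense is essentially a verify-and-retry wrapper around an unreliable solver. The sketch below is purely illustrative (all names are hypothetical, and a trivial random function stands in for an LLM call): the harness poses the problem in a checkable form, verifies each attempt independently, and retries on failure.

```python
import random

def flaky_solver(question):
    """Stand-in for an LLM call: sometimes right, sometimes wrong."""
    return random.choice(["4", "5"]) if "2 + 2" in question else "?"

def verify(question, answer):
    """Verifiable signal: check the answer independently of the solver."""
    return question == "What is 2 + 2?" and answer == "4"

def harness(question, max_attempts=20):
    """Retry the solver until the verifier accepts, or give up."""
    for _ in range(max_attempts):
        answer = flaky_solver(question)
        if verify(question, answer):
            return answer
    return None  # could not produce a verified answer

random.seed(0)
print(harness("What is 2 + 2?"))  # -> 4
```

The commercial point is that the hard, reusable work lives in `verify` and the problem formatting, not in the solver: a good harness turns a messy real-world task into something a stochastic model can be trusted on.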


Investment Themes & Sector Insights

The "Optimal AI" Stack

• Theme: Moving from "brute force" to "elegance."
• Insight: Chollet suggests that AGI might eventually be a codebase of fewer than 10,000 lines operating on a massive knowledge base.
• Actionable: Look for startups focused on distillation (making small models as capable as big ones) and State-Space Models (SSMs) such as Mamba, or SSM-Transformer hybrids like Jamba, which offer alternatives to the standard Transformer architecture.
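The standard distillation objective mentioned above can be shown in a minimal, self-contained sketch: the student is trained to match the teacher's temperature-softened output distribution rather than hard labels. This is a single-example, pure-Python illustration of the loss, not a full training pipeline.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature knob."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [3.0, 1.0, 0.2]
aligned = [3.1, 0.9, 0.3]     # student close to teacher -> small loss
misaligned = [0.2, 1.0, 3.0]  # student far from teacher -> large loss
print(distillation_loss(teacher, aligned)
      < distillation_loss(teacher, misaligned))  # -> True
```

Raising the temperature spreads probability mass over the teacher's "wrong" answers, which is where much of the transferable knowledge lives; minimizing this loss lets a small model absorb behavior that would otherwise require the big one.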

Verifiable vs. Non-Verifiable Domains

• Bullish (Fast Progress): Software engineering, mathematics, quantitative finance, and legal document verification. These have clear right/wrong signals.
• Bearish/Slow Progress: Creative writing, essay composition, and general philosophy. These rely on human "vibes" and lack a formal verification loop, so progress will be slower and more expensive.

The "Human-in-the-Loop" Bottleneck

• Insight: For an AI approach to scale, it must remove the human bottleneck.
• Takeaway: Avoid companies that require massive teams of human annotators to improve their models. Favor companies building self-improving loops in which the AI generates its own training data through environment interaction.
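A self-improving loop of this kind can be sketched with a toy environment (an illustrative example, not any company's system): the agent interacts with the world, and only trajectories that verifiably reach the goal are kept as self-generated training data, with no human labeling anywhere in the loop.

```python
import random

GOAL = 5  # target state in a toy 1-D environment

def rollout(policy, max_steps=10):
    """Interact with the environment: start at 0, try to reach GOAL."""
    state, trajectory = 0, []
    for _ in range(max_steps):
        action = policy(state)          # +1 or -1
        trajectory.append((state, action))
        state += action
        if state == GOAL:
            return trajectory, True
    return trajectory, False

def random_policy(state):
    """Exploration stand-in for an untrained agent."""
    return random.choice([1, -1])

random.seed(42)
dataset = []
for _ in range(200):
    traj, solved = rollout(random_policy)
    if solved:
        # Environment success replaces human annotation: every stored
        # (state, action) pair comes from a verified winning trajectory.
        dataset.extend(traj)

print(len(dataset) > 0)  # -> True
```

In a real system the collected data would be used to train a better policy, which then collects better data, closing the loop; the point here is that the filter is the environment itself, not an annotator.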

Career & Personal Investment

• Insight: AI progress is "too late to stop."
• Takeaway: The highest ROI for individuals is not just learning about AI but becoming a "power user" who can leverage AI to automate their specific domain expertise. Expertise + AI = empowerment.

Episode Description
François Chollet has spent years asking a different question than most of the AI world. Instead of scaling what already works, he's trying to understand what intelligence actually is, and how to build it from first principles. In this episode of Lightcone, he traces that path from his early work on deep learning to the creation of the ARC Prize, and the launch of ARC v3, a new benchmark designed to measure something deeper than performance: the ability to learn, adapt, and reason efficiently in entirely new environments. He explains why today's systems may be hitting limits, what recent breakthroughs really mean, and why reaching true general intelligence may require a fundamentally different approach.

00:00 - AGI by 2030?
00:31 - Introducing Ndea: A New Path Beyond Deep Learning
01:08 - A New ML Paradigm
01:30 - Replacing neural nets with compact symbolic programs
03:04 - Why Ndea Isn't Competing With Coding Agents
05:20 - Why Everyone Might Be Wrong About Scaling LLMs
07:22 - Why Coding Agents Suddenly Work So Well
08:50 - The Limits of LLMs in Non-Verifiable Domains
10:48 - What AGI Actually Means (And Why Most Definitions Are Wrong)
13:30 - Why Deep Learning Hits a Wall
14:00 - ARC's Origin Story
18:20 - ARC Benchmarks Explained: From V1 to V3
22:49 - The RL Loop Powering Coding Agents Today
27:03 - ARC-AGI V3: Measuring "Agentic Intelligence"
31:14 - Inside the ARC Game Studio
35:31 - Could AGI Fit in 10,000 Lines of Code?
44:01 - Building Ndea: From Idea to Compounding Research Stack
46:46 - The Future of ARC: Benchmarks That Evolve With AI
47:21 - Why There's Still Huge Opportunity for New AI Paradigms
53:37 - How to Build a Breakout Open Source Project - Lessons From Keras
56:39 - Advice For How To Think About AI

Apply to Y Combinator: https://www.ycombinator.com/apply
Work at a startup: https://www.ycombinator.com/jobs
About Y Combinator Startup Podcast

By Y Combinator

We help founders make something people want.