133. A 7-Hour Marathon Interview with Xie Saining: World Models, Escaping Silicon Valley, AMI Labs, Turning Down Ilya Twice, Yann LeCun, Fei-Fei Li, and 42
Podcast · 6 hr 45 min
Note: AI-generated summary based on third-party content. Not financial advice.
Quick Insights

• Investors should prioritize Physical AI and World Models as the next major frontier beyond today's text-based LLMs, focusing on companies that bridge digital intelligence and the physical world.
• Keep a close watch on AMI Labs, a high-conviction startup co-founded by Yann LeCun, which has closed a $1.03 billion seed round at a $3.5 billion pre-money valuation to build "predictive brains" for robotics and healthcare.
• While Meta (META) remains a talent powerhouse, be aware of potential "brain drain" as top researchers leave for agile startups focused on fundamental spatial intelligence.
• OpenAI continues to lead in product execution, notably via the Diffusion Transformer (DiT) architecture used in Sora, which is becoming an industry standard for video generation.
• For long-term growth, look toward robotics and industrial-AI firms that own proprietary sensor and video data: high-quality visual data will be among the most valuable resources for training the next generation of autonomous systems.

Detailed Analysis

This investment analysis extracts key insights from the interview with Xie Saining (谢赛宁), a prominent AI scientist, NYU professor, and co-founder of the new AI startup AMI Labs.


AMI Labs (Startup)

AMI Labs is a newly launched AI research lab and startup co-founded by Xie Saining and Yann LeCun (Turing Award winner; formerly Meta's Chief AI Scientist).
• Funding: recently closed its first major round, a $1.03 billion seed at a $3.5 billion pre-money valuation.
• Focus: building "World Models" and a "Predictive Brain" rather than just Large Language Models (LLMs).
• Structure: four global offices: New York, Paris, Montreal, and Singapore.
• Team: elite researchers drawn from OpenAI, Google DeepMind (GDM), and Meta.

Takeaways

• Investment theme: AMI Labs represents a contrarian bet against the current LLM-centric narrative. It focuses on spatial intelligence and physical-world understanding, seen as the next frontier beyond text-based AI.
• Strategic moat: by positioning itself as a "refuge" for researchers who want to do fundamental science outside the product-cycle pressures of Big Tech (Google/Meta), AMI Labs aims to attract top-tier talent that is mission-driven rather than purely IPO-driven.
• Business model: unlike pure research labs, AMI seeks to build a "Universal World Model" that can be applied to vertical domains such as robotics, industrial process control, and healthcare (e.g., monitoring elderly care via wearable vision).


World Models & Physical AI (Investment Theme)

• The discussion highlights a shift from digital intelligence (LLMs like ChatGPT) to physical intelligence (World Models).
• Key concept: LLMs are seen as "communication interfaces" or "crutches." They excel at language but lack common sense about the physical world (gravity, intuitive physics, spatial relationships).
• The "Bitter Lesson": Xie suggests that while scaling computation matters, the next breakthrough requires modeling pixel/visual data directly to understand the world, rather than relying on human-written text.

Takeaways

• Sector growth: investors should look toward companies bridging the gap between AI and the physical world, including robotics, autonomous systems, and industrial AI.
• Risk factor: the "Data Wall." The internet's text has already been scraped nearly exhaustively, but high-quality video data for training world models is harder to access due to copyright and YouTube's terms of service. Companies that own proprietary real-world data (sensor data, industrial video) hold a significant advantage.
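The "scaling computation" point above is usually expressed as a power law: loss falls as a fixed power of compute, so equal multiplicative increases in compute buy equal multiplicative reductions in loss. A minimal sketch; the coefficients `a` and `b` below are made-up illustrative values, not figures from the episode:

```python
import numpy as np

# Hypothetical power-law scaling: loss falls as compute grows.
# Coefficients a and b are illustrative assumptions, not measured values.
def scaling_loss(compute, a=10.0, b=0.1):
    """Predicted loss for a given compute budget, L(C) = a * C**(-b)."""
    return a * compute ** -b

compute_budgets = np.array([1e18, 1e20, 1e22])  # FLOPs, each 100x the last
losses = scaling_loss(compute_budgets)

# Each 100x step in compute shrinks the loss by the same multiplicative factor,
# which is why scaling-law plots are straight lines on log-log axes.
ratios = losses[1:] / losses[:-1]
print(losses)
print(ratios)
```

Xie's point that "visual scaling laws may differ from language scaling laws" amounts to saying the exponent `b` (and thus the payoff per extra unit of compute) need not be the same for pixel data as for text.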


Meta (META)

• Mentioned extensively in the context of FAIR (Fundamental AI Research).
• Internal dynamics: the transcript reveals tension between bottom-up research (researchers choosing projects) and top-down product requirements (competing with OpenAI).
• Asset mention: Llama and JEPA (Joint-Embedding Predictive Architecture).

Takeaways

• Sentiment: neutral to bullish on talent, but highlights a resource-allocation risk: Big Tech firms are currently funneling most resources into the "LLM war," potentially neglecting the next wave of world-model research.
• Key personnel: the departure of high-level talent like Xie Saining to start AMI Labs suggests a brain drain from established giants toward specialized, agile startups.


OpenAI (Private)

• Mentioned regarding the development of Sora and the recruitment of researchers like Bill Peebles (co-author of the DiT paper with Xie).
• Context: OpenAI is praised for its ability to take a research idea (such as DiT, Diffusion Transformers) and rapidly scale it into a world-class product (Sora).

Takeaways

• Competitive edge: OpenAI's strength lies in its product-research alignment, the ability to turn academic breakthroughs into dominant market products faster than traditional academic or corporate labs.
• Technological shift: the success of Sora validates the DiT (Diffusion Transformer) architecture, which Xie Saining helped pioneer and which is becoming an industry standard for video generation.


Robotics & Hardware

• The transcript identifies robotics as the primary downstream application for World Models.
• Current state: described as a "desert" for general-purpose utility; current robots serve mostly entertainment or narrow industrial tasks.

Takeaways

• Investment insight: the "brain" (AI model) is currently ahead of the "body" (hardware). The real opportunity lies in the eventual integration of a general world model into robotic hardware.
• Timeline: Xie suggests a general-purpose robot (capable of elder care or household chores) remains a long-term goal, not an immediate reality.


Key Technical Terms for Investors

• DiT (Diffusion Transformer): the underlying architecture for modern video generation (used in Sora).
• Scaling Law: the principle that increasing data and compute leads to better AI performance. Xie notes that visual scaling laws may differ from language scaling laws.
• JEPA: a non-generative approach favored by Yann LeCun that predicts high-level concepts rather than every pixel, aimed at making AI more efficient and controllable.
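The JEPA idea above, predicting in a learned embedding space instead of reconstructing every pixel, can be sketched in a few lines of toy numpy. The linear "encoder" and "predictor" and all dimensions here are illustrative assumptions for the sake of the sketch, not the actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "encoder": maps a 16-value observation to a 4-dim embedding.
W_enc = rng.normal(size=(4, 16))
# Toy "predictor": operates entirely in the 4-dim embedding space.
W_pred = rng.normal(size=(4, 4))

def encode(x):
    return W_enc @ x

x_context = rng.normal(size=16)  # current frame, flattened
x_target = rng.normal(size=16)   # next frame, flattened

# A generative model would try to reconstruct all 16 raw target values.
# A JEPA-style loss instead compares only the compact embeddings,
# ignoring unpredictable low-level detail.
z_pred = W_pred @ encode(x_context)
z_target = encode(x_target)
latent_loss = float(np.mean((z_pred - z_target) ** 2))
print(z_target.shape, latent_loss)
```

The design choice being illustrated: the loss is computed on 4 abstract dimensions rather than 16 raw ones, which is what the summary means by "predicting high-level concepts rather than every pixel."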

Episode Description
Spring Festival 2026: while Chinese robots took the stage at the Spring Festival Gala, New York had just been buried by a blizzard, its harshest winter in years. One snowy afternoon in February, in a somewhat cluttered building in Brooklyn, I began this unexpectedly marathon interview with Xie Saining. We started at 2 p.m. and did not part until the early hours of the morning. Not long before, he had set out with Turing Award winner Yann LeCun and others on an entrepreneurial journey centered on "world models." Their company, AMI Labs (Advanced Machine Intelligence Labs), currently just 25 people and without any product, closed a $1.03 billion seed round at a $3.5 billion pre-money valuation. "Silicon Valley is very LLM-pilled," said Xie Saining, AMI co-founder and Chief Science Officer. "Silicon Valley is deep into LLMs (large language models), completely hypnotized by them." Born in 1990, Xie graduated from Shanghai Jiao Tong University and UC San Diego and now teaches at New York University. Before founding AMI he was a research scientist at Google DeepMind, and earlier spent four years as a research scientist at Meta's FAIR lab. His papers have been cited nearly 100,000 times in total, and he co-proposed Diffusion Transformers (DiT). This is Xie Saining's first interview. On the neon New York streets soaked in darkness, the air carried a mix of half-melted snow and acrid smoke, much like Xie's way of speaking: always layered, always mixed.
OUTLINE:
00:01:19 The normal one
00:35:40 The world never lets me do Vision
00:52:06 Academic wandering
00:57:43 My friendship with Kaiming He (何恺明)
01:05:35 Turning down Ilya twice
01:08:26 Old stories of Yann LeCun and Fei-Fei Li
01:12:18 Hidden threads: "a world of representations"
02:43:55 Research taste and the Diamond Sutra
04:11:07 What is a world model?
04:29:47 From downloading the internet to downloading humanity
04:58:17 How AMI was founded with Yann LeCun
05:45:53 "Silicon Valley has been hypnotized"
06:07:17 Arrogant humans!
06:18:45 "42"
Disclaimer: this content is not investment advice.
Contact me: xiaojunzhang@lisw.ai
About 张小珺Jùn|商业访谈录
By 张小珺

Striving to make China's best interviews on technology and business. 张小珺 (Zhang Xiaojun) is a financial writer producing in-depth reporting on Chinese business, covering AI, tech giants, venture capital, and notable figures, and is the producer of the podcast 《张小珺Jùn | 商业访谈录》. If my interviews can keep you company for a stretch of a lonely, unknown road, and maybe one day bring you a little closer to your destination, that warms me :)