138. A 3.5-Hour Interview with Luo Fuli: The AI Paradigm Has Shifted! OpenClaw, the Agent Paradigm's Heavy Reliance on Post-Training, GPU Allocation, and Organizational Egalitarianism
Podcast · 3 hr 36 min
Note: AI-generated summary based on third-party content. Not financial advice.
Quick Insights

The next 2–3 months represent a critical window in the transition from simple chatbots to AI agents, with value shifting toward framework layers like OpenClaw and Llama. Investors should prioritize companies using efficient architectures like MTP and MLA (e.g., DeepSeek and Xiaomi): these "cost-performance" leaders have narrowed the gap with US models to just 2–3 months. Expect roughly a 10x surge in demand for inference-specific hardware as persistent agents begin scanning screens and performing long-running tasks. The most reliable ROI over the next 24 months will likely come from AI for software engineering, since coding provides the most stable environment for AI self-evolution. Avoid AI applications in sectors with "messy" feedback loops, such as certain quantitative-finance models, where the lack of a clear reward hinders model training.

Detailed Analysis

This analysis extracts investment insights from the interview with Luo Fuli, head of Xiaomi's large-model team (previously at DeepSeek and Alibaba's DAMO Academy), regarding the massive paradigm shift in AI agents, model training, and the competitive landscape for 2025–2026.


AI Agents & Open-Source Frameworks (OpenClaw)

The discussion highlights a fundamental shift from "Chat" models to "Agent" models. The guest emphasizes that the next 2–3 months will be a "spectacular" period for AI agent evolution.

  • OpenClaw Impact: Mentioned as a pivotal product/framework that allows for better agent structure, long-term memory, and self-correction.
  • Paradigm Shift: AI is moving from simple dialogue to complex task execution (Agents). This requires a change in model design—moving away from just "scaling" to focusing on "post-training" and "agentic workflows."
  • Customization: Unlike closed systems (such as Claude Opus 4.6), open frameworks allow developers to change the entire multi-agent logic, which is seen as a "skill" rather than just a "tool."

Takeaways

  • Investment Theme: Look for companies transitioning from "LLM as a chatbot" to "LLM as an OS/Agent." The value is shifting toward the Agent Framework layer.
  • Actionable Insight: Open-source ecosystems (such as those surrounding Llama or OpenClaw) are becoming "amplifiers" for mid-tier models, allowing them to compete with top-tier closed models by optimizing the agent architecture.

Computing Power & Hardware Allocation (GPUs)

The interview provides a specific "formula" for how leading AI teams are now allocating their hardware resources, which differs significantly from the early days of the AI boom.

  • The 3:1:1 Ratio: A "good" allocation of computing cards (GPUs) is now 3 (Research/Experiments) : 1 (Pre-training) : 1 (Post-training), a sharp change from the Chat era's roughly 3 : 5 : 1 (research : pre-training : post-training).
  • Research is the Bottleneck: The guest notes that research requires a massive amount of "cards" to test new paradigms. Without dedicated research cards, teams cannot innovate on the next architecture.
  • Inference Demand: As agents become more powerful and "scan" screens or perform long-term tasks, inference demand is expected to rise by 10x.
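The 3:1:1 split can be sketched as simple arithmetic. The fleet size below is a hypothetical number for illustration only, not a figure from the interview:

```python
# Hypothetical sketch: splitting a GPU fleet by the 3:1:1 ratio
# (research : pre-training : post-training) described in the interview.
# The fleet size of 10,000 cards is an illustrative assumption.

def allocate_gpus(total: int, ratio=(3, 1, 1)) -> dict:
    """Split `total` GPUs across research, pre-training, post-training."""
    parts = sum(ratio)
    research, pre_train, post_train = (total * r // parts for r in ratio)
    return {"research": research, "pre_train": pre_train, "post_train": post_train}

print(allocate_gpus(10_000))
# Under 3:1:1, research gets 60% of the fleet; pre- and post-training get 20% each.
```

The notable point is how research-heavy the split is: the majority of compute goes to experiments on new paradigms rather than to production training runs.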

Takeaways

  • Bullish on Inference Chips: While training chips (NVIDIA) remain dominant, the shift toward persistent agents will drive massive demand for Inference-specific hardware and low-cost token production.
  • Risk Factor: Companies that do not reserve a large share of compute for research (60% under the 3:1:1 ratio) risk falling behind as the "Pre-training" era hits diminishing returns.

Model Architectures (MTP & MLA)

The technical discussion touches on specific architectural choices that determine the cost and speed of AI models.

  • MTP (Multi-Token Prediction): Used in the Flash and Pro model series to let the model draft several tokens ahead per step. It significantly lowers the cost per token when the draft "hit rate" is high.
  • MLA (Multi-head Latent Attention): A structure used by teams like Kimi and DeepSeek. By compressing keys and values into a small latent, it is highly efficient for KV-cache management, making models faster and cheaper to run.
  • Hybrid Attention: The guest believes a hybrid structure (combining sliding-window and full attention) is more "elegant" and provides more room for agents to function effectively in the future.
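Why a high "hit rate" lowers cost per token can be seen with a standard back-of-envelope model of speculative multi-token decoding. The i.i.d. acceptance assumption and the draft depth `k=3` are simplifications for illustration, not the configuration of any specific model:

```python
# Back-of-envelope sketch of MTP-style speculative decoding economics.
# Assume each of k drafted tokens is accepted independently with
# probability p (the "hit rate"); the verification pass itself always
# yields one token, so expected tokens per full forward pass is a
# geometric series: 1 + p + p^2 + ... + p^k.

def expected_tokens_per_step(p: float, k: int) -> float:
    """Expected tokens emitted per full-model forward pass."""
    return sum(p**i for i in range(k + 1))

for p in (0.5, 0.8, 0.95):
    print(f"hit rate {p:.2f}: {expected_tokens_per_step(p, k=3):.2f} tokens/step")
```

Moving the hit rate from 0.5 to 0.95 roughly doubles the tokens produced per expensive forward pass, which is the mechanism behind the "lower cost per token if the hit rate is high" claim.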

Takeaways

  • Efficiency as a Moat: In 2025, the "winner" may not be the largest model but the one with the most efficient architecture (MTP/MLA), and therefore the lowest token cost.
  • Xiaomi (MiMo/Flash): Positioned as the "early Xiaomi phone" of AI, focusing on extreme cost-performance (low price, high speed).
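The KV-cache savings behind the MLA efficiency argument can be estimated with rough arithmetic. All dimensions below are illustrative assumptions, not the published configuration of any specific model:

```python
# Rough sketch of why latent-attention (MLA-style) compression shrinks
# the KV cache. Every dimension here is an assumption for illustration.

BYTES = 2            # fp16/bf16 per element
N_LAYERS = 60
N_HEADS = 64
HEAD_DIM = 128
LATENT_DIM = 512     # assumed compressed KV latent per token

def kv_bytes_per_token(n_layers: int, per_layer_dims: int) -> int:
    return n_layers * per_layer_dims * BYTES

# Standard multi-head attention caches full K and V for every head.
mha = kv_bytes_per_token(N_LAYERS, 2 * N_HEADS * HEAD_DIM)
# MLA-style caches one shared latent vector per token per layer.
mla = kv_bytes_per_token(N_LAYERS, LATENT_DIM)

print(f"MHA KV cache: {mha / 1024:.0f} KiB/token")
print(f"MLA-style:    {mla / 1024:.0f} KiB/token  ({mha / mla:.0f}x smaller)")
```

A smaller KV cache means longer contexts and larger batches fit on the same hardware, which translates directly into the lower token cost the takeaway describes.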

Competitive Landscape: US vs. China

The guest provides a timeline for the gap between Chinese AI models and top-tier US models (such as Claude Opus 4.6).

  • The 2–3 Month Gap: The guest claims that leading Chinese teams (mentioning Kimi, Xiaomi's MiMo, and DeepSeek) are only about 2 to 3 months behind the latest US releases (such as Claude Opus 4.6) in terms of actual effect and performance.
  • 2026 Outlook: By 2026, the "table stakes" for staying in the game will be the ability to integrate agent structures directly with the model's core.

Takeaways

  • Sector Sentiment: Highly bullish on the agility of Chinese post-training teams. They are described as being able to achieve "stunning" results with one-tenth the model size of US competitors.
  • Key Players to Watch: DeepSeek (for architectural innovation), Kimi/Moonshot (for long context), and Xiaomi/MiMo (for infrastructure and cost-efficiency).

AGI Timeline & Investment Risks

  • AGI Arrival: The guest estimates AGI is roughly 2 years away.
  • Definition of AGI: Defined here as the point when AI significantly changes every person's life and work through "persistent" agents that need no rest and evolve continuously along a single curve.
  • Risk - "The Reward Problem": In fields like quantitative finance (factor mining), the "reward" is often unclear. AI models struggle when the feedback loop is messy. Investors should be cautious of AI applications in sectors where the "ground truth" or "reward" is not easily defined.

Takeaways

  • Timeline: Investors should look for significant ROI from AI integrations within a 24-month window.
  • Focus Area: The most "elegant" path forward is coding. Code is close to natural language, scales easily, and provides an ideal environment for AI to learn and "self-evolve." Companies focusing on AI for software engineering are likely the safest bets.
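The "clean reward" argument for coding can be made concrete: a candidate program can be scored objectively by running tests, unlike domains such as factor mining where ground truth is noisy. A minimal sketch (the function names and test cases are hypothetical, for illustration only):

```python
# Minimal sketch of a verifiable reward signal for coding tasks:
# the reward is simply the fraction of test cases a candidate passes,
# so the feedback loop is unambiguous. Illustrative only.

def reward(candidate_fn, test_cases) -> float:
    """Fraction of test cases the candidate solution passes."""
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing candidate simply earns no credit
    return passed / len(test_cases)

# A model-proposed solution scored against its verifiable spec:
candidate = lambda a, b: a + b
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
print(reward(candidate, tests))  # 1.0: an unambiguous training signal
```

In quantitative finance, by contrast, the "expected" outcome is itself noisy and non-stationary, so no such crisp `reward` function exists, which is the risk the takeaway flags.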
Episode Description
In 2026, the large-model war escalates into its second act: from the Chat era, dominated by pre-training, to the Agent era, dominated by post-training.

Amid this AI paradigm shift, I interviewed AI researcher Luo Fuli. She previously worked at Alibaba's DAMO Academy and DeepSeek, and now leads Xiaomi's large-model team, where she directed development of the MiMo-V2 model series. Online she carries many labels, such as "AI prodigy girl," a title she dislikes.

This is her first interview, and her first long-form technical interview.

We systematically discussed the AI upheaval triggered in 2026 by technical variables such as Claude Opus 4.6 and OpenClaw, and its subsequent structural effects.

In this era of exploding productivity, everyone feels a sense of crisis, even the researchers who train the models with their own hands.

"I used to believe our own work was creative enough that it could never be reduced to a Skill or a Workflow. But I have now found that it can! Then can that be used to train an even stronger model? Can the model bootstrap itself upward? That is what will happen in the next year or two."

When human knowledge and wisdom are internalized as model capability, what will humans do? How will our society absorb such violent technological change? These grand questions of our era have no answer yet.

Either way, this remains an interview of extremely high information density. In it you can see the full set of moves an AI lab makes internally when facing a huge technical paradigm shift: technical bets, resource allocation, organization and staffing. And the foundation of its response is a settled culture and set of values.

Luo Fuli offers several key technical judgments on the current moment:

Anthropic's path is correct; this is today's consensus.
With the path now clearer, Chinese large-model teams have entered an accelerated catch-up phase. On pre-training there is essentially no generational gap, or it is very small.
Just as teams went all-in to close the pre-training gap in 2023, everyone is now all-in on doing Agent post-training well; more specifically, on how to scale RL on Agents.
The system is shifting from "rollout inference engine at the core" to a more complex system with "the Agent at the core." This places higher demands on teams: they must be agile enough to quickly build RL infra suited to the current era.
How teams change over the next two to three months will test their overall research level, technical agility, and how they embrace the new paradigm for research.
One longer-term matter: we will not stay at the 1T-parameter level for long. To lead in the next stage, you must pursue larger-scale scaling. Do you scale parameter count, or something else? And on what kind of chips? These decisions and judgments must be made right now; they determine who leads half a year from now.

On GPU allocation:

In the Chat era, at least, the card ratio across research, pre-training and post-training was extreme, say 3:5:1. A very reasonable ratio now is probably 3:1:1.
Pre-training and post-training at a 1:1 ratio is a big change that may happen this year; top teams should all be at 1:1 already.

On organizational restructuring:

An important paradigm change in post-training is the need for diversity; having pre-training people do post-training is a valuable complement.

"The next two or three months will be spectacular," says Luo Fuli.

What follows is my interview with Luo Fuli.

OUTLINE:
00:02:16 OpenClaw triggers an upheaval
00:24:17 Collective intelligence improves the Agent framework
00:41:31 2026 is the year of the productivity revolution
01:01:45 Agents' self-evolution and self-iteration
01:19:39 MiMo-V2: awakening and ambush
01:45:24 A 1T model is the entry ticket
01:52:33 Organizational egalitarianism
02:02:56 Training details and costs
02:09:03 Alternative architectures
02:22:32 AI has no survival crisis
02:39:12 Disowning yesterday's self, every day
02:48:34 The past three years of AI evolution
03:05:54 Current consensus and competition
03:19:45 Environment matters more than experience

LINKS:
Our podcast is available on Xiaoyuzhou, Apple Podcasts, Spotify and all other audio platforms; our video podcast is on Bilibili, Xiaohongshu, WeChat Channels, Douyin and all other video platforms. For the text version, search for our studio's WeChat official account: 语言即世界 language is world.
Text version of this episode: "Exclusive Conversation with Luo Fuli: The AI Paradigm Has Shifted!"

DISCLAIMER: This content is not investment advice.

CONTACT: xiaojunzhang@lisw.ai

Jump into the new world and explore with us! 😉
About 张小珺Jùn|商业访谈录

By 张小珺

We strive to produce China's best technology and business interviews. Zhang Xiaojun is a financial writer producing in-depth reporting on Chinese business, covering AI, tech giants, venture capital and notable figures, and is the producer of the podcast 张小珺Jùn|商业访谈录. If my interviews can accompany you along a stretch of a lonely, unknown road, and perhaps one day bring you a little closer to your destination, that warms my heart :)