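139. [A Survey of Agents] A conversation with Yu Su (苏煜) on the technical history of Agents, the OpenClaw Moment, the dissolution of boundaries, and diffusion into society | 张小珺Jùn｜商业访谈录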


Podcast · 2 hr 17 min


Note: AI-generated summary based on third-party content. Not financial advice.

Quick Insights

Investors should prioritize Big Tech leaders like Microsoft (MSFT), Alphabet (GOOGL), and NVIDIA (NVDA) as they provide the essential "frontier models" and hardware for the shift from AI tools to autonomous "AI Labor." Focus on Anthropic (via Amazon or Google partnerships) for its leadership in "Computer Use" technology, which allows agents to automate legacy software without complex integrations. Look for high-conviction opportunities in Vertical AI startups that solve "hard problems" in legal, medical, or coding sectors, as these specialized "Expert Agents" maintain a stronger competitive moat than general chat models. Monitor the 2025-2026 timeframe for the commercial rollout of "Self-Learning" agents, which is expected to significantly optimize enterprise business processes and high-end knowledge work. Be cautious of "simple" AI startups, as general models like GPT-4o and Claude 3.5 are rapidly absorbing basic features and displacing under-capitalized competitors.

Detailed Analysis

This financial analyst report extracts key investment insights from the interview with Yu Su, a professor at Ohio State University and founder of NeoCognition, regarding the evolution of AI Agents and the "OpenClaw Moment."

AI Agents (The "Language Agent" Era)

The discussion highlights a fundamental shift from traditional AI to "Language Agents." Unlike previous iterations, these agents use Large Language Models (LLMs) as a "scaffold" for reasoning, memory, and autonomy.

The "OpenClaw Moment": Similar to the "ChatGPT moment," this refers to the sudden public realization of Agent capabilities (specifically via Anthropic’s Claude and its computer-use features).
Core Capabilities:
- Autonomy: The ability to perceive, reason, and make decisions independently.
- Memory: Moving beyond static data to "episodic" and "procedural" memory (learning how to do things).
- Adaptive Computing: Using language to scale the amount of "compute" applied to a problem (e.g., generating more tokens for complex reasoning via Chain of Thought).
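
As a rough illustration of the three capabilities listed above, here is a minimal toy sketch in Python. It is not taken from the interview: the class `ToyAgent`, its methods, and the word-count heuristic for the "thinking budget" are all hypothetical, and no real LLM is called. It only shows how a reasoning step, two kinds of memory, and a compute budget that scales with task difficulty could fit together.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Toy stand-in for a language agent; no real LLM is called."""
    episodic: list = field(default_factory=list)    # what happened, in order
    procedural: dict = field(default_factory=dict)  # task -> learned steps

    def think_budget(self, task: str) -> int:
        # Adaptive compute (hypothetical heuristic): longer tasks get more
        # "reasoning steps", mimicking a longer chain of thought.
        return max(1, len(task.split()))

    def reason(self, task: str) -> list:
        # Reuse a learned procedure if one exists (procedural memory).
        if task in self.procedural:
            return self.procedural[task]
        # Otherwise "think" step by step; each step stands in for
        # a chunk of generated reasoning tokens.
        budget = self.think_budget(task)
        return [f"step {i + 1} for: {task}" for i in range(budget)]

    def run(self, task: str) -> list:
        plan = self.reason(task)                 # reason and decide
        self.episodic.append((task, len(plan)))  # remember the episode
        self.procedural[task] = plan             # remember how to do it
        return plan


agent = ToyAgent()
first = agent.run("book a flight to Columbus")
second = agent.run("book a flight to Columbus")
```

On the second call the agent skips the step-by-step reasoning and reuses the stored plan, which is the sense in which procedural memory means "learning how to do things" here.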

Takeaways

Investment Theme: We are moving from "AI as a Tool" to "AI as Labor." Companies are shifting from selling software seats to selling "AI Employees" or outcomes.
Sector Impact: High-end knowledge work and enterprise business processes are the immediate targets for displacement and optimization.

Specialized vs. General Agents

A major debate in the transcript is whether the market will be dominated by "General Agents" (all-in-one) or "Expert Agents" (specialized).

General Agents: Likely to be dominated by "Frontier Model" companies (Big Tech) like OpenAI, Google, and Anthropic.
Expert Agents: Specialized intelligence for specific industries (e.g., coding, legal, medical). This is where startups like NeoCognition find their "moat."
The "Moat" Strategy: To compete with Big Tech, startups must focus on "hard problems" like long-term reliability, specialized benchmarks, and deep integration into complex environments that general models cannot easily replicate.

Takeaways

Actionable Insight: Investors should look for startups focusing on Vertical AI—agents that possess "Expertise" rather than just "General Chat" capabilities.
Risk Factor: General models (GPT-4o, Claude 3.5) are rapidly "eating" the features of simple agent startups. Only those with deep, specialized data or unique "World Models" are likely to survive.

Computer Use & GUI vs. API

The transcript discusses how Agents will interact with the digital world.

GUI (Graphical User Interface): Agents "seeing" and clicking like humans. This is the "long tail" solution because most of the world's software will never be rewritten for APIs.
CLI/API: More efficient but limited to modern, tech-heavy companies.
The "Computer Use" Trend: Anthropic and Microsoft (via "MicroHard" concepts) are betting heavily on agents that can navigate any software interface.
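
The GUI versus API trade-off above can be sketched as a simple routing policy. This is an illustrative assumption, not a method described in the interview: the `App` type, the `plan_action` function, and the placeholder endpoint are all hypothetical. The point is only that a structured call is used when an API exists, and screen-level actions are the fallback for software that exposes none.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class App:
    name: str
    api_endpoint: Optional[str] = None  # None models legacy software with no API


def plan_action(app: App, intent: str) -> dict:
    """Pick an interaction channel for one intent (hypothetical policy)."""
    if app.api_endpoint is not None:
        # Structured call: cheap and precise, but only if an API exists.
        return {"channel": "api", "call": f"{app.api_endpoint}/{intent}"}
    # Fallback: act on the screen as a person would (look, locate, click).
    return {"channel": "gui", "steps": ["screenshot", f"locate '{intent}'", "click"]}


modern = App("crm", api_endpoint="https://example.invalid/v1")
legacy = App("mainframe-terminal")

api_plan = plan_action(modern, "export")
gui_plan = plan_action(legacy, "export")
```

The GUI branch works for any application that renders to a screen, which is why the summary treats it as the "long tail" solution, while the API branch is shorter but depends on the vendor having built an interface.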

Takeaways

Bullish Sentiment: Companies developing Vision-based Agents (those that can "see" a screen) have a higher addressable market because they can automate legacy software without needing custom integrations.

Key Companies & Projects Mentioned

The transcript identifies several major players and high-stakes "bets" in the Agent space:

Anthropic: Currently leading the "Computer Use" narrative with Claude.
OpenAI: Shifting focus toward agentic workflows and "Operator" roles.
Microsoft: Deeply interested in "Enterprise Agents" that use existing tools (Toolformer).
xAI (Elon Musk): Mentioned in the context of computer-use agents and massive compute bets.
Project Prometheus: A secretive, high-capital bet (reportedly $6-7 billion) involving Jeff Bezos, focusing on large-scale agentic automation.
NVIDIA: CEO Jensen Huang’s "AI Strategy" for every enterprise involves deploying these agents.
NeoCognition: Yu Su’s startup, focused on "Expert Agents" and "World Models."

Takeaways

Capital Intensity: The "seed" rounds for top-tier agent startups are reaching $40 million+, indicating a sharp "two-tier" divide in which the winners are heavily capitalized from day one.
Timeline: 2025-2026 is cited as the window for "Self-Learning" and "Continuous Learning" agents to become commercially viable.

Macro Risks & Social Impact

Job Displacement: The speed of Agent development is "scary" and may outpace the creation of new job types.
Reliability & Safety: Current agents are "unskilled" in many professional tasks, leading to safety concerns if they are given autonomy in critical systems without better "World Models."
Economic Friction: AI is reducing the "friction" of complex tasks, making previously unprofitable business models suddenly viable.

Takeaways

Investment Risk: Regulatory backlash or social unrest due to rapid white-collar job displacement is a significant "tail risk" for the sector.


Episode Description

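In our earlier episodes with Fuli (福莉) and Guangmi (广密), we explored in depth how AI's evolution is moving from Act One, Chat, to Act Two, Agent. It is fair to say that "Agent" is the buzzword of 2026. I have long wanted to go down to the level of technical principles and give everyone a technical walkthrough of Agents, so that together we can see the technical lineage clearly. Today's guest is Yu Su (苏煜), a professor of computer science at The Ohio State University and founder of the startup NeoCognition. Yu Su is one of the few scholars who has witnessed the full history of Agent evolution, and his research focus is Language Agents. We reviewed the technical evolution of Agents over a longer horizon, especially the rapid progress of Language Agents in the past three years. Yu Su is also a 2025 Sloan Research Fellowship recipient. What follows is our technical survey of Agents. Happy learning over the May Day holiday ^。^

OUTLINE:
- 00:02:00 Who is Yu Su
- 00:03:30 The technical evolution of Agents: from Logical Agent (1960s-90s) → Neural Agent (after 2000) → Semantic Parsing (the other side of the story) → Language Agent
- 00:27:21 In the history of human evolution, language emerged very late, yet it drove exponential development of human civilization
- 00:29:28 The past three years have moved faster than the previous several decades; a review of the key work on Language Agents
- 00:40:56 At the end of the day, what everyone wants is a universal digital agent; the dissolution of boundaries has to do with coding
- 00:45:18 I was one of the earliest scholars to move from Semantic Parsing to Language Agents
- 00:48:56 The OpenClaw Moment and the ChatGPT Moment have a great deal in common
- 00:55:10 Technology diffuses in different patterns in China and the US: in China it reaches the whole population and the application layer moves faster
- 01:02:05 The new startup NeoCognition, which recently closed a $40M seed round
- 01:20:30 On Continual Learning, world models, and interaction (GUI vs. CLI)
- 01:44:34 What is the biggest bottleneck for Agents right now? What to expect from Agent progress in 2026?
- 01:47:09 What is each big tech company betting on in Agents, and which bets are interesting?
- 01:52:47 Our generation has lived through the full Agent cycle; I like building conceptual frameworks
- 02:10:13 Closing rapid-fire Q&A

LINKS: Our podcast is available on all major audio platforms, including Xiaoyuzhou (小宇宙), Apple Podcast, and Spotify. Our video podcast is available on all major video platforms, including Bilibili, Xiaohongshu (小红书), WeChat Channels (视频号), and Douyin (抖音). For the text version, search for our studio's WeChat official account: 语言即世界language is world.
DISCLAIMER: This content does not constitute investment advice.
CONTACT: xiaojunzhang@lisw.ai
Jump into the new world and explore with us!😉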

About 张小珺Jùn｜商业访谈录


By Zhang Xiaojun (张小珺)

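Striving to produce China's best technology and business interviews. Zhang Xiaojun (张小珺) is a business writer who produces in-depth reporting on Chinese business, covering AI, tech giants, venture capital, and prominent figures, and is the producer of the podcast 《张小珺Jùn | 商业访谈录》. If my interviews can keep you company along a lonely, unknown stretch of road, and perhaps one day bring you a little closer to your destination, that would warm my heart :)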