Memory and Continual Learning: Engram's Dan Biderman and Jessy Lin | Sequoia Capital | Kazuha

6 days ago • Sequoia Capital • @sequoiacapital

YouTube, 44 min 52 sec

Note: AI-generated summary based on third-party content. Not financial advice.

Quick Insights

Investors should prioritize open-source AI ecosystems, specifically Meta’s Llama, because these "white box" models are currently the easiest platforms on which to deploy continual learning architectures that need access to model weights. Look for opportunities in startups like Engram (private) that use LoRAs and adapters to internalize data, an approach the founders claim can cut token consumption by up to 100x compared with repeatedly re-sending context in prompts. The most immediate growth is in "contextual intelligence" for the legal and productivity sectors, where Engram is working with partners such as Notion, Microsoft, and Harvey on personalized model memory. Monitor the hardware sector for companies addressing the KV cache bottleneck, since technologies that shrink high-bandwidth memory requirements could command a large efficiency premium. While OpenAI and Google focus on general reasoning, a tactical "3 to 6 month gap" exists in which bespoke, specialized models can outperform general models on specific enterprise tasks.

Detailed Analysis

The following investment insights are extracted from the discussion between Sequoia Capital and the co-founders of Engram, an AI startup focused on "continual learning" and model memory.

Engram (Private - Startup)

Engram is a "neolab" focused on solving the bottleneck of AI memory. Unlike current models that are "frozen" after training, Engram’s architecture allows models to learn continuously from new data and context.

Takeaways

Moving Beyond RAG: Most companies currently use Retrieval-Augmented Generation (RAG), in which a model "searches" a database to answer a prompt. Engram believes this is inefficient and is betting on "internalizing" data directly into the model's weights.
Efficiency Gains: By training context into the model weights (using techniques like LoRAs or adapters), the founders claim they can reduce token consumption by up to 100x. This removes the need to resend massive "system prompts" or reread the same files on every request.
Partnerships: The company is already working with major productivity and legal platforms, including Notion, Microsoft, and Harvey, to create models that deeply understand specific company workspaces.
Open Source Advantage: While the technology can work with closed models, it is currently easiest to deploy on open-source models (like Llama), because fine-tuning requires "white box" access to the model's weights.
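
To make the adapter idea in the takeaways above concrete, here is a minimal sketch of a LoRA-style low-rank update in plain Python. It is not Engram's implementation, which the discussion does not detail; the function names, matrix sizes, and rank are assumptions chosen for illustration. What it shows is that a frozen weight matrix W can be adapted by training two much smaller matrices, A and B, and why that requires direct access to the weights.

```python
"""Illustrative LoRA-style adapter in pure Python (no ML framework).

All names and sizes here are hypothetical; this is a sketch of the general
low-rank adapter technique, not any vendor's actual code.
"""


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha=1.0):
    """Compute y = W x + (alpha / r) * B (A x).

    W (d_out x d_in) stays frozen; only A (r x d_in) and B (d_out x r)
    would be trained. r is the adapter rank.
    """
    r = len(A)  # rank = number of rows in A
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def param_counts(d_in, d_out, r):
    """Return (values in the full matrix, trainable values in the adapter)."""
    full = d_in * d_out
    adapter = r * (d_in + d_out)
    return full, adapter


# Hypothetical layer size and rank, chosen only to show the scale difference.
full, adapter = param_counts(d_in=4096, d_out=4096, r=8)
ratio = full / adapter
print(f"full: {full:,}  adapter: {adapter:,}  ratio: {ratio:.0f}x")
```

With these assumed sizes the adapter holds 256 times fewer trainable values than the full matrix. When B is all zeros, as in the standard LoRA initialization, the output equals that of the frozen model, so training starts from the original behavior.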

The "Continual Learning" Sector

The podcast highlights a shift in AI research from "raw intelligence" (general reasoning) to "contextual intelligence" (understanding specific, evolving data).

Takeaways

The "Intern" Problem: Current AI agents struggle to "get better" over time like a human intern would. The investment opportunity lies in companies that can make models "remember" feedback and previous tasks without manual prompt engineering.
Personalized Models vs. Monolithic Models: There is a growing thesis that the future will not be one giant model (like GPT-4) for everyone, but millions of small, personalized models tailored to specific individuals or teams.
Data Privacy as a Moat: Continual learning allows companies to train on private data without sending that data back to the "Frontier Labs" (OpenAI, Google, Anthropic), which is a major selling point for enterprise security.

Frontier AI Labs (OpenAI, Anthropic, Google)

The discussion touches on the strategic direction of the "Giants" in the AI space and where they might be vulnerable.

Takeaways

P0 Priority: The primary goal for frontier labs remains AGI (Artificial General Intelligence) through massive scale, more pre-training, and increased inference-time compute.
The "Product Gap": Because frontier labs focus on general capabilities, there is a "3 to 6 month gap" where bespoke, specialized models can outperform general models on specific company tasks.
Inference Costs: High inference costs are a major pain point for companies using frontier models. Technologies that compress "knowledge" into weights (like Engram’s) could disrupt the current revenue models based on high token usage.
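
The token-cost argument above comes down to simple arithmetic, sketched below. The token counts are hypothetical and chosen only to show what a 100x reduction would imply; the sketch ignores output tokens and the one-time cost of training the knowledge into weights, neither of which the summary quantifies.

```python
"""Back-of-the-envelope input-token arithmetic (all figures hypothetical)."""


def tokens_per_request(context_tokens, query_tokens):
    """Input tokens when shared context is resent with every query."""
    return context_tokens + query_tokens


def reduction_factor(context_tokens, query_tokens):
    """How many times fewer input tokens are needed if the context is
    internalized in the weights and only the query is sent."""
    return tokens_per_request(context_tokens, query_tokens) / query_tokens


# Assumed example: 19,800 tokens of company context, 200-token question.
context_tokens = 19_800
query_tokens = 200
factor = reduction_factor(context_tokens, query_tokens)
print(f"{tokens_per_request(context_tokens, query_tokens):,} -> "
      f"{query_tokens:,} input tokens ({factor:.0f}x fewer)")
```

Under these assumptions, a 100x saving corresponds to a prompt in which the resent context is 99 times longer than the question itself. Shorter contexts give proportionally smaller savings.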

Hardware & Infrastructure Themes

The conversation explores the technical limitations of current AI hardware, specifically regarding memory.

Takeaways

KV Cache Bottleneck: The "KV Cache" (the memory a model uses during a conversation) is described as a "monstrosity." In the example cited in the discussion, the cache for a single Wikipedia article can take up 80GB of high-bandwidth memory (HBM).
Compression Opportunity: There is a massive opportunity for technologies that can compress this "brain state" into smaller, more efficient formats, potentially reducing memory requirements by 1,000x.
The "Bitter Lesson": The founders subscribe to the "Bitter Lesson" of AI: burning more compute on new context is generally more effective than trying to engineer complex, human-like shortcuts.
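
For readers who want to see why the KV cache grows so quickly, the standard size estimate for a transformer is sketched below. The model shape (32 layers, 32 key/value heads, head dimension 128, 16-bit values) is an assumed 7B-class configuration, not one named in the discussion, so the resulting numbers illustrate the formula rather than reproduce the figures quoted above.

```python
"""Rough KV cache size estimate for a transformer (hypothetical config)."""


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch_size=1):
    """Bytes needed to cache keys and values for every token in context.

    The leading 2 accounts for storing both a key and a value tensor
    per layer; bytes_per_value=2 corresponds to 16-bit floats.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * bytes_per_value * batch_size)


# Assumed model shape, for illustration only.
config = dict(num_layers=32, num_kv_heads=32, head_dim=128)

per_token = kv_cache_bytes(seq_len=1, **config)
for seq_len in (1_024, 8_192, 131_072):
    gib = kv_cache_bytes(seq_len=seq_len, **config) / 2**30
    print(f"{seq_len:>7,} tokens -> {gib:6.1f} GiB")
print(f"per token: {per_token / 2**20:.2f} MiB")
```

Because the cache grows linearly with context length, and again with batch size, long documents and many concurrent users quickly put pressure on HBM. Compressing or replacing this state is the opportunity the takeaways describe.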

Key Investment Risks Mentioned

Accuracy vs. Memory Trade-off: New architectures (like State Space Models) often trade off accuracy for memory efficiency. There is currently "no free lunch" in making models more memory-efficient without losing some performance.
Model "Drift" or Destruction: Training a model continuously on new data risks "destroying" the original model's intelligence or making it "go off the rails" if the data is low quality.
Frontier Lab Encroachment: While the founders believe they have a niche, they acknowledge that giants like OpenAI are also thinking about memory and could eventually release features that compete with specialized startups.

Video Description

Dan Biderman and Jessy Lin, co-founders of Engram, are building a neolab around memory and continual learning, which they call two sides of the same coin. Their contrarian premise: instead of stuffing ever-larger prompts into the context window or bolting on RAG, bake a team's knowledge directly into the model's weights, so it knows your company the way an employee of several years does. The payoff: matching or beating frontier models while consuming up to 100x fewer tokens. Working with partners like Microsoft, Notion, and Harvey, the team draws on roots in computational neuroscience and state-space architectures to attack what they see as the real bottleneck in AI: not raw intelligence, but memory and continual learning. In contrast to the frontier labs' race toward one ever-bigger model and AGI, Dan and Jessy imagine a world where everyone has their own model, one that is privately trained, always learning, and good at the things you actually care about. The real ChatGPT moment for memory, they argue, is the day your model feels like an intern that genuinely got smarter overnight.

Hosted by Sonya Huang and Shaun Maguire, Sequoia Capital

00:00 Introduction
00:59 Always Training Explained
01:51 Beyond Context Windows
03:29 Engram Product Overview
04:34 Adapters And Training Signals
05:32 Internalize Vs Externalize
06:49 Compute And Token Savings
08:19 Teams First Then Individuals
08:51 Memorization Vs Understanding
12:47 Dreams And Offline Digestion
14:08 Training Beats Curation
15:19 Why Everyone Needs A Model
21:44 Bitter Lesson And Architecture
24:44 RAG Killer And KV Cache
31:38 Future Of Memory And Models

About Sequoia Capital

Sequoia Capital

By @sequoiacapital

Sequoia helps daring founders build legendary companies from idea to IPO and beyond. We aim to be the first true believers in tomorrow’s most consequential companies. We partner with a few outliers each year and go all-in, providing them with the hands-on help required at every stage of the company building journey. Our expertise comes from nearly 50 years of working with legendary founders like Steve Jobs, Elon Musk, Larry Page, Jan Koum, Brian Chesky, Tony Xu, Lin Qiao, Eric Yuan, Christina Cacioppo, and Patrick Collison. In aggregate, Sequoia-backed companies account for more than 30% of NASDAQ's total value. The vast majority of the money we invest has been on behalf of nonprofits and schools like the Ford Foundation, Mayo Clinic and MIT, which means most of the returns we generate benefit these great causes.