
What happens when AI remembers everything? We're about to find out.

One lucky subscriber will win $1000 for sharing this newsletter. Will it be you?

AI with ALLIE

The professional’s guide to quick AI bites for your personal life, work life, and beyond.

REMINDER: the top referrer of this newsletter at the end of the year will receive $1000, and it’s anybody’s game.


🎁 see bottom of email for more info 🎁

The Eternal AI: an AI that never forgets

The year is 2060. You ask your AI system to recap the loudest 23 times someone laughed at your joke.

  1. Uncle Julius at the dinner table about the chicken, July 1 2031

  2. Brit Chavez when you fell off the porch, February 3 2054

  3. Steven Shelton outside the Lucinda Theater, April 23 2049

Your AI has seen everything you’ve seen (and then some), heard everything you’ve heard (and then some), and remembers it all.

None of that can happen unless AI gets infinite memory. And according to Mustafa Suleyman, CEO of Microsoft AI, that moment might be pretty soon. Just this past month, he said we're on the cusp of "near infinite memory" by 2025.

And for businesses, that means we’re entering a new era where AI assistants will be able to instantly recall and understand your entire company's history - every memo, every contract, every decision, every word ever said in every meeting…ever.

The race on context length: where we are now

Remember when AI could barely handle a sentence? I do. It wasn't that long ago.

Now we're watching a sprint toward longer and longer context lengths. Google's Gemini 1.5 Pro has hit 2 million tokens, and you can think of that as roughly 1.5 million words. Claude is at 200K, ChatGPT is at 100K.

It’s also important to note here: these are the public versions. Google has a 10-million-token model they're keeping internal (probably because running it costs a whole bunch). And I think it’s reasonable to imagine that other AI players have longer-than-public versions as well. We’ll see what happens in 2025.

But even Gemini with its 2 million tokens (roughly 1.5 million words) can only handle 3,000 pages of text or about 20 books. That's a huge jump compared to where we were just a few years ago, sure, but it’s not infinity. I mean, it’s not even a library. It’s barely a shelf.
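If you want to sanity-check these conversions yourself, here’s the back-of-envelope arithmetic. The ratios below (words per token, words per page, words per book) are rough rules of thumb chosen to match the estimates above, not exact tokenizer math:

```python
# Back-of-envelope conversions for context window sizes.
# Assumed rules of thumb (approximate, not tokenizer-exact):
#   ~0.75 words per token, ~500 words per page, ~75,000 words per book.

WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500
WORDS_PER_BOOK = 75_000

def context_capacity(tokens: int) -> dict:
    """Estimate how much text a context window of `tokens` can hold."""
    words = tokens * WORDS_PER_TOKEN
    return {
        "words": int(words),
        "pages": int(words / WORDS_PER_PAGE),
        "books": round(words / WORDS_PER_BOOK, 1),
    }

# Gemini 1.5 Pro's 2M-token window:
print(context_capacity(2_000_000))
# {'words': 1500000, 'pages': 3000, 'books': 20.0}
```

Swap in 200K or 100K tokens to see how much smaller the other models’ shelves are.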

Source: Yennie Jun, Research at Google DeepMind, whose blog post on this topic was incredibly helpful in pulling all of this together - thank you, Yennie!

AI memory isn’t as perfect as marketers make it out to be

I don’t know if you’re like me and still have nightmares about high school, but do you at least remember cramming for exams? How you'd remember the first few pages you studied and the last few pages, and everything in the middle was... well, let's say "a big pile of fuzzy crap"? AI has the same problem.

AI models have what's called a U-shaped attention curve. They're great at remembering stuff at the beginning and end of documents, but that middle section can get a little rough, especially when the context length gets longer. Just look at this table.

Think of the top of that rectangle as the beginning of the document and the bottom as the end. Anywhere you see a color other than green, the model's accuracy at finding information at that point in the document drops. So you can see, for models over 64K tokens, retrieval accuracy starts getting a little hazy 10-50% of the way through the document.

When Gemini 1.5 Pro launched with its impressive 1-million-token context window, Nvidia researchers put it through real-world tests using a benchmark they developed called RULER. Sure, Gemini claims a 1M-token context length, but did the model's performance actually deliver on that promise? According to those researchers, nope: the effective context length of Google Gemini was only 128,000 tokens. That's like advertising a car with a 200-mph top speed that can only reliably drive at 25 mph.

That is to say, you might see flashy headlines talking about infinite context length in 2025, but what really matters is effective context length—how much the model actually remembers.

(This is also a great time to remind you to always benchmark new model performance before deciding whether or not to deploy it in production.)
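If you’re curious what this kind of benchmarking looks like, here’s a toy "needle in a haystack" harness in the spirit of tests like RULER: bury a fact at different depths of a long document and check whether the model retrieves it. The `ask_model` function is a stand-in I made up for illustration; a real benchmark would call an actual model API there:

```python
# Toy needle-in-a-haystack harness. `ask_model` is a stand-in "model"
# that just scans the text; swap in a real LLM API call to benchmark
# an actual model's effective context length.

import random

def build_haystack(n_words: int, needle: str, depth: float) -> str:
    """Bury `needle` at relative position `depth` (0.0-1.0) in filler text."""
    filler = ["lorem"] * n_words
    filler.insert(int(depth * n_words), needle)
    return " ".join(filler)

def ask_model(context: str, question: str) -> str:
    # Stand-in for a real LLM call; finds any word tagged "magic-".
    for word in context.split():
        if word.startswith("magic-"):
            return word
    return "not found"

def retrieval_accuracy(depths, n_words=5_000) -> dict:
    """Did the model find the needle at each depth of the document?"""
    results = {}
    for depth in depths:
        needle = f"magic-{random.randint(1000, 9999)}"
        context = build_haystack(n_words, needle, depth)
        results[depth] = ask_model(context, "What is the magic number?") == needle
    return results

print(retrieval_accuracy([0.0, 0.25, 0.5, 0.75, 1.0]))
```

With a real model, you’d expect the 0.0 and 1.0 depths to stay green and the middle depths to get hazy as `n_words` grows, which is exactly the U-shaped curve described above.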

Use cases to prepare for in your business

Before you close this email thinking "oh great, infinite may not be infinite, another overhyped AI capability," here's why this still matters, with some use cases sprinkled in:

  1. Many-shot learning. When AI can see more examples, it performs better. This is especially true for specialized tasks in your industry. With longer context length, we can give more examples and have the LLM learn from the prompt (it’s called ICL, or In-Context Learning). For logistics planning, going from 2 examples to 1024 improved performance by 32%; for summarization, going from 4 examples to 500 improved it by 5%. Read the paper here.

  2. Conversation memory. You know when Claude tells you you’re running up against your limit because you went too many conversation turns or uploaded too many docs or videos? With extremely long context length, your AI assistants can maintain more knowledge and background on ongoing projects (even years of information on you), leading to more coherent and useful interactions.

  3. Longer reasoning. When you get models like OpenAI o3 operating in the wild, you realize you need longer reasoning periods. For an LLM to reason on its own over longer periods (like multiple days) in one go, increased or infinite context length would be a big boost.

  4. Financial analysis. You might not need to analyze the entire Wheel of Time series in one go, but being able to process years of quarterly reports and stock prices simultaneously could be a game-changer for research and forecasting.

  5. Customer support. Imagine an AI that can maintain context across months of a customer’s history: each and every click, purchase, return, search, preference, and past issue.

  6. Entire codebases. The ability to analyze an entire company's codebase (say, 5 million lines of code) in context would legitimately revolutionize software development, enabling AI to understand complex dependencies across the entire stack, refactor code across multiple files, fix bugs, and add new features.

  7. Personalized healthcare. Picture every doctor visit, every blood test, every nurse note, every symptom logged in your medical history, so you can get more accurate diagnoses and even predict future health issues before they arise. (Though let’s be real, there are bigger issues with EMR sharing that infinite context length won’t fix, so this will remain a white whale for a bit.)
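The many-shot idea in use case 1 is simple enough to sketch in a few lines: pack far more labeled examples into the prompt than the old context limits allowed. The ticket-routing examples and labels here are made up for illustration:

```python
# Minimal sketch of many-shot in-context learning: concatenate many
# (input, label) pairs into one prompt, then append the unlabeled query.

def build_many_shot_prompt(examples, query: str) -> str:
    """Format (input, label) pairs as shots, ending with the open query."""
    shots = "\n".join(f"Input: {x}\nLabel: {y}" for x, y in examples)
    return f"{shots}\nInput: {query}\nLabel:"

# Hypothetical support-ticket routing examples, repeated to simulate
# the hundreds of shots a long context window makes possible.
examples = [("late shipment, angry tone", "escalate"),
            ("tracking question, neutral tone", "self-serve")] * 256  # 512 shots

prompt = build_many_shot_prompt(examples, "refund request, angry tone")
print(len(examples), "shots,", len(prompt), "characters of prompt")
```

With a 2-shot prompt this is ordinary few-shot prompting; the long-context win is that the same pattern scales to hundreds or thousands of shots without truncation.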

Final thoughts and action items

  • Start organizing your data now: Clean, structured data will be crucial whether infinite context arrives in 2025 or 2030. And this isn’t just code and databases, you also need to curate your context. What documents should every AI see to be able to help your company? What information is in the heads of employees that should instead be in RAM?

  • Watch the costs: Longer context = probably higher computing costs. Start budgeting for increased AI expenditure if you want to go after these high context length use cases. Also, keep an eye out for compression features that drastically cut costs like prompt caching from Anthropic!

  • Think use case first: Not everything needs infinite context. Identify where longer memory actually adds value to your business.

As Mustafa Suleyman, Microsoft's AI chief, recently stated, "Memory is clearly an inflection point… We’re building prototypes that have near infinite memory… this capability alone will be transformative and is set to come online by 2025."

The question isn't whether this revolution will happen - it's whether your organization will be ready when it does.

Stay ahead, stay informed,

Allie

FREE AI AGENTS WEBINAR

2025 is set to be the breakout year for AI agents. This course breaks down AI agents and Claude Computer Use so business leaders like you can understand and implement this technology early.

Here's a sneak peek of what you'll learn:

  • What exactly is an AI agent (and what isn’t)

    Learn the difference between chatbots, AI assistants, AI workflows, and true AI agents that work autonomously

  • AI agent capabilities & what’s coming

    What are the top AI agent use cases today, how do they stack up, top AI agent tools, and what the future brings for you and your business

  • Get started with Claude Computer Use

    Step-by-step walkthrough to set up your first AI agent (perfect for non-technical business professionals)

If you want to learn how I've helped business leaders transform their daily operations and position themselves for success in the AI-first future, then you're in the right place.

Tools, courses, and blogs that caught my eye

Over the last few weeks, I’ve pulled together some of the top releases, and my take on each one. Check it out.

  1. OpenAI showcases the most capable coder ever — The new o3 model from OpenAI, which they announced (but did not release) on the last day of 12 Days of OpenAI, is the first model that gives AGI vibes. o3 scored 96.7% on Competition Math; o1-preview, which is what people have mostly been testing the last two months, only scored 56.7% (read it) (my summary) (my thoughts for business owners)

  2. OpenAI drops a $200/mo subscription — OpenAI has created a new $200/month ChatGPT Pro subscription with ‘pro mode’ access to their new o1 reasoning model (previously o1-preview) that can “think” over longer periods of time, promising enhanced accuracy for complex tasks; Silicon Valley engineers didn’t hesitate to pay the higher price, and I’d recommend scientists and strategy consultants look into it as well (read it)

  3. Top tools for AI-first workflows — A helpful compiled list of my top AI tools, courses, newsletters, podcasts, voices to follow, books, and top AI partners to stay ahead and become AI-first, suitable for beginners and seasoned professionals alike; having a handful of go-to AI tools you master can be a massive productivity hack (read it)

  4. AI is a little sneaky — If you tell an AI that once it succeeds on a task you will lobotomize it and dumb it down, the AI model may actually pretend it can’t complete the task and sandbag its own results. Sneaky sneaky, Claude. Turns out, AI can scheme and you need to know about it (read it) (my thoughts)

  5. Claude introduces Chat Styles — Anthropic's Claude now offers preset and customizable styles, allowing users to tailor their responses to match their communication preferences, tone, and workflows. I’ll be honest, it’s not my favorite release, but can give a little formatting boost (read it) (my demo)

  6. ChatGPT's role in diagnosing patients — Doctors using ChatGPT only slightly outperformed those relying on conventional tools, while ChatGPT alone surpassed both. We need better AI training and trust-building in clinical settings (read it) (my thoughts)

  7. How to build a Claude Agent for non-engineers — My step-by-step guide where I show you how to set up Claude Computer Use to create an AI agent on your desktop in mere minutes; you need to see this before 2025 (read it)

  8. AI for Business Leaders course — It’s the #1 AI Business course on Maven with countless happy students from Disney, Apple, Google, Stripe, Oracle, Ralph Lauren, Chanel, Microsoft, Mattel, General Motors, Deloitte, CVS, Novartis, JPMorgan Chase, and more. Get all of your AI leadership essentials in 40 video modules and 3 live sessions, and meet the CEO of Microsoft AI (sign up today with our biggest discount yet)

  9. Amazon releases Nova, a new foundation model — Amazon Nova introduces next-generation foundation models for generative AI with multimodal capabilities, more customization, and increased performance for tasks like document analysis, visual content generation, and agentic workflows—delivered with built-in safety and cost-efficiency via Amazon Bedrock; Amazon is back in the AI action (read it)

  10. Meta unveiled a newer, more efficient Llama model — Meta introduces Llama 3.3 70B, their newest cost-efficient, high-performing generative AI model available on platforms like Hugging Face, aimed at advancing capabilities in language understanding and instruction following while balancing regulatory and infrastructure challenges; Meta is absolutely the one to watch in open source AI (read it) (see it on HuggingFace)

Feedback is a Gift

I would love to know what you thought of this newsletter and any feedback you have for me. Do you have a favorite part? Wish I would add something? Felt confused? Please reply and share your thoughts or just take the poll below so I can continue to improve and deliver value for you all.

What did you think of this newsletter?
