May 24, 2026 · 3:51

I Finally Did It...

By Samuel Gregory

About this video

Most people are wasting their money on hardware that cannot actually handle the future of local AI. I am on the verge of purchasing an M5 Max with 128GB of RAM to find the absolute limits of local inference, but I need your feedback first.

Key Takeaways:

  • Base models like the M5 Air are proving insufficient for serious, professional AI usage.
  • Testing is required to find the 'bare minimum' hardware for complex tasks.
  • High-end specs allow us to see exactly how much RAM large context windows and MoE models actually consume.
  • Community input is vital to decide which models and workflows (coding, admin, crawling) should be tested.
  • I am acting as the guinea pig to save you from making an expensive hardware mistake.

Your current laptop is probably a paperweight for modern local AI.

The marketing departments at big tech companies love to tell you that their latest chips are 'AI ready' because of a dedicated neural engine. The reality, however, is far more expensive and memory-intensive than most consumers realise.

I am currently standing on the edge of a significant investment: an M5 Max MacBook Pro with 128GB of unified memory. Before I take the plunge and spend a small fortune, I need to know if this is the path you want to explore with me.

The Base Model Reality

I have spent considerable time putting the M5 Air and the base MacBook Pro through their paces. While they are fantastic machines for general productivity, they hit a wall the moment you ask them to handle complex local inference or large context windows. My conclusion so far is simple: they are not quite there for a professional, local AI workflow.

The 128GB Discovery Plan

By testing a machine with 128GB of RAM, I can monitor exactly how much memory is consumed by specific tasks. This is not about bragging rights; it is about finding the 'sweet spot' for everyone else. I want to see:

  • How 200k context windows actually behave under pressure.
  • The real-world RAM usage of Mixture of Experts (MoE) models.
  • Whether coding assistants running local models can keep up with professional developers.
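Before any real measurements exist, a rough back-of-the-envelope estimate shows why long context windows and MoE models are memory-hungry. The sketch below uses the standard approximation that the key/value cache grows linearly with context length, and that an MoE model must keep all of its experts in memory even though only a few are active per token. Every number in it (layer count, KV heads, head dimension, parameter count, quantisation) is a hypothetical placeholder, not the spec of any Qwen or Gemma release, and real runtimes add their own overhead on top.

```python
def weights_gib(total_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights, in GiB.

    For an MoE model, pass the TOTAL parameter count: every expert
    has to be resident in memory, not just the active ones.
    """
    return total_params * bits_per_weight / 8 / 2**30


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size, in GiB, for one sequence.

    The leading 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes a 16-bit cache (an assumption).
    """
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * context_len * bytes_per_value)
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    shape = dict(n_layers=48, n_kv_heads=8, head_dim=128)

    # Hypothetical 30B-parameter MoE model quantised to 4 bits per weight.
    weights = weights_gib(30e9, 4)
    print(f"weights: {weights:.1f} GiB")

    for ctx in (100_000, 200_000):
        cache = kv_cache_gib(context_len=ctx, **shape)
        print(f"context {ctx:>7,}: cache {cache:.1f} GiB, "
              f"total {weights + cache:.1f} GiB")
```

With these placeholder numbers, the 200k-token cache alone comes to roughly 36.6 GiB, more than twice the size of the quantised weights. An estimate like this only says what to expect; the point of testing on a 128GB machine is to see what the memory usage actually is.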

Use Me As Your Guinea Pig

I want to find the bare minimum hardware required by observing how much of the maximum hardware is actually utilised. Do not spend your hard-earned money until we know where the point of diminishing returns lies.

What models should I test first? Are you more interested in Qwen, Gemma, or something more obscure? Let me know what you want to see.

Transcript

Guys, I finally did it. No, I really didn't. I'm on the cusp of buying an M5 Max with 128 gig of RAM, but I need your feedback. I need your input before I invest in such a machine.

Lately, the channel's been doing really well discovering what the bare minimum hardware is that we need to run local AI models. I get a lot of comments on those videos with people just laughing and saying, 'Oh, I need the maximum machine and all this and all that.' I know you could go to the moon and back with the amount of money that you could spend on the maximum amount of hardware that you need to run these models. Of course, that just makes so much sense. But I think, like many of my viewers, and probably you yourself included, we're interested to know whether the claims that these companies are making are in fact true. What is the bare minimum hardware we need to run local AI?

You know, I've got a series looking at the M5 Air. I've got a whole series looking at the M5 MacBook Pro, just the base models, and what those machines are capable of. Go watch those, but ultimately I come to the conclusion that they aren't quite there for any real AI usage.

So, I need to know what you guys are doing with your local AI. Are you doing coding? Are you just chatting with it? Are you actually doing administration? Are you doing open crawl? What tools are you using? Are you using OpenCode? Are you using Kilo Code? Are you building your own tools with the local AI, or are you building your own tools with, say, Claude, but actually running the inference against a local model within that tool? I don't know. You tell me exactly what you guys are doing with your local AI.

I also want to know what models you're running. I've discovered or talked a lot about Qwen and Gemma because these are kind of the two competing products for local AI models on consumer-level hardware. What models do you want to see? And how can I... And are they the small models, the 9 billion parameter models, the larger mixture of experts models, or the new Gemma dense model that they brought out?

And ultimately, what do you want to see? Because buying such a big machine like that will just give me a lot of headroom. But in the spirit of this channel and what it has become recently, we're really interested in what the bare minimum hardware is. Now, I could have 120 gig, but it doesn't mean I should push that, because I want to know what the bare minimum is. Like, you know, if I run a larger model with 100,000 context or 200,000 context or whatever, what is the RAM looking like? Even though I've got 128 gigs theoretically, is it pushing 60? Is it pushing 70? That will give you an idea of what Mac you can buy, or what the bare minimum hardware is, and what performance you can expect from that.

So again, what do you want to see if I was to invest in such a machine? Because at the end of the day, it's for you. It's for us. It's for discovery of what we can do and what the landscape is of local AI.

So I'm sorry for the rug pull. I don't have the machine yet, but if I get enough interest and I get enough curiosity and a clear outline about what it is that you want to see with such a machine... if there's enough content that I could theoretically produce, I'm more than happy to go out and get that machine. I'm more than happy to make the exact content that you want to see. So yeah, let me know down in the comments. Give it a like. Share it with some people who are interested in this sort of stuff. Use me as the guinea pig that I am before you go out and spend your hard-earned money. So, like, subscribe if you haven't already. Thanks to my patrons who support me directly. Till next time, keep on vibing.