If you follow tech news, you have probably seen an influx of videos reviewing things like the Lenovo ThinkStation PGX, a $4,100 (or more) mini PC built around an Nvidia chip that pairs an Arm CPU with a GPU. It and its counterparts from Asus, Nvidia, and others are really designed to be small servers that host models locally for whatever you want: make your own Alexa, automate your house, or even... code.
On the Apple side, we will likely see a Mac Studio launch with an M5 Ultra option very soon. Apple already lets you order machines with a good amount of unified memory, so that isn't likely to change, but the GPU is. If the rumors are correct, it should be competitive with the DGX- and PGX-class machines mentioned above. So Apple might suddenly find itself in the AI race after all, not with its software, where it is behind, but in the professional and prosumer AI hardware market. The size of that market isn't exactly known, for reasons that might become obvious later in this post, but that makes Apple stock a strong buy for me. RAM prices are plain old broken right now, but Apple already gets away with charging ridiculous prices for memory, so it might get a nice revenue bump.
But what's interesting is that these machines aren't really meant for FP16 or FP32 models. They can run them... fine, but really only for one user doing one thing at a time. They are really meant for quantized, FP8-class models (GGUF, GPTQ, and similar formats). That is plenty of power for something like coding, but models do suffer a 5-10% accuracy penalty when shrunk down like that, at least for now. So it could mean companies are planning to improve the 70B-class models you'd most commonly see running on these. It could also mean they want a clear separation, and a larger market, for that slice. It's hard to say, but it's all interesting to watch.
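To put rough numbers on why precision matters so much here, this is a back-of-the-envelope sketch of how much memory the weights of a 70B-class model need at different precisions. Treat it as an estimate under stated assumptions, not a benchmark: it counts weights only (no KV cache, activations, or OS overhead), the 4-bit row is my own addition for comparison, and the 128 GB memory budget and the 0.8 headroom factor are illustrative values I picked, not specs taken from this post.

```python
# Rough weights-only memory estimate for a model at different precisions.
# Ignores KV cache, activations, and runtime overhead, so real usage is higher.

# Bytes per parameter. "4-bit" is included only for comparison (an assumption).
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "FP8": 1.0, "4-bit": 0.5}


def weight_gb(params_billions: float, precision: str) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    # (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[precision]


def fits(params_billions: float, precision: str,
         memory_gb: float, headroom: float = 0.8) -> bool:
    """True if the weights fit in `headroom` of the memory budget.

    `headroom` is a hypothetical safety factor that leaves room for
    everything this estimate ignores.
    """
    return weight_gb(params_billions, precision) <= memory_gb * headroom


if __name__ == "__main__":
    MEMORY_GB = 128  # assumed unified-memory budget, for illustration only
    for precision in BYTES_PER_PARAM:
        size = weight_gb(70, precision)
        verdict = "fits" if fits(70, precision, MEMORY_GB) else "does not fit"
        print(f"70B @ {precision}: ~{size:.0f} GB of weights, {verdict}")
```

Under those assumptions, a 70B model only fits comfortably once it is at FP8 or smaller, which lines up with the idea that these boxes are aimed at quantized models rather than full-precision ones.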
The influx of reviews probably means one of two things (or a bit of both): the companies releasing these second-generation devices are putting advertising money and effort into them and sending test units out to reviewers, or prices have come down to the point where it makes sense for content creators to show off these new toys to the prosumer market.
Platforms like OpenAI are already integrating ads, and they really need to in order to reach any level of sustainability. That means we are already marching towards the enshittification of AI, which will likely make this new class of PCs enticing for tech enthusiasts with some money to spend. The frustration will likely drive continued improvements in open source AI tooling, training, and models, feeding that demand even more.
That is all to say, my prediction for 2026 (but more likely 2027) is the rise of these new devices and FP8-class models. We should be seeing lots of improvements in those models soon... unless AI takes all of our jobs and there's no one left to be a customer. That could happen too, maybe.