The lack of local LLM compute resources has been stressing me out. Last weekend I was not in a place to write any software, so I stuck my head down the old hardware rabbit hole, and came up with a prize.
Attention Conservation Notice:
Compulsive hardware tuner mumbles about esoteric aspects of museum grade equipment.
Happiness:
The 2012 vintage HP Z420 workstation that was my desktop from 2019 to 2024 is really showing its age. That whole “not wanting to get up in the morning” stuff that precedes “never gonna run again” is why it got set aside eighteen months ago. The other retirement reason had to do with its Xeon E5-2695v2 - it lacks the AVX2 instruction set that’s a minimum to run LMStudio.
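For anyone wondering whether their own old box clears the AVX2 bar, here is a minimal sketch of how to check on Linux. It assumes the usual `/proc/cpuinfo` layout with a `flags` line; the function name `cpu_flags` is just something I made up for illustration.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. non-x86 or unexpected format)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; assumed to exist on a Proxmox host
    if path.exists():
        flags = cpu_flags(path.read_text())
        for name in ("avx", "avx2", "avx512f"):
            print(f"{name}: {'yes' if name in flags else 'no'}")
```

On an Ivy Bridge Xeon like the E5-2695v2 I would expect `avx` to show up and `avx2` not to.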
I put ChatGPT on the problem and found that while the old Xeon won’t run LMStudio, it’s perfectly happy running Ollama. The only drives I had were the set in Opti, but Proxmox is really forgiving. I moved them into the Z420 and the OS just booted right up as if nothing happened. Again, with ChatGPT handy, I got a working Nvidia CUDA setup in about 90 minutes, most of which was waiting for downloads and installs. This would have taken me a whole weekend without AI assistance.
So I’m back to having a pretty potent Proxmox setup - a dozen cores and 128GB of RAM. It’s only got 1TB of mirrored SATA storage until I figure out cabling to move the pair of 5TB drives inside the box, but that’s a want, not a need. I don’t have all that much virtualization left, having shut Disinfodrome down nine months ago and just using a 16GB Mac for a year and a half.
And here’s the really happy part - this nine-year-old Nvidia GTX 1060 is roughly double the speed of the M1 MacBook Air parked next to it.
Future:
The above is a really nice find. I still have some hardware WANTS, but I do not have any hardware NEEDS at this time. This equipment will carry me through the startup’s funding process, assuming it stays running.
The workstation’s replacement would be an HP Z4 G4 with a Xeon W-2145. That replaces the out-of-date AVX instruction set with the much faster AVX-512. The advanced vector extensions can provide something like a 30% boost in performance for software that can take advantage of them. I think this would be about $350.
There are several candidates to replace the GTX 1060. The simplest path is an Nvidia RTX 5060 Ti - a 16GB card for $430. There are some old datacenter cards that are enticing for me - the 16GB Tesla M60 is actually a pair of 8GB GPUs meant for VDI - virtual desktops. Performance-wise they appear to be a pretty close match to the GTX 1060 and the price is amazing - around $40 on eBay. Since I’m running Proxmox I could use the two slices to create two virtual machines running different models. Given the “free, basically” pricing I’ll bring one of these in just as soon as November rent is covered.
While it will be nice to have funding (cross fingers) and some hardware made in this decade, there’s a lot of value in going at things with the oldest junk you can get that still runs. When the time comes to scale things up, I will already have been hands on, squeezing the most capability out of things.
Conclusion:
October has been a month of narrow escapes. It would appear that November will be the month of no escapes. It’s nice to be going into it with three fairly capable machines (Pinky, Brain, Opti reincarnated) that offer performance in the twenty to forty tokens per second range.
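For the curious, here is a rough sketch of how a tokens per second figure can be worked out from an Ollama response. It leans on the `eval_count` and `eval_duration` fields that Ollama reports with a finished generation, the latter in nanoseconds. The helper name `tokens_per_second` and the sample numbers are made up for illustration and are not measurements from my machines.

```python
def tokens_per_second(response: dict) -> float:
    """Compute generation speed from an Ollama response dict.

    Assumes 'eval_count' (tokens generated) and 'eval_duration'
    (time spent generating, in nanoseconds) are present.
    """
    tokens = response["eval_count"]
    seconds = response["eval_duration"] / 1_000_000_000  # ns -> s
    if seconds <= 0:
        raise ValueError("eval_duration must be positive")
    return tokens / seconds


# Illustrative values only: 300 tokens generated in 10 seconds.
sample = {"eval_count": 300, "eval_duration": 10_000_000_000}
print(f"{tokens_per_second(sample):.1f} tokens/s")
```

Running the same prompt through each box and comparing this number is a simple way to line the machines up against each other.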
I do still need the additional subscriptions I mentioned in Complete Commercial Collapse, but this workstation necromancy has headed off the capital costs. There’s no maneuver warfare here. These battles are going to be won on a trench by trench basis, just gotta slog along, no matter what resistance the world offers.