As I hinted at in Retiring Almost Everything, I've been looking into how I can use artificial intelligence. A couple months ago this would have elicited a shrug from me, but I got some assistance and there's a prototype coming together.
Origins:
First, an engineer's-eye view of what I've done in the past. If you've ever looked at the Neal Rauhauser Figshare, you know that I used to run a large-scale social media activity tracking practice. This involved making content searchable with Elasticsearch, then later ArangoDB. I often used Gephi for data visualization, and I started playing with Graphistry about the time that business ran out of steam. The need is still there, but the Twitter API went from free to a $500k/month minimum, which excluded civil society organizations, and then the wave of bots began.
Disinfodrome is a document search service that uses Open Semantic Search. I've said much less about it, but I've spent time with Paliscope and YOSE. I had a go at moving from OSS to using MayanEDMS, but it's less capable of the sort of free-form digging that's done with the document collections I curate. I managed to come up with a repeatable install for OSS, which has not been updated in quite a while, and that's OK since my offerings are hidden behind Cloudflare Access.
Possible Future:
AI is an absolute madhouse; new companies, products, and alliances are announced every week. I got some tips from a friend, and they've led to something that both works on my new laptop and looks like something that people who use Open Semantic Search could put to work.
This may not be the best solution, but it's one I can get running and show within a couple of days. The basic layout is simple: the LLM acts as a sort of dictionary/thesaurus/grammar checker, and a vector database of your documents supplies the actual knowledge, so the model doesn't just make up stuff that sounds good.
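To make that layout concrete, here is a minimal sketch of the retrieval half, assuming sentence-transformers for the embeddings; the model name, the sample chunks, and the prompt format are placeholders rather than anything final:

```python
# Minimal retrieval-augmented sketch: embed document chunks, find the ones
# closest to the question, and hand them to the LLM as grounding context.
# Assumes: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, CPU-friendly embedder

# In practice these chunks come from the vectorized document collection.
chunks = [
    "Example passage from document A.",
    "Example passage from document B.",
    "Example passage from document C.",
]
chunk_vectors = model.encode(chunks, normalize_embeddings=True)

question = "What does document B say?"
q_vector = model.encode([question], normalize_embeddings=True)[0]

# Cosine similarity reduces to a dot product on normalized vectors.
scores = chunk_vectors @ q_vector
top = [chunks[i] for i in np.argsort(scores)[::-1][:2]]

# The retrieved chunks become the context the LLM is told to answer from.
prompt = "Answer using only this context:\n" + "\n".join(top) + f"\n\nQuestion: {question}"
print(prompt)
```

The real version swaps the in-memory list for a proper vector database and sends the assembled prompt to the LLM.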
The software challenges are:
Finding the right LLM.
Picking a workable vector database.
Vectorizing a mountain of PDFs (see the sketch after this list).
Restoring ArangoDB/Elasticsearch and integrating them.
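For the PDF vectorization item, the shape of the job is extract, chunk, embed. A rough sketch assuming pypdf and the same placeholder embedder as above; the directory name and chunk size are arbitrary:

```python
# Sketch of turning a directory of PDFs into embedding vectors.
# Assumes: pip install pypdf sentence-transformers
from pathlib import Path
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
CHUNK_CHARS = 1000  # arbitrary chunk size; tune against the embedder's limits

def pdf_chunks(path):
    """Extract text from one PDF and split it into fixed-size chunks."""
    text = "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)
    return [text[i:i + CHUNK_CHARS] for i in range(0, len(text), CHUNK_CHARS)]

records = []
for pdf in Path("documents").glob("**/*.pdf"):
    chunks = pdf_chunks(pdf)
    vectors = model.encode(chunks, normalize_embeddings=True)
    records.extend(zip([str(pdf)] * len(chunks), chunks, vectors))

print(f"{len(records)} chunks embedded and ready for the vector database")
```

The output records (source file, chunk text, vector) are what would get loaded into whichever vector database wins.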
And then there's the matter of capital cost. I've solved the development system problem with the new laptop, but one of my elderly rackmounts must give way to a newer system. The minimum is a Dell PowerEdge R730XD like this one. These ship with the very first generation of processors to support AVX2, and I have to stick to this model because I own a dozen high-capacity 3.5" NAS drives. Shifting to 2.5" disks would cost a fortune.
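As an aside, AVX2 matters because CPU-side LLM inference leans on it heavily, and on Linux support is easy to confirm from the CPU flags. A tiny check, assuming a Linux host:

```python
# Quick Linux-only check for AVX2 support in the CPU flags.
flags = set()
with open("/proc/cpuinfo") as cpuinfo:
    for line in cpuinfo:
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
print("AVX2 supported" if "avx2" in flags else "no AVX2 on this CPU")
```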
And the other piece that's needed is an Nvidia RTX 40x0 GPU. The GTX 10x0 cards I have now are enough for some experiments, but that's about it. The sweet spot is $500 for a 16GB RTX 4060. The first-generation RTX 30x0 cards look attractive … until you talk to someone who has used both types. They're a penny-wise, pound-foolish sort of thing.
Conclusion:
Longtime readers are aware that I am slowly standing back up after more than a decade of illness. Getting an R730 in a rack with an RTX 4060 is going to cost around $850 to $1,000, and going from a 1U to a 2U machine will take my monthly cost from $100 to $200.
And that's not gonna happen without an assist from you guys.
As I've said before, those who need the conflict skills I'm attempting to transfer with this Substack are going to be irregulars, often in countries with dramatically lower median incomes than the U.S. If I make the good stuff paid, I would lose the audience who needs these things the most. But I really do need an assist in the form of some of you signing up as founders. If each of the eighty-five of you who subscribe came up with $10, that would just cover the cost of this move.