The singularity is here.-
I wonder if llama.cpp with SYCL could help, will have to try it out: https://github.com/ggerganov/llama.cpp/blob/master/README-sy...
But even if that worked, I'd still have the problem that IDEs and whatever else I have open already eats most of the 32 GB of RAM my desktop PC has. Whereas if I ran a small code model on the MacBook and connected to it through my PC, it'd still probably be too slow for autocomplete, when compared to GitHub Copilot and less accurate than ChatGPT or Phind for most stuff.
Anyway, my tale of woe and recovery re the great centralized chatbot outage was turning to use MLCChat on my phone running a Mistral 7b model and being happily surprised it could actually summarize and answer questions about (though slowly) an article I didn't want to read in its entirety. Although, I didn't quite trust it, so I read the article anyway. The summary wasn't too bad, though. Good to know.
The OpenAI API meanwhile is still up :)
This is what I'm getting from Claude.
Do they make calls to each other?