undefined | Better HN

0 pointsthot_experiment1y ago0 comments

I guess it depends on expectations, if your expectation is an CRUD app that opens in 5 seconds, then sure, it's definitely tedious. People do install things though, the companion app for DJI action cameras is 700mb (which is an abomination, but still). Modern games are > 100gb on the high side, downloading 8-16gb of tensors one time is nbd. You mentioned that there are 663 different models of dsr1-7b on huggingface, sure, but if you want that model on ollama it's just `ollama run deepseek-r1`

As a developer the amount of effort I'm likely to spend on the infra side of getting the model onto the user's computer and getting it running is now FAR FAR below the amount of time I'll spend developing the app itself or getting together a dataset to tune the model I want etc. Inference is solved enough. "getting the correct small enough model" is something that I would spend the day or two thinking about/testing when building something regardless. It's not hard to check how much VRAM someone has and get the right model, the decision tree for that will have like 4 branches. It's just so little effort compared to everything else you're going to have to do to deliver something of value to someone. Especially in the set of users that have a good reason to run locally.

0 comments

No comments yet.