undefined | Better HN

0 pointsseanmcdirmid9mo ago0 comments

LocalLLMs would be useful for low latency local language processing/home control, assuming they ever become fast enough where the 500ms to 1s network latency becomes a dominate factor in having a fluid conversation with a voice assistant. Right now the pauses are unbearable for anything but one way commands (Siri, do something! - 3 seconds later it starts doing the thing...that works but it wouldn't work if Siri needed to ask follow up questions). This is even more important if we consider low latency gaming situations.

Mobile applications are also relevant. An LLM in your car could be used for local intelligence. I'm pretty sure self driving cars use some about of local AI already (although obviously not LLM, and I don't really know how much of their processing is local vs done on a server somewhere).

If models stop advancing at a fast clip, hardware will eventually become fast and cheap enough that running models locally isn't something we think about as being a non-sensical luxury, in the same way that we don't think that rendering graphics locally is a luxury even though remote rendering is possible.

0 comments

3 comments · 1 top-level

dgacmu9mo ago· 2 in thread

Network latency in most situations is not 500ms. The latency from New York California is under 70ms, and if you add in some transmission time you're still under 200ms. And that's ignoring that an NYC request will probably go only to VA (sub-15ms).

Even over LTE you're looking at under 120ms coast to coast.

seanmcdirmidOP9mo ago

You have to take any of those numbers and multiply them by two, since you have to go there and then back again.

dgacmu9mo ago

No, those were round trip times already

j / k navigate · click thread line to collapse