Not to try to do anything predictive, but just to get words right when it would be clear to any human what the intended word would be in context, both grammatically and in subject matter.
I have to assume you could do this pretty well with a vastly smaller model that would run on an iPhone.
I mean, dictation on my iPhone is vastly better than it was 10 years ago -- it's usable for a lot of stuff that it simply wasn't usable for previously (dictating brainstorming ideas while lying on the couch, for example). But it still makes a lot of mistakes and just skips far too many words it can't seem to figure out.