Since this is getting a bit of interest, here's one more demo of this
https://youtu.be/cvKUa5JpRp4 This demo shows even lower latency, plus the ability to handle very large menus with lots of complicated sub-options (this restaurant has over a billion option combinations to order a coffee). The latency is negative in some places, meaning the system finishes predicting before I finish speaking.