we wanted to experiment on how feasible it is to build a cafe ordering system using chat and voice with 100% local models.
we were able to build this demo using llama 8B for the llm and whisper for tts/stt. it's deployed using kubernetes and could be used a base for any AI enabled application.