I suspect this AI <-> Human engagement style will evolve over time to become quite unlike human-to-human engagement, probably mixing speech with short tones for standard responses like "understood", "will do", "standing by" or "need more input". In the future, these old-time demo videos where an AI is forced to do a creepy caricature of an awkward, inauthentic human will be embarrassingly retro-cringe. "Okay, let's do it!"
It's a very impressive gimmick, but I really think most people don't want to interact with computers that way. Since Apple pulled that "feature" after a few years, it's probably not just a nerd thing.
I guess it's just biased toward average Californian behavior and speech patterns.
The benchmark for human-computer interaction should be "Tea, Earl Grey, hot", not awkward and pointless small talk.