Versus properly enunciating "Kah-Pee f-i-l-e-1 to f-i-l-e-2"
enunciating: 'copy snake-geary-street-financial-report snake-divisadero-street-financial-report'
versus typing: 'cp gearyStreetFinancialReport divisaderoStreetFinancialReport'
If you're trying to exactly replicate something designed (and named) for text input, you're absolutely right, but I thought we were talking about hypothetical designed-for-voice systems.
'cp g-[TAB] divisaderoStreetFinancialReport'
I'd expect that to be an advantage of voice input: you can go fast in new kinds of large-scope contexts, maybe even a whole-machine context. A system designed from the ground up could exploit that in interesting ways.
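To make that a bit more concrete, here's a rough sketch of how a designed-for-voice system might resolve a couple of spoken words against names in a wider context, instead of making you spell out the full identifier. Everything in it is hypothetical: `words_in`, `resolve`, and the file list are made up for illustration, and a real system would have to handle speech recognition errors and ambiguity, which this ignores.

```python
import re

def words_in(name):
    """Split a camelCase, snake_case, or kebab-case name into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]

def resolve(spoken, candidates):
    """Return the candidates whose words include every spoken word."""
    wanted = set(spoken.lower().split())
    return [c for c in candidates if wanted <= set(words_in(c))]

# Hypothetical context: names the system already knows about.
files = [
    "gearyStreetFinancialReport",
    "divisaderoStreetFinancialReport",
    "gearyStreetLease",
]

print(resolve("geary report", files))
print(resolve("divisadero", files))
```

Saying "geary report" is enough to pick out one file here, and "divisadero" alone picks out the other. The wider the context the system can search, the more this looks like tab completion that doesn't care which directory you're in.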