They don't give much information on what "minimal" battery drain means. I'm skeptical. Keeping an app running in the background with a constant stream of audio piped into it for processing on the CPU is not cheap. Google has a dedicated DSP on phones to do hotword detection (among other things), and IIRC that's not exposed to unprivileged apps. Hell, even iPhones needed to be charging to get "Hey Siri" support (not sure about now; it was like this in previous versions, though).
Either way, it doesn't sound like that's what the article describes: they're talking about collecting and sending all audio wholesale. Sending that much audio data over 3G or LTE would be expensive (transcoding it on the phone to shrink the payload would cost CPU and battery, too), and it would surely be noticeable in the data usage charts.
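To put rough numbers on "that much audio data", here's a back-of-the-envelope sketch. The bitrates are my own assumptions, not figures from the article: 16 kHz, 16-bit mono PCM for the uncompressed case, and a hypothetical 16 kbps speech codec for the compressed case. `upload_bytes` is just an illustrative helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Assumed bitrates (not from the article), in bits per second.
RAW_PCM_BPS = 16_000 * 16   # 16 kHz sample rate, 16-bit mono, uncompressed
COMPRESSED_BPS = 16_000     # hypothetical low-bitrate speech codec


def upload_bytes(bitrate_bps: int, seconds: int) -> int:
    """Bytes needed to send `seconds` of audio at `bitrate_bps`."""
    return bitrate_bps * seconds // 8


for label, bps in [("raw PCM", RAW_PCM_BPS), ("compressed", COMPRESSED_BPS)]:
    per_day = upload_bytes(bps, SECONDS_PER_DAY)
    per_month = per_day * 30
    print(f"{label}: {per_day / 1e6:.1f} MB/day, {per_month / 1e9:.2f} GB/month")
```

Under those assumptions, always-on capture comes to roughly 2.8 GB/day uncompressed, or about 173 MB/day (around 5 GB/month) even with aggressive compression. Real numbers would depend on the codec and on whether silence is skipped, but either way it's hard to hide on a metered plan.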
> using wi-fi, there was no data plan spike
Uh, yeah. Because it's using wifi, which doesn't count against the data plan. Phones are on wifi far less often than you'd imagine.
It's certainly possible, but it's just not plausible.