For starters, the integrated microphones in devices like Humane's pin don't face the user. Recording things without the user knowing would be difficult, if you could even manage a good recording. Then you have to transmit the data, which would almost certainly add another layer of compression. Even
if you manage to get a lossless file back on the server side, it's going to be noisy and imperfect data that wouldn't make for clear training material. Unless you're training a denoiser (at which point there are
much better approaches), that sort of noisy data probably isn't good for much. Never mind the cost of human-assisted labeling...
I Am Not Georgi Gerganov; I cannot denounce entire AI concepts with a single refutation. But logically, I think stealing that data would be kinda pointless. Not to say it's impossible, but at scale I'm not sure why you'd implement it.