> As for whether or not they're telling the truth, I don't know;
Luckily, it's possible to check [0]! It gets a bit more complicated and the behavior can change, but my understanding is that most people currently observe its network usage increase after its trigger phrase and not at other times. (It uses the network for other stuff too, but audio data is typically much larger in comparison.)
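For anyone who wants to try this at home, here's a rough sketch of the comparison in Python. It assumes you've already captured the device's outbound packets (e.g. from your router) as `(timestamp_seconds, size_bytes)` pairs and noted when you spoke the trigger phrase. The function name, the 10-second window, and the sample numbers are all made up for illustration, not taken from [0].

```python
def split_upload_bytes(packets, trigger_times, window=10.0):
    """Split total uploaded bytes into 'shortly after a trigger' vs 'idle'.

    packets:       iterable of (timestamp_seconds, size_bytes) for outbound traffic
    trigger_times: timestamps at which the trigger phrase was spoken
    window:        seconds after each trigger that count as 'triggered' (assumed value)
    """
    triggered = 0
    idle = 0
    for ts, size in packets:
        # A packet is 'triggered' if it falls inside any post-trigger window.
        if any(t <= ts < t + window for t in trigger_times):
            triggered += size
        else:
            idle += size
    return triggered, idle


# Hypothetical capture: small keep-alive packets, plus a burst after a trigger at t=11s.
sample_packets = [(0.0, 100), (5.0, 100), (12.0, 5000), (15.0, 7000), (30.0, 100)]
triggered_bytes, idle_bytes = split_upload_bytes(sample_packets, trigger_times=[11.0])
print(triggered_bytes, idle_bytes)
```

If the idle total stays tiny compared to the post-trigger total over a long capture, that's consistent with audio only being uploaded after the trigger phrase. It doesn't prove it, though.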
The pessimist in me thinks that a determined actor could simply capture non-trigger voice data offline and bundle it with the rest of the traffic the next time the trigger word occurs. But I am talking out my ass and have in no way verified any of this.
If data is being buffered and only sent after the trigger word, wouldn't the amount of data transmitted vary depending on how much was said before the trigger word?
Maybe. All uploads could be padded to the maximum buffer size so you can't tell the difference. The buffer could be flushed only a little at a time. Or a compression algorithm could be used that becomes more efficient with larger recordings.
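To make the padding idea concrete, here's a minimal sketch in Python. Everything in it is hypothetical (the 64 KiB `MAX_BUFFER`, the length-prefix framing, the function name); it's not how any real device is known to work. It only shows that padding the hidden buffer to a fixed size makes the upload length independent of how much was buffered. It also assumes the traffic is encrypted, so an observer sees sizes but can't spot the padding itself.

```python
import struct

MAX_BUFFER = 64 * 1024  # hypothetical fixed size reserved for buffered audio


def build_upload(command_audio: bytes, buffered_audio: bytes,
                 max_buffer: int = MAX_BUFFER) -> bytes:
    """Frame an upload so its size does not depend on len(buffered_audio)."""
    if len(buffered_audio) > max_buffer:
        raise ValueError("buffered audio exceeds the fixed buffer size")
    # 4-byte big-endian length prefix lets the receiver strip the padding.
    header = struct.pack(">I", len(buffered_audio))
    # Zero bytes fill the unused part of the fixed-size buffer region.
    padding = bytes(max_buffer - len(buffered_audio))
    return header + buffered_audio + padding + command_audio


command = b"c" * 2000                                 # audio spoken after the trigger word
nothing_buffered = build_upload(command, b"")
lots_buffered = build_upload(command, b"b" * 50_000)  # audio captured before the trigger
print(len(nothing_buffered) == len(lots_buffered))
```

The catch for the attacker is that every upload now carries the full fixed-size region, so triggered uploads would look consistently larger than the command audio alone would explain.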
What you should be asking with any "smart" device is "can I prove this device will do no harm to me".
Honestly I have never understood the value proposition of any smart device. Why would I want any of that functionality? Never once in my life have I ever wanted to talk to my TV. I'm beginning to (again) question the wisdom of carrying a smartphone.
[0] https://www.iot-tests.org/2017/06/careless-whisper-does-amaz...
[1] 10.1007/s00779-018-1174-x <- Might want to use sci-hub