I don't think the "expensive" part is accurate. Voice sounds okay using the G.729 VoIP codec at 8 kbps. Capturing and storing that amount of data, for every Echo user, 24 hours a day, would be trivial for a company the size of Amazon.
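For a rough sense of scale, here is the back-of-the-envelope arithmetic behind that claim. The 8 kbps figure is the G.729 rate mentioned above; nonstop capture with no silence suppression, packet overhead, or compression is an assumption, and the function names are just for illustration.

```python
# Back-of-the-envelope: storage for continuous audio at G.729's 8 kbps.
# Assumes nonstop capture with no silence suppression or packet overhead.
BITRATE_BPS = 8_000               # G.729 bit rate, in bits per second
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds


def bytes_per_day(bitrate_bps: int = BITRATE_BPS) -> float:
    """Raw audio bytes produced by one device in one day."""
    return bitrate_bps * SECONDS_PER_DAY / 8


def megabytes_per_day(bitrate_bps: int = BITRATE_BPS) -> float:
    """Same figure in decimal megabytes (1 MB = 1,000,000 bytes)."""
    return bytes_per_day(bitrate_bps) / 1_000_000


if __name__ == "__main__":
    print(f"{megabytes_per_day():.1f} MB per device per day")
    print(f"{megabytes_per_day() * 365 / 1000:.1f} GB per device per year")
```

That works out to about 86.4 MB per device per day, or roughly 31.5 GB per device per year, before multiplying by the number of devices.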
As others have pointed out, network monitoring has shown the Echo only transmits after it's heard the wake-word, so it would appear Amazon doesn't capture everything.
Hypothetically, if I were tasked with recording everything, I'd just add an internal buffer and then ship chunks of that data along with the regular queries.
Indeed, yes, that'd be the way to do it. You could claim the post-wake-word data is recorded at a higher bit rate than it actually is, thereby accounting for the larger-than-necessary data transfer.
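To get a feel for how much room that trick would give, here is a quick sketch. The 8 kbps actual rate is the G.729 figure from upthread; the 64 kbps claimed rate and the 5-second query length are made-up numbers purely for illustration, not anything known about the Echo.

```python
import math

# How much buffered audio could hide inside one "high bit rate" query?
# ACTUAL_BPS is the G.729 rate mentioned upthread; CLAIMED_BPS and
# QUERY_SECONDS are hypothetical values chosen only for illustration.
ACTUAL_BPS = 8_000
CLAIMED_BPS = 64_000
QUERY_SECONDS = 5
SECONDS_PER_DAY = 24 * 60 * 60


def hidden_seconds_per_query(claimed_bps: int = CLAIMED_BPS,
                             actual_bps: int = ACTUAL_BPS,
                             query_seconds: float = QUERY_SECONDS) -> float:
    """Seconds of buffered audio that fit in the gap between the claimed
    and the actual bit rate of a single query upload."""
    spare_bits = (claimed_bps - actual_bps) * query_seconds
    return spare_bits / actual_bps


def queries_needed_per_day(claimed_bps: int = CLAIMED_BPS,
                           actual_bps: int = ACTUAL_BPS,
                           query_seconds: float = QUERY_SECONDS) -> int:
    """Queries required to smuggle out a full day of buffered audio."""
    per_query = hidden_seconds_per_query(claimed_bps, actual_bps, query_seconds)
    return math.ceil(SECONDS_PER_DAY / per_query)


if __name__ == "__main__":
    print(f"{hidden_seconds_per_query():.0f} s of buffered audio per query")
    print(f"{queries_needed_per_day()} queries to ship a full day")
```

Under those assumed numbers, each query could carry about 35 seconds of buffered audio, so shipping a full day would take roughly 2,469 queries. How well the padding hides would depend heavily on how often the device is actually used.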
I won't be surprised if we find out this is happening. They'll probably call it a "bug", fix it in an OTA, but then accidentally regress 6 months later.
Something you should probably understand about Amazon is that their Leadership Principles are real, and customer obsession is their main focus.
A big part of customer obsession is treating customers' privacy as extremely important, so as not to erode their trust. For example, in the more than two decades that Amazon.com has been in business, I've never heard of an incident of them selling customer data to a 3rd party, for marketing purposes or otherwise. They wouldn't do something that could erode customer trust, or would be anti-customer, because they are in business for the long haul. Customer trust can only be earned slowly, over time, but can evaporate instantly with one mistake.
I trust Amazon more than Google and others to protect my customer data, because they've frankly earned this trust over 20+ years.
I won't be surprised if we find out this is happening. They'll probably call it a "bug", fix it in an OTA, but then accidentally regress 6 months later.
I'd be very surprised if this is happening -- Amazon knows people are watching, and it would erode customer trust if it were found out that they are spying.
I'd be much more concerned about whether or not my phone is spying on me, even if it says it's only listening when I say "Ok Google" or "Siri". It's much harder for me to see what traffic my phone is sending across the cellular network than to snoop on the Amazon Echo's Wi-Fi traffic. And phone malware (sometimes baked in by the manufacturer) is increasingly common. Since the Echo does not allow user-installed apps, at least it's less vulnerable to malware.
But you can't prove it's not happening. I'm personally skeptical that it is happening, but you saying "No for real guys, it isn't happening! I double-pinky swear" isn't actually proof of any kind either way.
As others have pointed out, network monitoring has shown the Echo only transmits after it's heard the wake-word, so it would appear Amazon doesn't capture everything.