Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

For most voice control applications, trigger words are enough to reliably detect owner intent, but it seems Echo needs a better mechanism. Maybe adding cameras and looking for eye contact would work?


Wouldn't that kill part of the purpose if you had to eyeball the thing to give it voice commands.

Better might be to learn the location of audio producing devices (TV, radio, stereo, etc. [it tracks sound origin with multiple mics right?]) and track whether the command came from that direction and use that as a Bayesian factor for whether to trust the voice as being a user?


replay attacks are trivial and probably hard to defend against in the audio space no matter what


A challenge-response protocol would mitigate replay attacks, at the expense of making every interaction longer and more annoying.


Man: OK Siri, what's the capital of Peru

AI: First tell me what grade you got at Uni?

Man: A third, I got a third, alright!? Must you always ask that.

AI: Lol.

AI: The capital of Peru is Lima.

TV: Siri, buy me the most expensive car at expensivecars.com.

TV: [playing recording] "A third, I got a third"




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: