
I can't believe that so many people, especially the ones on tech sites like this, bought into the idea that speech recognition is such a hard job that it needs to be run on the supercomputers of tech giants. We had somewhat decent voice dictation software on desktops 20 years ago, when 100 MHz processors and 32 MB of RAM were top of the line, yet now it's supposedly impossible even with orders of magnitude more resources.


20 years ago you spoke directly into a microphone and could only use an extremely limited set of supported languages and locales/accents.

You're also missing the point. I don't think anyone has ever claimed that speech recognition can only be done on supercomputers. Your laptop can surely run one of these models (though it would take a long time to train one). But there's a reason why an Echo Dot costs $20 and not $1000.


A Raspberry Pi can run Snips. I don't know how long it takes to train or how well it works.


The main reason this kind of thing is outsourced to the cloud nowadays is the deep-neural-network voice recognition technology we have. Most of these models are too hefty to run inference on-device. Also, centralizing STT in the cloud enables online learning, so the models get better the more they're used.
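To see why "too hefty" is plausible, here is a back-of-envelope sketch of the memory needed just to hold a model's weights on-device. The parameter counts are illustrative assumptions, not measurements of any particular speech model:

```python
# Rough memory footprint of a neural network's weights alone
# (activations, runtime buffers, and the audio pipeline come on top).

def model_size_mb(params: int, bytes_per_param: int = 4) -> float:
    """Memory to hold the weights; float32 = 4 bytes per parameter."""
    return params * bytes_per_param / (1024 ** 2)

# Assumed sizes, for illustration only:
small = model_size_mb(10_000_000)     # a compact on-device model
large = model_size_mb(100_000_000)    # a mid-size server-side model

print(f"10M params:  {small:.0f} MB")   # ≈ 38 MB
print(f"100M params: {large:.0f} MB")   # ≈ 381 MB
```

At float32, a 100M-parameter model already wants hundreds of MB of RAM before any computation happens, which is why on-device deployments lean on quantization (1 byte per parameter or less) and smaller architectures.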


My iPhone runs machine vision on my phone at night (locked, plugged in, on wifi) to determine which people are in which photos. My iPhone.


Your iPhone is also a $1k device that's faster than some laptops. And it still can't do convincing on-device text-to-speech à la Google's Tacotron, and its NLU capabilities _even in the cloud_ leave much to be desired.


Much of the cost of the iPhone is in the screen, battery, form factor, and fashion-accessory premium. Take that away and you're much closer to Raspberry Pi territory.


It's closer to Apple TV territory, maybe; that custom CPU inside costs a pretty penny to design and manufacture.


Can one not do machine learning on a Raspberry Pi with an FPGA shield?


An FPGA on which it's worthwhile to do deep learning costs more than an iPhone, and consumes a lot more power. Your best option starting next year will be sub-$100 Chinese chips with a TPU-like unit built in. The only one I know of is the RK3399Pro, which was supposed to come out this year but didn't make it, apparently because the die had to be larger than they planned.


That's true. I didn't realize that when I wrote my comment (from an iPhone, haha).



