Hacker News

Mostly this, though it's not so black-and-white. The paper discusses results from a DNN-HMM system (Maas et al., using Kaldi) trained on 2,000 hours, and that system does show a small generalization improvement over training on 300 hours.

Much of the excitement about deep learning -- which we see as well in DeepSpeech -- is that these models continue to improve as we provide more training data. It's not obvious a priori that results will keep getting better after thousands of hours of speech. We're excited to keep advancing that frontier.



That was an even weirder comparison. They compared a system trained on 2,000 hours of acoustic data mismatched with the test data against their own system, which was trained on 300 hours of matched data on top of the same 2,000 hours of mismatched acoustic data.




