Hacker News

Yes, probably the only way. I'm catching up on Neural Nets to try and do the same. Now just need Stephen Fry to lend me his voice.


Do you really need AI though? I am using a simple branching tree structure for commands and queries I know I want, and since it's for my use I already know those commands, and they tend to match my conversational style to begin with.
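As a rough illustration of that kind of branching command tree (a sketch only: the command words and handlers below are made up for the example, and this is Python rather than the Visual Basic the project is actually written in):

```python
# Hypothetical command tree: nested dicts keyed by word.
# A leaf is a handler that receives whatever words are left over.
COMMANDS = {
    "lights": {
        "on": lambda rest: "lights on",
        "off": lambda rest: "lights off",
    },
    "play": {
        "music": lambda rest: "playing " + (" ".join(rest) or "anything"),
    },
}


def dispatch(utterance, tree=COMMANDS):
    """Walk the tree one word at a time; return the handler's reply,
    or None if the utterance does not match a known command."""
    words = utterance.lower().split()
    node = tree
    for i, word in enumerate(words):
        if not isinstance(node, dict) or word not in node:
            return None  # unknown branch
        node = node[word]
        if callable(node):
            return node(words[i + 1:])  # pass the remaining words along
    return None  # ran out of words before reaching a handler


if __name__ == "__main__":
    print(dispatch("lights on"))
    print(dispatch("play music jazz trio"))
```

Since the set of commands is known up front, adding one is just another key in the dict, and anything that doesn't match falls through to None instead of being guessed at.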

For outside knowledge queries you can't anticipate in advance, there's good reason to hand those rare requests off to the Internet. Just do it intelligently: require a prefix instruction for any outside request.

For instance, I went ahead and implemented Wolfram's API for knowledge queries. They have a great "spoken answer" endpoint, which replies with a string meant to be piped straight to speech output. So I say "ask wolfram how tall abraham lincoln was", my program hands everything AFTER "ask wolfram" to the Wolfram API, and Wolfram's API gives me back a string answering exactly what I asked.
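The prefix routing could look something like the sketch below. It is an assumption-heavy illustration, not the actual program: the endpoint URL and the `appid` / `i` parameter names are my understanding of Wolfram's Spoken Results API and should be checked against their docs, and `handle`, `build_url`, and `http_get` are hypothetical names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and parameter names; verify against Wolfram's documentation.
SPOKEN_URL = "https://api.wolframalpha.com/v1/spoken"
PREFIX = "ask wolfram "


def build_url(query, app_id):
    """Build the request URL for a plain-text query."""
    return SPOKEN_URL + "?" + urlencode({"appid": app_id, "i": query})


def http_get(url):
    """Fetch a URL and return the body as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def handle(utterance, app_id, fetch=http_get):
    """Go out to the network only when the utterance carries the prefix.
    Returns the reply string, or None if the request should stay local."""
    text = utterance.strip()
    if not text.lower().startswith(PREFIX):
        return None  # no prefix, no outside request
    query = text[len(PREFIX):].strip()
    if not query:
        return None
    return fetch(build_url(query, app_id))
```

Passing `fetch` in as a parameter keeps the one place that touches the network explicit, and lets you swap in a stub while testing so nothing leaves the machine by accident.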

Now sure, I'm not entirely offline at that point, but everything regarding my personal data, home automation devices, etc. is under my control, and any time I reach out, it's specifically using a command authorizing it to do so.

Of course, caveats before you think my project sounds impressive: A. It's written in Visual Basic. B. Speech recognition isn't working (yet).


If you are interested in speech recognition, Mozilla just released a big open corpus of speech data and some trained models.

Visual Basic is a fine programming language with very weird syntax. Hell, Microsoft has put out .NET bindings for CNTK so you could even use that.


The old Microsoft Speech API would be a good fit here. I miss it. Back in 2007 I made myself a voice control interface for changing music playback. Trainable, completely offline. Worked like a charm.



