
All these speech assistants are almost stateless, aren't they? I mean, they can ask you questions and go into a listen-for-an-answer state, but, at least last time I tried, you can't have a conversation about your conversation with them and improve their comprehension of anything. It's like talking to a command line, or those old Palm Pilots you had to learn a special alphabet to scribble on.

It's quite interesting to hear very small children talk to voice assistants. Probably not surprising, but it seems like in normally-learnt human communication you expect your conversational partner to remember the context of what you were saying, and carefully forming canned commands is a separate learned skill. It suggests these voice assistants have still got a way to go, and it seems more like a paradigm change, a big leap, than just incremental improvements.



When I made a meeting room reservation chatbot, the hardest part was managing the conversational context without the user getting stuck.

In the end I had a “draft booking” and a conversation loop, where the bot would repeatedly ask the user to fill in missing parts (e.g. number of participants), then give a summary and an opportunity to correct things. It was hard to do, and definitely required a lot of contextual understanding of how people book meeting rooms. That approach doesn’t scale up well.

I think the basic problem is being stuck in a local optimum. The scripted bot approach doesn’t scale to complex conversations, and you need to start from scratch to do better.
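
For anyone curious, the "draft booking" loop described above could look roughly like this. It's only a minimal sketch, not the actual bot: the field names (`room`, `start_time`, `participants`), the `ask` callback, and the "ok / field name" correction step are all assumptions made up for illustration.

```python
from dataclasses import dataclass, fields
from typing import Callable, Optional


@dataclass
class DraftBooking:
    # Hypothetical slots; a real bot would have its own set.
    room: Optional[str] = None
    start_time: Optional[str] = None
    participants: Optional[int] = None

    def missing(self) -> list[str]:
        """Names of the slots that have not been filled yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


def set_field(draft: DraftBooking, name: str, raw: str) -> None:
    """Parse the raw answer and store it; raises ValueError on a bad number."""
    value = int(raw) if name == "participants" else raw.strip()
    setattr(draft, name, value)


def run_booking(ask: Callable[[str], str]) -> DraftBooking:
    """Ask for missing slots, show a summary, and allow corrections."""
    draft = DraftBooking()
    slot_names = {f.name for f in fields(draft)}
    while True:
        # Keep asking until every slot is filled.
        while draft.missing():
            name = draft.missing()[0]
            try:
                set_field(draft, name, ask(f"What is the {name}? "))
            except ValueError:
                pass  # could not parse the answer, so the slot is asked again
        summary = ", ".join(f"{n}={getattr(draft, n)}" for n in sorted(slot_names))
        reply = ask(f"Booking: {summary}. Type 'ok' or a field to change: ").strip()
        if reply == "ok":
            return draft
        if reply in slot_names:
            setattr(draft, reply, None)  # clear the slot so the loop re-asks it
```

Calling `run_booking(input)` runs it in a terminal. Every new slot or correction path has to be hand-written, which is exactly the part that doesn't scale.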


Ahh, the good old "conversation is a state machine" pitfall. Even linguists I work with do that sometimes. I guess it's how the simple models that we're taught with work.

Wanna have the simplest parser? Finite State Automaton to the rescue! So people automatically assume that the simplest approach to conversation is also something like a finite state machine.

Here's the thing. The only reasonable FSA would be a clique.

You can always move between nodes.

A much more feasible approach is the "actions competing for relevance" one, where you have global state manipulated by actions, and all the actions generate an "applicability score" for the given user input. The system then chooses the most appropriate action, and it does its thing. On the next user input the cycle repeats.
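
A toy sketch of that cycle, to make it concrete. Everything here is hypothetical: the `Action` type, the four example actions, and the hand-picked scores stand in for whatever scoring a real system would use (a classifier, heuristics, and so on).

```python
from dataclasses import dataclass
from typing import Callable

State = dict  # global conversation state shared by all actions


@dataclass
class Action:
    name: str
    score: Callable[[str, State], float]  # applicability for this input
    run: Callable[[str, State], str]      # may mutate state, returns a reply


def score_set_timer(text: str, state: State) -> float:
    t = text.lower()
    return 0.9 if "timer" in t and "cancel" not in t else 0.0


def run_set_timer(text: str, state: State) -> str:
    state["timer_active"] = True
    return "Timer set."


def score_cancel(text: str, state: State) -> float:
    if "cancel" not in text.lower():
        return 0.0
    # Context matters: cancelling only scores well if a timer exists.
    return 0.8 if state.get("timer_active") else 0.1


def run_cancel(text: str, state: State) -> str:
    state["timer_active"] = False
    return "Timer cancelled."


ACTIONS = [
    Action("set_timer", score_set_timer, run_set_timer),
    Action("cancel", score_cancel, run_cancel),
    Action("thanks", lambda t, s: 0.7 if "thank" in t.lower() else 0.0,
           lambda t, s: "You're welcome."),
    # Fallback always scores a little, so it wins when nothing else applies.
    Action("fallback", lambda t, s: 0.2, lambda t, s: "Sorry, I didn't get that."),
]


def respond(text: str, state: State, actions: list[Action] = ACTIONS) -> str:
    """One cycle: score every action, run the most applicable one."""
    best = max(actions, key=lambda a: a.score(text, state))
    return best.run(text, state)
```

There are no transitions to wire up: any action can fire at any turn, and the same input ("cancel that") gets a different response depending on what earlier actions left in the state.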


Honestly, I can forgive the lack of context awareness. That's a hard problem. I have issues even getting consistent responses to the same query over time (even back to back in some cases). Sometimes Siri will misunderstand me and fail to do the thing, but then I look at the text that's transcribed... and it's correct (i.e. the backend was replying to a different transcription than the one I saw on the frontend).

I've just been trained to not bother. Unless I'm setting a timer, I just don't try anymore.


> you can't have a conversation about your conversation with them and improve their comprehension of anything. It's like talking to a command line, or those old Palm Pilots you had to learn a special alphabet to scribble on.

Which is why I have no confidence calling it AI if it's not even intelligent. It's just voice recognition on preprogrammed operations.


>Which is why I have no confidence calling it AI...

That's because it really should be called Simulated Intelligence, which would be a much more accurate description. The marketing team wouldn't like this though.


Are they even simulating intelligence? To the parent poster's point, it's really just simulated voice recognition. No intelligence even gets simulated. This is more like running Cucumber scripts based on voice recognition.


It's an artificial idiot, plain and simple.


your average idiot is intelligent. your average digital assistant can barely assist and has no fingers.


yeah, that's what I'm wondering about - is it going to be hard to implement things like this until LSTMs with better "memory" get good enough for consumer use?


You can't even have the "thank you", "you're welcome" part of the dialogue.



