All these speech assistants are almost stateless, aren't they? I mean, they can ...

Joeri · on May 23, 2020

When I made a meeting room reservation chatbot, the hardest part was managing a conversational context without the user getting stuck.

In the end I had a “draft booking” and a conversation loop, where the bot would repeatedly ask to fill in missing parts (e.g. number of participants) and then give you a summary and an opportunity to correct things. It was hard to do, and definitely required a lot of contextual understanding of how people book meeting rooms. That approach doesn’t scale up well.

I think the basic problem is being stuck in a local optimum. The scripted bot approach doesn’t scale to complex conversations, and you need to start from scratch to do better.
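For readers who haven't built one, the "draft booking" loop described above can be sketched roughly as below. This is only an illustration under assumptions: the `DraftBooking` class, its three slots, and the helper names are hypothetical and not taken from the actual bot.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DraftBooking:
    # Hypothetical slots; a real booking bot would likely need more.
    room: Optional[str] = None
    start_time: Optional[str] = None
    participants: Optional[int] = None

    def missing(self) -> List[str]:
        """Names of the slots that have not been filled in yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


def next_prompt(draft: DraftBooking) -> str:
    """Ask for the first missing slot, or summarize once the draft is complete."""
    missing = draft.missing()
    if missing:
        return f"What is the {missing[0].replace('_', ' ')}?"
    return (
        f"Booking {draft.room} at {draft.start_time} "
        f"for {draft.participants} people. Confirm, or tell me what to change."
    )


def apply_answer(draft: DraftBooking, slot: str, value: object) -> None:
    """Fill in or correct a slot; corrections simply overwrite the old value."""
    if slot not in {f.name for f in fields(draft)}:
        raise KeyError(f"unknown slot: {slot}")
    setattr(draft, slot, value)
```

The hard part is everything this sketch leaves out: working out which slot a free-form answer refers to, and letting the user jump around instead of answering in order.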

j-pb · on May 23, 2020

Ahh, the good old "conversation is a state machine" pitfall. Even linguists I work with do that sometimes; I guess it's how the simple models that we're taught with work.

Wanna have the simplest parser? Finite State Automaton to the rescue! So people automatically assume that the simplest approach to conversation is also something like a finite state machine.

Here's the thing. The only reasonable FSA would be a clique.

You can always move between nodes.

A much more feasible approach is the "actions competing for relevance" one, where you have global state manipulated by actions, and all the actions generate an "applicability score" for the given user input. The system then chooses the most appropriate action, and it does its thing. On the next user input the cycle repeats.
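A minimal sketch of that cycle, assuming a plain dict as the global state and naive keyword matching as the scoring function. The `Action` class, `keyword_score`, and the three example actions are hypothetical names made up for illustration; a real system would score with something far better than keywords.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Action:
    name: str
    score: Callable[[str, State], float]  # applicability of this action to the input
    run: Callable[[str, State], str]      # may mutate the global state, returns a reply


def keyword_score(*keywords: str) -> Callable[[str, State], float]:
    """Toy scorer (an assumption): counts which keywords appear in the input."""
    def score(text: str, state: State) -> float:
        words = text.lower().split()
        return sum(1.0 for k in keywords if k in words)
    return score


def set_timer(text: str, state: State) -> str:
    state["timer_set"] = True
    return "Timer set."


def acknowledge_thanks(text: str, state: State) -> str:
    return "You're welcome."


def fallback(text: str, state: State) -> str:
    return "Sorry, I didn't get that."


ACTIONS: List[Action] = [
    Action("set_timer", keyword_score("timer"), set_timer),
    Action("thanks", keyword_score("thanks", "thank"), acknowledge_thanks),
    # Low constant score, so it only wins when nothing else applies.
    Action("fallback", lambda text, state: 0.1, fallback),
]


def step(actions: List[Action], text: str, state: State) -> str:
    """One cycle: every action scores the input, the best one runs."""
    best = max(actions, key=lambda a: a.score(text, state))
    return best.run(text, state)
```

Because every action is scored on every input, there are no fixed transitions to get stuck in; any action can follow any other, which is the "clique" point made above.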

cmckn · on May 23, 2020

Honestly, I can forgive the lack of context awareness. That's a hard problem. I have issues even getting consistent responses to the same query over time (even back to back in some cases). Sometimes, Siri will misunderstand me and fail to do the thing, but then I look at the text that's transcribed... and it's correct (i.e. the backend was replying to a different transcription than I saw on the frontend).

I've just been trained to not bother. Unless I'm setting a timer, I just don't try anymore.

giancarlostoro · on May 22, 2020

> you can't have a conversation about your conversation with them and improve their comprehension of anything. It's like talking to a command line, or those old Palm Pilots you had to learn a special alphabet to scribble on.

Which is why I have no confidence calling it AI if it's not even intelligent. It's just voice recognition on preprogrammed operations.

ratsmack · on May 22, 2020

>Which is why I have no confidence calling it AI...

That's because it really should be called Simulated Intelligence, which would be a much more accurate description. The marketing team wouldn't like this though.

ajmurmann · on May 22, 2020

Are they even simulating intelligence? To the parent poster's point, it's really just simulated voice recognition. No intelligence even gets simulated. This is more like running Cucumber scripts based on voice recognition.

swinglock · on May 22, 2020

It's an artificial idiot, plain and simple.

ferzul · on May 23, 2020

your average idiot is intelligent. your average digital assistant can barely assist and has no fingers.

caycep · on May 22, 2020

yeah, that's what I'm wondering about - is it going to be hard to implement things like this until LSTMs with better "memory" get good enough for consumer use?

ashildr · on May 23, 2020

You can't even have the "thank you" / "you're welcome" part of the dialogue.