
I signed up for OpenAI's ChatGPT tool and entered a query like 'What does the notation 1e100 mean?' (just to try it out). When it displayed the output, it started writing the reply slowly, like it was being drip-fed to me, and I thought: 'What? Surely this could be faster?'

Maybe I'm missing something crucial here, but why does it drip-feed answers like this? Does it have to think really hard about the meaning of 1e100? Why can't it just spit the answer out instantly, without such a delay/drip, like the near-instant Wolfram Alpha?



Under the hood, GPT works by predicting the next token given an input sequence of tokens. At each step a single token is generated, taking into consideration all the previous tokens.

https://ai.stackexchange.com/questions/38923/why-does-chatgp...
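
To make that concrete, here's a minimal sketch of that loop (my own illustration, using the small open gpt2 model from Hugging Face transformers as a stand-in for the much larger model behind ChatGPT): one forward pass over everything generated so far, pick the next token, append it, repeat.

    # Token-by-token (autoregressive) generation; gpt2 is a small
    # stand-in for the model behind ChatGPT.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    ids = tokenizer("What does the notation 1e100 mean?", return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(20):                    # 20 tokens, one at a time
            logits = model(ids).logits         # forward pass over ALL tokens so far
            next_id = logits[0, -1].argmax()   # greedy pick of the next token
            ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
            print(tokenizer.decode(int(next_id)), end="", flush=True)

Each iteration is a full forward pass through the network, which is why the text appears a piece at a time.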


The non-technical way to think about it is that ChatGPT “thinks out loud” and can only “think out loud”.

Future products may be able to hide some of that, but for now, that's what the ChatGPT / Bing Assistant products do.


You can wait for the whole reply, but it'll take longer before you see anything. So one way to give faster answers is to stream the response as it is generated. And in GPT-based apps the response is generated token by token (a token is ~4 characters), hence what you're seeing.
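
Here's a hedged sketch of what that looks like on the consumer side with OpenAI's Python client (the model name and prompt are just placeholders):

    # Stream tokens as they are generated instead of waiting for the
    # complete reply; assumes OPENAI_API_KEY is set in the environment.
    from openai import OpenAI

    client = OpenAI()
    stream = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "What does the notation 1e100 mean?"}],
        stream=True,  # deliver the response chunk by chunk
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)

Total generation time is the same either way; streaming just gets the first characters in front of you immediately.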


Because it needs to do billions of arithmetic operations for every token of the reply. Replying to questions is not an easy task.
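
Back-of-envelope, using my own assumptions rather than any official numbers: a dense transformer forward pass costs roughly 2 FLOPs per parameter per generated token, so a GPT-3-scale model would be doing hundreds of billions of operations per token.

    # Rough cost estimate (assumptions: GPT-3-scale parameter count,
    # ~2 FLOPs per parameter per token for a dense forward pass).
    params = 175e9                  # assumed parameter count
    flops_per_token = 2 * params    # ~3.5e11 operations per token
    tokens = 100                    # a short reply
    print(f"{flops_per_token * tokens:.1e} FLOPs")  # ~3.5e+13 for 100 tokens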


It's a result of how these transformer models work. It's pretty quick for the amount of work it does, but it's not looking anything up; it's generating the reply a token at a time.
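
If you're curious what those tokens actually look like, the open tiktoken library exposes the ChatGPT-era encoding (cl100k_base); this is just an illustration:

    # Split the question from the original post into tokens and show
    # the text each token covers (roughly word-sized chunks).
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    ids = enc.encode("What does the notation 1e100 mean?")
    print(len(ids), [enc.decode([i]) for i in ids])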



