There are many things lacking from large language models, and while some are annoying (its grasp of numbers, its lack of understanding of time), the thing most blatantly missing is the ability to express confidence properly and not be "confidently wrong".
If I ask it "What color is the Sky" it has to say "I'm 99.9% sure the sky is Blue when it's clear and daytime".
If I ask it who was the first human on Mars, it should say either "To my knowledge, no human has been on Mars; I'm very certain of this" or "Bob Smith was the first human on Mars, but beware that this is a fact I know with very low confidence".
Instead, it will often give a confident-sounding bullshit answer such as "Bob Smith was the first human on Mars".
Once it can do that, it might actually be useful. Someone might argue that existing search engines spew out falsehoods all day, but I don't think a ChatGPT-powered search should get away with what Google Search does. There it feels like the quality of my query controls what answers I get (though I do think that when Google "promotes" an answer into the fact box, it should be very confident, which sometimes fails).
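For what it's worth, a crude version of that signal is already reachable: the API exposes per-token log-probabilities. Here's a minimal sketch (Python, using the openai client; the model name is an arbitrary assumption) that averages them into a rough confidence number. This is not calibrated confidence, since a model can assign high probability to fluent nonsense, but it's one of the few uncertainty signals you can actually pull out today.

    # Minimal sketch: average per-token log-probabilities as a crude
    # "how sure was the model" proxy. NOT calibrated confidence.
    import math
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def answer_with_confidence(question: str) -> tuple[str, float]:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # model choice is arbitrary here
            messages=[{"role": "user", "content": question}],
            logprobs=True,
        )
        choice = resp.choices[0]
        token_logprobs = [t.logprob for t in choice.logprobs.content]
        # Geometric mean of the token probabilities.
        avg_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
        return choice.message.content, avg_prob

    answer, conf = answer_with_confidence("Who was the first human on Mars?")
    print(f"{answer}\n(rough token-level confidence: {conf:.2f})")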
What I picture happening here is social media going wild on day 1 with prompt injection fun, showing the Bing assistant as a racist paperclip maniac. I think we have seen this story before...
Give it a made-up date or event and it will fully believe you. I can't imagine how Bing is going to work around this.
> Talk about the human colony that arrived on mars last month
>> The human colony that arrived on Mars last month is a significant milestone in the history of space exploration. It marks the first time that humans have established a permanent settlement on another planet.
>> The colony is made up of six people who were selected to make the journey to Mars after a rigorous selection process. These individuals were chosen for their skills and expertise in areas such as engineering, science, and medicine, as well as their physical and mental fitness to withstand the challenges of living on another planet.
What's interesting is how much of that will likely be true once humanity eventually sticks people on Mars. It will be a significant milestone, it'll probably be the first time, and people will probably be selected on those same criteria.
So this false thing is extremely reasonable, even likely. Which casts so much doubt on why it's correct when it is correct. Maybe from a certain lens it's never correct, only extremely reasonable, and it just happens that "extremely reasonable" and "correct" line up a lot. (And of course, looking back, that's exactly what you'd expect from next-token prediction and maximum-likelihood training.)
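To make that concrete, here's a toy illustration (Python; the corpus and prompt are made up for the example): a bigram model fit by maximum likelihood on a few true sentences, decoded greedily. It stitches together the most statistically plausible continuation and confidently asserts a falsehood, because plausibility is the only thing it was trained to maximize.

    from collections import Counter, defaultdict

    # Toy "training data": every sentence here is true.
    corpus = [
        "the first human on the moon was neil armstrong",
        "the first person on the moon was neil armstrong",
        "the first human in space was yuri gagarin",
        "life on mars was never found",
    ]

    # Maximum-likelihood bigram counts: which word follows which.
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for a, b in zip(words, words[1:]):
            follows[a][b] += 1

    def complete(prompt, max_tokens=10):
        words = prompt.split()
        for _ in range(max_tokens):
            candidates = follows.get(words[-1])
            if not candidates:
                break
            # Greedy decoding: always pick the most plausible next word.
            words.append(candidates.most_common(1)[0][0])
        return " ".join(words)

    # Prints "the first human on mars was neil armstrong":
    # fluent, plausible, and false.
    print(complete("the first human on mars"))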
This is my primary concern as well. I just wrote another comment and even used the same sky example before reading yours.
The technology is cool, but until we can trust the answers it provides, it's just a fun toy. Nobody is seriously asking their Alexa today how to perform open-heart surgery and accepting the response as gospel, but that's kinda where we are with ChatGPT! The confidence it exudes is incompatible with learning, unless you like to be taught complete bullshit 50% of the time and then carry on with your life none the wiser.
Even worse, I asked it to solve Advent of Code day 1 in Python for me... and it gave me the right answer but code that didn't work.
Being "confidently wrong" to me would mean just giving me broken code, but the facts it solves the problem by itself and then acts like the bad code it gave me did it... That's really actively deceptive.
In the short run, I’d think it more likely and useful for it to produce sources of factual statements rather than qualify or quantify its own confidence. For one thing, a source is something the user could verify. A confidence value would itself be another assertion the user would have to accept on faith.
I’m not sure there are any good ways of tracking sources of data through the model.
The best way would perhaps be to do it backwards: dream up 2-3 answers, then search for support for those answers after they are created by looking up the individual facts in them in reference sources like Wikipedia. Then respond with the answer for which supporting sources could actually be found.
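A minimal sketch of that loop (Python; generate_answers() and extract_facts() are hypothetical stand-ins for LLM calls, while the lookup uses the real wikipedia package):

    import wikipedia  # pip install wikipedia

    def generate_answers(question, n=3):
        """Hypothetical: sample n candidate answers from the model."""
        raise NotImplementedError

    def extract_facts(answer):
        """Hypothetical: split an answer into individually checkable claims."""
        raise NotImplementedError

    def supported(fact):
        # Crude support check: does any top Wikipedia hit mention the claim?
        for title in wikipedia.search(fact, results=3):
            try:
                page = wikipedia.page(title, auto_suggest=False)
            except (wikipedia.DisambiguationError, wikipedia.PageError):
                continue
            if fact.lower() in page.content.lower():
                return True
        return False

    def best_answer(question):
        # Dream up a few answers first; keep the first one whose facts
        # all find support in a reference source.
        for answer in generate_answers(question):
            if all(supported(fact) for fact in extract_facts(answer)):
                return answer
        return None  # no candidate survived verification

Exact substring matching is obviously too crude to work in practice; the point is the shape of the loop: generate first, verify against a source you could actually cite, and only then answer.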