ChatGPT is just your average Reddit user. Even when it's wrong, it's confidently...

BbzzbB · on Jan 4, 2023

Perhaps because it is trained on Redditors and co.

_z2co · on Jan 4, 2023

I know this is a joke, but I think it's important to recognize that it's because the ChatGPT language model does not have the ability to introspect and decide how accurate its knowledge is in a given domain. No amount of training on new input data can ensure it provides accurate responses.

je42 · on Jan 5, 2023

That applies to humans as well.

circuit10 · on Jan 5, 2023

No, it doesn't. Humans can recognise when they don't know something; current language models usually can't (yet).

Their training objective, which is to predict the next piece of text in their training data, does not incentivise them to respond that they don't know something, as there is no relation in the training data between the AI not knowing something and the correct next text being "I don't know" or similar.
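
To make that concrete, here's a rough toy sketch of a next-token loss for a single prediction step. Everything in it is made up for illustration: the prompt, the three candidate continuations, the probabilities, and the simplification of treating "I don't know" as a single token. It's not how any particular model is actually trained, just the general shape of a cross-entropy objective.

```python
import math

def next_token_loss(predicted_probs, actual_next_token):
    """Cross-entropy for one step: -log of the probability given to the real next token."""
    return -math.log(predicted_probs[actual_next_token])

# Hypothetical distributions over the text following "The capital of Australia is"
candidates = {
    "confident_right": {"Canberra": 0.90, "Sydney": 0.05, "I don't know": 0.05},
    "confident_wrong": {"Canberra": 0.05, "Sydney": 0.90, "I don't know": 0.05},
    "hedging":         {"Canberra": 0.05, "Sydney": 0.05, "I don't know": 0.90},
}

# What the (hypothetical) training text actually says next
actual = "Canberra"

losses = {name: next_token_loss(probs, actual) for name, probs in candidates.items()}

for name, loss in losses.items():
    print(f"{name}: {loss:.3f}")
```

In this toy setup the hedging distribution gets exactly the same loss as the confidently wrong one, because the loss only looks at how much probability went to the text that actually came next. Saying "I don't know" is only rewarded if that's what the training text happens to say.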

z3c0 · on Jan 4, 2023

I'd sure hope not. Reddit comments are a masterclass in disguising ethos and pathos as logos.

I'd expect that the boring reality is that it's trained on highly ethos/logos text (academic works) and thus always presents itself as such, even when its weights cause an invalid assertion.

robswc · on Jan 4, 2023

Reddit is exhausting. One big feedback loop. People will say anything to get good karma, or avoid saying certain things to avoid being downvoted. If there is even just a slight majority in the way the group thinks, it will soon become the dominant opinion.

For example, there was a voice actor that lied about being paid a pitiful sum of money for a gig. Everyone took her side initially (as one should _if_ it were true) but the people saying "well, this just seems odd" were being more or less attacked and told their opinions were awful.

The quality of discussions I have on HN and niche forums is 100x better than on Reddit.

colejohnson66 · on Jan 5, 2023

TBF, the same can happen here to a lesser extent. False or misleading stories blow up quickly because “$BIGCORP bad”.

BbzzbB · on Jan 5, 2023

It's trained on Twitter data so I assume Reddit data as well.

Honestly feels like they're both pretty important datasets to ingest if trying to build a model on human speech. I reckon social media, comment sections and co have the most natural human conversational text online.