Hacker News

The weirdness of LLMs is that they're so damn good at so many things, but then you see glaring gaps that instantly make them seem dumb. We desperately need benchmarks and evals that test these kinds of hard-to-pin-down cognitive abilities.


Absolutely. This is not a new observation, but another thing they struggle with is self-reporting calibrated confidence. When I've asked LLMs to classify/tag things along with a confidence score, the number seems essentially random, with no connection to the quality or difficulty of the classification.
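If you want to quantify that mismatch rather than eyeball it, one common approach is a binned reliability check (expected calibration error). A minimal sketch, with entirely hypothetical confidence/correctness data standing in for a real tagging run:

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """ECE: bin predictions by reported confidence, then take the
    size-weighted average of |accuracy - mean confidence| per bin.
    A well-calibrated model scores near 0."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # clamp conf=1.0 into last bin
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += (len(b) / total) * abs(accuracy - avg_conf)
    return ece

# Hypothetical run: the model reported ~0.9 confidence on everything
# but was only right 3 times out of 5 -- ECE comes out around 0.3.
confs = [0.9, 0.95, 0.9, 0.85, 0.9]
right = [1, 0, 1, 0, 1]
print(expected_calibration_error(confs, right))
```

Plotting per-bin accuracy against mean confidence (a reliability diagram) makes the same point visually: self-reported LLM confidences tend to cluster high regardless of how often the answer is actually correct.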




