
I have been using GPT-4 daily for coding for probably six months, and I started using Claude Opus immediately when it came out. Opus is ahead of GPT-4. There are times when I test both, but I am almost always solely using Opus. GPT-4's laziness is still a huge issue; it seems like Anthropic specifically trained Opus not to be lazy.

The UX of GPT-4 is better: you can cancel chats, edit old chats, etc. But the raw model is behind. You have to assume OpenAI is working on something big and isn't afraid of lagging behind Anthropic for a while.



What do you mean by "laziness"?


Speaking from my own experience, which may differ from the grandparent comment: I'll ask ChatGPT (on GPT-4) for some analysis or factual-type lookup, and I'll get back a fairly generic answer that doesn't answer the question. If I then prompt it again, e.g. with a "please look it up" type message, the next reply will have the results I would have initially expected.

It makes me wonder if OpenAI has been tuning it to skip web queries below some threshold of "likely to improve the reply."

I'd say ChatGPT's replies have also gotten slowly worse with each passing month. I suspect that as they tune it to avoid bad outcomes, they're inadvertently also chopping out the high points.


I think OpenAI did a cost optimization because they were spending too much on compute, so the laziness is by design.


Yep. Also, there's that shift to for-profit mode.


Try this on either your favorite GPT or your favorite kid learning stats:

"What are the actuarial odds of an American male born June 14, 1946 in NYC dying between March 17, 2024 and US Election day 2024?"


It's a common phenomenon. It's been in the news quite a bit. Here's one from Ars: https://arstechnica.com/information-technology/2023/12/is-ch...



