Hard disagree. The essay is a rehash of Reddit complaints, with no direct results from testing, and it’s largely about product-launch snafus (a simultaneous launch to 500mm+ users, mind you). Please.
I think most hit pieces like this miss what is actually important about the 5 launch: it’s the first real product launch in the space. We are moving on from model improvements to a concept of what a full product might look like. What matters about 5 is not thinking strength, although it is moderately better than o3 in my tests, which is roughly what the benchmarks say.
What’s important is that it’s faster, that it’s integrated, that it’s set up to deliver incremental improvements (to, say, multimodal interaction, image generation and so on) without needing the branding of a new model, and I think the very largest improvement is its ability to retain context and goals over a very long series of tool uses.
Willison mentioned it’s his only daily driver now (for a largely coding-based usage setup), and I would say it’s significantly better at handling larger / longer / more context-heavy coding tasks than the prior best (Claude) or the prior best architects (o3-pro or Gemini, depending). It’s also much faster than o3-pro for coding.
Anyway, saying “Reddit users who have formed parasocial relationships with 4o didn’t like this launch -> oAI is doomed” is weak, pointless analysis.
If ChatGPT 5 had lived up to the hype, literally no one would be asking for the old models back. The snafus are minor as far as presentations go, but their existence completely undermines the product OpenAI is selling, which is an expert in your pocket. They showed everyone that this "expert" couldn't even help its own creators nail such a high-stakes presentation; OpenAI's embarrassing oversights foretell similar embarrassments for anyone who relies on this product for their own high-stakes presentation or report.