The Claude models are still the best at what they do. Right now GLM is just barely scratching Sonnet 4.5 quality, Mistral isn't really usable for real codebases, and Gemini is in a weird spot where it's sometimes better than Claude at small targeted changes but randomly goes off the rails. I haven't tried Codex recently, but the last time I did the model thought for 27 minutes straight and then gave me about the same (incorrect) output that Opus would have in 20 seconds. Anthropic's models are their only moat, as demonstrated by their cutting off of tools other than Claude Code on their coding plans.
I feel like Codex is the middle ground. You can define a project, break it into bite-sized chunks, but still lift a reasonable amount. Claude with Opus 4.5 right now chews up context at an eye-watering rate. It's really unfortunate because it's really good.
An alternative is that these patterns just increase the likelihood of the next thing it outputs being correct, and are thus useful to insert during training as the first thing the model says before giving an answer.
Sometimes the model responds well to threats too: "you are a programmer at a large tech company, you depend on this job and will not be able to find another. There's a layoff incoming, implement this feature or else..."
Would use Firefox on the main workstation if it had better devtools. Other than that it just works and has some useful features, see: Tor and IPFS integration.