I feel his need to be right distracts from the fact that he is. It’s interesting to think about what a hybrid symbolic/transformer system could be. In a linked post he showed that effectively delegating math to Python is what made Grok 4 so successful at it. I’d personally like to see more of what a symbolic-first system would look like: hard math by default, with monads for where inference is needed.
Aloe's neurosymbolic system just beat OpenAI's deep research score on the GAIA benchmark by 20 points. While Gary is full of bluster, he does know a few things about the limitations of LLMs. :) (aloe.inc)
Yeah, there was an old paper that blew math/physics benchmarks out of the water by letting the LLM write code and having a physics engine execute it. I don't have a link to it off the top of my head, but that seems to be the right direction.
LLM + general tool use seems to be quite effective.
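The delegation pattern being described is simple at its core: rather than trusting the model's own arithmetic, you extract a code snippet from its reply and run it in an interpreter. Here's a minimal sketch; the hard-coded reply and the `<python>` tag convention are illustrative assumptions (a real system would call an LLM API and execute in a sandbox, not bare `exec()`):

```python
# Sketch of delegating math to the interpreter: extract a code snippet
# from the model's reply and execute it, instead of trusting the model's
# own arithmetic. The reply is hard-coded here for illustration; a real
# system would get it from an LLM API and run it in a sandbox.
import re

model_reply = (
    "The sum of the first 10,000 squares is easiest to compute exactly:\n"
    "<python>result = sum(i * i for i in range(1, 10_001))</python>"
)

def run_tool_call(reply: str) -> int:
    """Pull the <python>...</python> snippet out of the reply, execute it,
    and return the value bound to `result`."""
    snippet = re.search(r"<python>(.*?)</python>", reply, re.DOTALL).group(1)
    namespace = {}
    exec(snippet, namespace)  # unsandboxed exec: fine for a sketch only
    return namespace["result"]

print(run_tool_call(model_reply))  # 333383335000
```

The model only has to produce correct *code*, which is a much easier target than producing correct arithmetic token by token.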