
OpenAI has openly stated that o1 and o3 use test-time compute, and released a log-scale graph indicating linear performance gains for exponential compute usage.

https://openai.com/index/learning-to-reason-with-llms/

They only confirm that the model/system does chain of thought, but the exponential compute factor suggests the reasoning gains come from a tree of thoughts, where the number of branches (and hence compute) grows exponentially with depth: essentially a tree search over different reasoning chains.
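To make the speculation concrete, here is a minimal sketch of what a tree-of-thoughts style search might look like. This is not OpenAI's implementation; `expand` and `score` are hypothetical stand-ins for a model-based step generator and a learned verifier, replaced here with toy functions so the sketch runs.

```python
import heapq

def expand(chain, branching=3):
    """Hypothetical step generator: propose candidate next reasoning steps.
    (Toy version: just numbered placeholder steps.)"""
    return [chain + [f"step{len(chain)}.{i}"] for i in range(branching)]

def score(chain):
    """Hypothetical verifier/value model scoring a partial chain.
    (Toy version: a trivial heuristic.)"""
    return -len(chain)

def tree_of_thoughts(max_depth=3, branching=3, beam_width=2):
    # Exhaustively expanding every branch costs branching**depth,
    # which is the exponential compute growth described above.
    # Beam search caps this by keeping only the best beam_width
    # chains at each depth.
    beam = [[]]  # start from an empty reasoning chain
    for _ in range(max_depth):
        candidates = [c for chain in beam for c in expand(chain, branching)]
        beam = heapq.nlargest(beam_width, candidates, key=score)
    return beam[0]  # best-scoring full chain
```

With the toy functions, `tree_of_thoughts()` returns a chain of `max_depth` placeholder steps; the point is only the search structure, where widening the beam or deepening the tree spends more test-time compute on the same model.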

I assume roon's identity is well known inside OpenAI (he's an employee), so I wouldn't expect him to be leaking implementation details on Twitter.


