
> Strawberry/Q* relies on elements similar to those of AlphaZero to find "strategies" suitable for a problem. That takes it further from "next-word prediction" than current models, and improves the quality of the output.

Genuine question: is this independently substantiated? Or Altmanspeak?



The whole comment is "Altmanspeak", which we might more directly call "ideological descriptions of AI consistent with increasing stock prices".


The specifics are kept secret, but all the labs appear to work on some variant.

It seems to be related to the DeepMind reference in [1] and most of [2].

[1] https://en.wikipedia.org/wiki/Q-learning

[2] https://arxiv.org/pdf/2403.09629


This sounds like chess AI


More generally, it's part of what we call reasoning.

As opposed to doing whatever first comes to mind, which is roughly what regular LLMs have been doing.
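To make the contrast concrete, here is a toy sketch. It is not how any lab's system actually works (those specifics are not public). The tree, the scores, and the function names are all made up for illustration. `greedy` always takes the best-looking next step, while `search` looks ahead over whole paths before committing.

```python
# Toy tree: each node maps to a list of (step_score, child) pairs.
# All names and numbers are hypothetical, chosen only to show the difference.
TREE = {
    "start": [(5, "a"), (3, "b")],
    "a": [(1, "a1"), (0, "a2")],
    "b": [(9, "b1"), (2, "b2")],
}


def greedy(node):
    """Take the locally best step each time ("what first comes to mind")."""
    total, path = 0, [node]
    while node in TREE:
        score, node = max(TREE[node])  # highest immediate score wins
        total += score
        path.append(node)
    return total, path


def search(node):
    """Look ahead over every path and return the best total score and path."""
    if node not in TREE:
        return 0, [node]  # leaf: nothing more to gain
    best = None
    for score, child in TREE[node]:
        sub_total, sub_path = search(child)
        candidate = (score + sub_total, [node] + sub_path)
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best


print(greedy("start"))  # (6, ['start', 'a', 'a1'])
print(search("start"))  # (12, ['start', 'b', 'b1'])
```

The greedy walk grabs the 5 up front and ends with a total of 6. The search accepts a worse first step (3) because it leads to a better overall path (12). Real systems cannot enumerate every path like this, so they would need some learned estimate of which branches are worth exploring.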



