
In addition to what was said, if it's anything like DPO, you don't need a lot of data, just a good set. For instance, DPO requires a "good" and a "bad" response for each prompt.
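
To make the "good"/"bad" pairing concrete, here is a minimal sketch of what one such dataset could look like. The field names (prompt, chosen, rejected), the `PreferencePair` class, and the `validate` helper are illustrative assumptions of mine, not something from the parent comments, and the example rows are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One preference example: a prompt with a preferred and a dispreferred answer."""
    prompt: str
    chosen: str    # the "good" response
    rejected: str  # the "bad" response


def validate(pairs):
    """Keep only usable pairs: no empty fields, and the two responses must differ."""
    return [
        p for p in pairs
        if p.prompt.strip() and p.chosen.strip() and p.rejected.strip()
        and p.chosen != p.rejected
    ]


# Made-up rows, purely for illustration.
pairs = [
    PreferencePair(
        prompt="Summarize: The cat sat on the mat.",
        chosen="A cat sat on a mat.",
        rejected="Cats are mammals.",
    ),
    # Identical responses carry no preference signal, so this row is dropped.
    PreferencePair(prompt="What is 2 + 2?", chosen="4", rejected="4"),
]

clean = validate(pairs)
```

The filter is where "just a good set" shows up in practice: a pair only helps if the two responses really differ in quality, so a small, carefully checked set can be more useful than a large noisy one.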

