
It really depends on what you're working on and what was included in the training data of the model you used. From a model architecture point of view, they're basically all the same; the main difference lies in the training data.


Also not true. Even the API surface differs.


API is irrelevant. It's like saying that talking to John via Telegram or WhatsApp is like talking to a different person.


I agree here a fair bit, not that I'm an expert or anything. I'd like to see some progress on some of the neuronal modelling. It seems that since 'Attention Is All You Need' they've locked into this LLM stack, gluing models together as data pipelines rather than integrating different NNs on a deeper level.



