
It's not open source, we have no idea about the data used to train the model, and the paper doesn't explain it all.


Is this an important consideration in open sourcing an AI model?

I would think the code to build your own is open sourced, and you can feed it any data you'd like. That's the open source part, not the part where they are running the model.

Have I misunderstood this?


It’s a common complaint about open-sourced ML models that they don’t provide or describe the data used to train them. Sometimes it’s a valid complaint, since it may not be clear what kind of data was used, and sometimes it’s not, since the training data is clear enough.

I think it’s kind of an overdone complaint and I usually ignore it. Besides, it looks like there’s an ongoing Hugging Face project trying to replicate the training process for this model anyway.



