
How sturdy is this claim?

If we presume it's illegal to train on copyrighted works, but a website like Wikipedia summarizing an article is perfectly legal, then what would happen if we got LLM A to summarize the article and used that summary to train LLM B?

LLM A could be trained on public domain works.



If it is illegal to train on copyrighted work, that will also benefit actors that are free to ignore laws, like Chinese public-private companies. It means Western companies will lose the AI race.


Then we don't respect their copyrights? Why is this some sort of unsolvable problem and the only solution is to allow mega corporations to sell us AI that is trained on the work of artists without their consent?


LLM B would be a very bad LLM with only limited vocabulary and turn of phrase, and would tend to have a single writing tone.

And no, having 5000 different summarizing LLMs doesn't help here.

It's sort of like taking a photograph of a photograph.



