
I agree that you should call the output copyright infringement (or not) without regard to how you got there. So, if it produces an identical copy of the text, or of e.g. Indiana Jones, and you then distribute it, sure, that is copyright infringement.

But the mere act of using them for training and producing new works shouldn't be! In fact, until 2022, pretty much no one regarded it as a copyright violation to "learn from copyrighted works to create new ones" -- just the opposite! That's how it's supposed to work!

Only when hated corporations did it with bots did the internet hive mind suddenly decide that's stealing and take this expansive view of IP rights (while, of course, having historically screamed bloody murder about any attempts to fight piracy).



Scale matters. There's a difference between individual humans taking time to study copyrighted works, and so-called "AI" doing it at a scale equivalent to millions of man-hours.


But what about that kind of scale matters to this particular area? The article spends a lot of time showing the substantive parallels between how humans and AIs learn, e.g. in how neither stores an exact copy, but rather a high-level understanding that rarely produces anything verbatim.



