Hacker News

> If I've "learned" Harry Potter to the level where I can reproduce it verbatim, the reproduction would be a copyright violation. If I can paraphrase it, ditto.

Yeah, that's something that I've not seen a good answer to from the "everything AI does is legal" people. Even if the training is completely legal, how do you verify that the generated output is not illegally similar to a copyrighted work that was ingested? Humans get in legal trouble if they produce a work that's too similar. Does AI not? If AI doesn't, can I just write an AI whose job is to reproduce copyrighted content and now I have a loophole to reproduce copyrighted content?



Seems like so much tech "innovation" these days is really just to sneak around laws and social norms in pursuit of a rent-seeking position.


Nailed it. All of the “progress” in the last, say, 20 years is exactly this. They call it “disruption” and wear the title “disruptor” as a badge of honour.


I think you have to be practical. It would be difficult to train an AI to consume Harry Potter and compress it but prevent it from recreating it. You can try and people do, but there are always ways around it.

But that happens on an individual-prompt basis. It's not like ChatGPT can produce the entire text and sell it as a PDF. It's just a device that could reproduce it, much like a word processor is a device you could use to type out the contents of a book you're reading.

So the question is one of practicality. Do we ensure that no copyrighted material is in the training data? Difficult but probably not impossible. But what you can't do is target the content in all its various other forms: descriptions of the plot, reviews, fan fiction, etc. So in the end it's pretty much a lost cause.

So what to do about it? I don't know. In the utilitarian sense, I think the world in which this technology exists in a non-crippled form is a better, richer world than one in which there are all these procedural steps that try to prevent this (and ultimately fail).

What's the harm here? Are people not buying Harry Potter books and just having an LLM painfully recreate the plot? I would imagine Harry Potter fans would be able to explore their love of the media through LLMs and that would drive more revenue to Harry Potter media, much like fan fiction and pirated music lead to more engagement and concert sales.

In the case of new art, maybe fewer artists get commissioned, but let's be real, Mike Tyson wasn't going to contract out an artist to create a Ghibli-style animation of him anyway, so there's really little harm in LLMs here to artists. If anything it expands the market and interest.


I'm just going to briefly respond to the part you wrote about art in particular.

We may not have a way to actually quantify the harm that GenAI is doing to creative industries because some of the damage is long-term. Choices are being made right now based on the state of the world. Why would anyone start an art career in this climate? What does art as a profession look like in 5 years? 15 years?

Art is not just the final artifact, and I feel we're surrendering part of our humanity in service of enriching big tech companies.


> So what to do about it

We proceed towards AGIs that implement proper understanding, and have them read all of the masterpieces and essays and textbooks (otherwise they will be useless), as is fully legitimate in any system that provides for libraries.


> Humans get in legal trouble if they [???] a work that's too similar

If they sell a work that's too similar.

What intellectuals do is quote. That is of course legal.


If gen-AI had been used to produce Nosferatu there still would have been a case, right?



