I think it's slightly less ridiculous than it sounds, because governments have much more power over their own citizens. As an American I would dramatically rather be spied on by the Chinese government than by the American government, because the Chinese government probably isn't going to do anything about whatever they find out.
(That logic breaks down somewhat in the case of explicitly negotiated surveillance sharing agreements.)
> because the Chinese government probably isn't going to do anything about whatever they find out.
This really depends. If a foreign adversary's surveillance finds you have a particular weakness exploitable for corporate or government espionage, you're cooked.
Domestic governments are at least theoretically somewhat accountable to domestic laws (current failure modes in the US aside).
Exactly, and that danger grows as surveillance becomes increasingly automated and targeted. That should be very obvious by now, looking at the world around us.
Also, failing to consider the legal and rights regime of the attacker is wild to me. Look at what happens to people caught spying for other regimes. Aldrich Ames just died after decades in prison, and that's one of the most extreme cases; plenty have gotten away with just a few years. The Soviet assets Ames gave up were all swiftly executed, much as they are in China.
Regimes and rights matter, which is why the democracy / autocracy governance conflict matters so much to the future trajectory of humanity.
> As an American I would dramatically prefer the Chinese government to spy on me than the American government, because the Chinese government probably isn't going to do anything about whatever they find out.
> spy on me
People forget to substitute "my elected representative", "my civil service employee", "my service member", or their loved ones for "me".
I, personally, have nothing significant that a foreign government could leverage against our country, but some people are in more privileged, responsible, or susceptible positions.
It is critical to protect everyone's data privacy, because we don't know who will be targeted, or through whom.
Similarly, for domestic surveillance, we don't know who the next MLK Jr could be or what their position would be. Maybe I am too backward to even support this next MLK Jr but I definitely don't want them to be nipped in the bud.
I don't really understand the criticism. The authors aren't claiming to have the strongest chess engine without search. They are just showing that they got a chess engine to a respectable level with their process, which is somewhat different from LC0. They do in fact explain that explicitly:
> Leela Chess Zero’s networks, which are trained with self-play and RL, achieve higher Elo ratings without using explicit search at test time than our transformers, which we trained via supervised learning. However, in contrast to our work, very strong chess performance (at low computational cost) is the explicit goal of this open source project (which they have clearly achieved via domain-specific adaptations). We refer interested readers to [https://arxiv.org/abs/2409.12272] (which was published concurrently to our work) for details on the current state-of-the-art and a comparison against our network.
And I don't think the criticism of their writing is on point either. I don't think they are secretly implying that their engine is better than Stockfish. And it's 100% plausible for human masters to rigorously analyze many positions with engine assistance and correctly establish whether Stockfish's evaluation is right or not.
First of all, the title is misleading: "GM level" to most of us means moves of the quality a GM produces at classical time control. As of several years ago, LC0 needed around 35 search nodes per move to reach that level. With LC0's newer transformer architecture that number has probably dropped a lot, but not all the way to zero. Second, the article complains that the Google paper fails to cite some other publication; that at least is a concrete criticism, though I haven't checked its validity.
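For readers unsure what an Elo gap between engines actually implies, here is the standard Elo expected-score formula (the function name is my own):

```python
# Expected score between two Elo-rated players (standard Elo formula).
# Score counts a win as 1 and a draw as 0.5.
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected fraction of points for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# A 400-point rating advantage corresponds to scoring about 91%
# of the points over a long match; equal ratings give exactly 50%.
print(round(expected_score(2800, 2400), 3))  # → 0.909
print(expected_score(1500, 1500))            # → 0.5
```

So even a "few hundred Elo" gap between a search-free network and a full engine translates into a lopsided head-to-head score.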
You're free to believe whatever fantasy you wish, but as someone who frequently consults an LLM alongside other resources when thinking about complex and abstract problems, there is no way in hell that Karpathy intentionally limits his options by excluding LLMs when seeking knowledge or understanding.
If he did not believe in the capability of these models, he would be doing something else with his time.
One can believe in the capability of a technology and still, on principle, refuse to use implementations of it built on ethically flawed approaches (e.g., violating GPL license terms and/or copyright, thus harming the open source ecosystem).
Conflating natural law -- our need to eat -- with something we pulled out of our asses a couple hundred years ago to control the dissemination of ideas on paper is certainly one way to think about the question.
I am sure it had nothing to do with the amount of innovation that has been happening since, including the entire foundation that gave us LLMs themselves.
It would be crazy to think the protections of IP laws and the ability to claim original work as your own and have a degree of control over it as an author fostered creativity in science and arts.
Innovation? Patents are designed to protect innovation. Copyright is designed to make sure Disney gets a buck every time someone shares a picture of Mickey Mouse.
The human race produced an extremely rich body of work long before US copyright law and the DMCA existed. Instead of creating new financial models that embrace freedoms while still ensuring incentives to create new art, we have contorted outdated financial models (various modes of rent-seeking and gatekeeping) to remain viable via artificial and arbitrary restriction of freedom.
Patents and copyright are both IP. Feel free to replace “copyright” with “IP” in my comment. Do you not agree that IP laws are related to the explosion of innovation and creativity in the last few hundred years in the Western world?
Furthermore, claiming “X is not natural” is never a valid argument. Humans are part of nature, whatever we do is as well by extension. The line between natural and unnatural inevitably ends up being the line between what you like and what you don’t like.
The need to eat is as much a natural law as higher human needs—unless you believe we should abandon all progress and revert to pre-civilization times.
IP laws ensure that you have a say in the future of the product of your work, can possibly monetise it, and so on, which means a creator 1) can fulfil their need to eat (individual benefit), and 2) has an incentive to create in the first place (societal benefit).
In the last few hundred years, intellectual property, not physical property, has increasingly been the product of our work and creative activities. The belief that the physical artifacts we create deserve protection against theft while the intellectual property we create does not requires a lot of explanation.
What you see as copyright violation, I see as liberation. I have open models running locally on my machine that would have felled kingdoms in the past.
I personally see no issue with training and running open local models by individuals. When corporations run scrapers and expropriate IP at an industrial scale, then charge for using them, it is different.
I have not researched closely enough but I think it falls under what corporations do. They are commercially licensed, you cannot use them freely, and crucially they were trained using data scraped at an industrial scale, contributing to degradation of the Web for humans.
Since Llama 2, the models have been commercially licensed under an acceptable use policy.
So you're able to use them commercially as you see fit, but you can't use them freely in the most absolute sense. Then again, this is a thread about restricting the freedoms of organizations in the name of a 25-year-old law that has been a disgrace from the start.
> contributing to degradation of the Web for humans
I'll be the first to say that Meta did this with Facebook and Instagram, along with other companies such as Reddit.
However, we don't yet know what the web is going to look like post-AI, and it's silly to blame any one company for what clearly is an inevitable evolution in technology. The post-AI web was always coming, what's important is how we plan to steward these technologies.
The models are either commercial or they are not. They are, and as such they monetise the work of original authors without consent or compensation, often in violation of copyleft licensing.
> The post-AI web was always coming
“The third world war was always coming.”
These things are not a force of nature, they are products of human effort, which can be ill-intentioned. Referring to them as “always coming” is 1) objectively false and 2) defeatist.
I was interested to read this because some time ago I had my genome sequenced by Nebula. If you look at the lawsuit you can see that what Nebula did was use off-the-shelf third-party analytics products on their website, including recording analytics pings when users buy a kit, and pings when users use the Nebula website to browse Nebula's high-level analysis of their traits (leaking that the user has those traits to the analytics provider.)
This behavior represents a contemptible lack of respect for users' privacy, but it's important to distinguish it from Nebula selling access to users' genomes.
That's a good clarification. I read through some of that link, and most of it does look relatively benign: Meta and Google pixels might see when you buy a kit, but nothing more. On page 21, however, they directly leaked genetic information to Microsoft via its Clarity tracker. Perhaps not intentionally, and it's questionable whether it can be linked to a specific person rather than just an advertising ID, but they did leak it. I think the lawsuit says that even disclosing whether a person has undergone genetic testing violates GIPA, so the information they sent to all three providers is enough to violate that.
I don't have any evidence they're selling anything but that lawsuit shows pretty sloppy behaviour for a company that should be thinking very deeply about privacy. I guess that's about what you said though :)
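To make the leak mechanism concrete, here is a hypothetical sketch of the kind of GET request a third-party tracking pixel sends. The endpoint, parameter names, and page path are all my own invented examples; the point is that when the page path encodes a trait report, the analytics provider learns the trait along with the ad/client ID:

```python
# Hypothetical illustration: a tracking pixel that reports the current
# page path to a third-party analytics collector. If the path encodes a
# trait report, the trait leaks to the analytics provider.
from urllib.parse import urlencode

ANALYTICS_ENDPOINT = "https://collector.example.com/collect"  # invented

def build_ping(client_id: str, page_path: str) -> str:
    """Construct the GET URL such a pixel would request."""
    return ANALYTICS_ENDPOINT + "?" + urlencode(
        {"cid": client_id, "dp": page_path}
    )

# Browsing a trait-report page embeds the trait in the ping itself:
ping = build_ping("ad-id-1234", "/reports/traits/lactose-intolerance")
print(ping)
```

Nothing about this requires the site to intend to sell data; the leak is a side effect of dropping an off-the-shelf tracker on a sensitive page.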
Another point is that Wojcicki's big idea, that all this genetic data would be valuable to sell to businesses, didn't work out so well. For an advertiser, it's a lot more useful to know that you're a smoker than that you have a 40% higher chance of being a smoker.
The point isn't what they are doing with your data now, but that they retain your data and what might happen in the future. Someone with malicious designs on your DNA might buy Nebula tomorrow and there's nothing you can do about it.
Actually, the main reason I used Nebula was that they advertised a credible-to-me promise that you could download and permanently delete your data upon request. That was some years ago, so I don't know if I would trust them today. But that was their claim, and I have no reason to believe they didn't delete my data.
That's a legal requirement in the EU and many US states. Some of the genetic genealogy companies actually play fast and loose with it, though: not the deletion, which I trust, but the data portability and the justifications for retaining parts of your personal information.