Basically, as I understand it, it divides AI systems (in the broadest Machine Learning sense) into risk categories: unacceptable risk (prohibited), high risk, medium/other risk, and low risk.
Applications in the high risk category include medical devices, law enforcement, recruiting/employment and others. AI systems in this category will be subject to the requirements mentioned by most people here (oversight, clean and correct training data, etc).
Medium risk applications seem to revolve around the risk of tricking people, for example via chatbots, deepfakes, etc. In this case the providers are required to notify people that they are interacting with an AI or that the content was generated by AI. How this can be enforced in practice remains to be seen.
And the low risk category is basically everything else, from marketing applications to ChatGPT (as I understand it). Applications in this category would have no mandatory obligations.
Not going to comment on how it is determined which risk category a given system falls into.
> clean and correct training data
If there's a country where the majority of poor people have blue skin, the majority of rich people have green skin, and poorer people commit more crimes in general, then if one of the features of your ML model (explicitly defined or somehow inferred by the model) is skin color, it will correctly correlate blue skin with a higher risk of stealing. So a supermarket chain's AI face-detection system would always flag those people more often. And so you'd have AI-scale systemic racism.
Even if that application were categorized as "high risk", it would pass, since it has "clean and correct training data".
How do you legislate that an AI system can't fall for "correlation != causation", and how can you audit that if there's no clearly defined ML model feature, unlike the simple neural networks studied in ML 101?
Luckily, lawyers in the 1970s already figured this out, by anticipating the research on causality that computer scientists and mathematicians like Judea Pearl and Peter Spirtes did in the 1990s. Really!
In the first instance, you just can't use race as a feature, since it is a protected characteristic. But, you might also be worried that protected characteristics can generally be easily identified by looking for innocuous traits that correlate (since people tend to cluster into communities). For example, if you know an American's ZIP code and their three favorite musicians, you can determine their race with an accuracy in the high 90s. (Basically, the US is still just as segregated now as it was a hundred years ago, and black and white Americans tend to listen to different music.)
So when the US Civil Rights Act was passed, the courts came up with the idea of "disparate impact" -- when doing something like hiring, you are not allowed to base the decision on features that disproportionately affect one group rather than another, even if they are formally neutral, unless the feature directly impacts the ability to do the job.
In other words, you have to show that the features you are basing the decision on _causally influence_ the decision you are making, exactly like you see in Pearl's causal influence diagrams or structural equation models or whatever. Eg, if you want to hire a math professor, you can base the decision on the articles they published in math journals, but you can't base it on whether they like old school Goa trance.
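To make that concrete, here is a minimal sketch of the disparate-impact check this translates into in practice (toy data, made-up column names, pandas assumed available), using the informal "four-fifths rule" from US employment guidelines as the threshold:

    import pandas as pd

    # Hypothetical hiring data: one row per applicant, with a protected
    # attribute column ("group") and the hiring decision ("hired", 0/1).
    df = pd.DataFrame({
        "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
        "hired": [1,   1,   0,   1,   1,   0,   0,   0],
    })

    # Selection rate per group: fraction of applicants hired.
    rates = df.groupby("group")["hired"].mean()

    # Adverse impact ratio: lowest selection rate divided by the highest.
    # The informal "four-fifths rule" flags ratios below 0.8.
    ratio = rates.min() / rates.max()
    print(rates.to_dict(), f"adverse impact ratio = {ratio:.2f}")
    if ratio < 0.8:
        print("Potential disparate impact: investigate before deploying.")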
So, what about black box neural networks, where you don't know which features are being used? In this case, it's pretty clear that you shouldn't use them directly when making a home loan, because the law wants to know what features are in use, and you can't answer the question of whether you're redlining when you have a black box. However, using black box techniques to learn (eg) the best random forest model to use is fine, because it lets you easily see which factors are going into the decision before deploying it.
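As a sketch of what "seeing which factors go into the decision before deploying it" can look like, here is a hypothetical example (made-up feature names and stand-in data, scikit-learn assumed) where a simple interpretable model is fit and its coefficients inspected:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical loan features: the protected attribute and obvious proxies
    # (e.g. ZIP code) are excluded from the feature set up front.
    feature_names = ["income", "debt_to_income", "years_employed", "missed_payments"]
    X = np.random.rand(500, len(feature_names))               # stand-in for real applicant data
    y = (X[:, 0] - X[:, 3] + 0.1 * np.random.randn(500)) > 0.2  # stand-in labels

    model = LogisticRegression().fit(X, y)

    # Every factor entering the decision is visible and can be defended
    # (or challenged) one by one, which a black box does not allow.
    for name, coef in zip(feature_names, model.coef_[0]):
        print(f"{name:>16}: {coef:+.3f}")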
FWIW, people have been doing this for decades already. (I did stuff like this back in the 1990s.)
Indeed, you’ve highlighted one of the potential problems with applying ML (a.k.a. a big pile of statistics/linear algebra/whathaveyou) in ways that threaten fundamental rights. In this case, by creating a product that would discriminate against people based on protected attributes like skin color. That’s illegal, even if your ML system picks up on that correlation. So you agree that we shouldn’t allow companies to ML-wash discrimination?
Maybe do a better job on the AI so it correlates poverty with crime instead of skin color with crime, or so it is able to spot laws created specifically to target and oppress the poor. If the punishment is a fine, then it's legal for the rich.
I can think of a critical experiment there. Prepare a dataset with blue and green people acting the same way. If the AI shows statistically significant misrecognition of one group, then it's a failure.
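Sketching that experiment (the flag arrays are hypothetical stand-ins for the model's output on two identically behaving groups; a chi-squared test checks whether the flag rates differ significantly):

    import numpy as np
    from scipy.stats import chi2_contingency

    rng = np.random.default_rng(0)
    n = 5000
    # Stand-ins for the model's decisions on blue vs green records that are
    # identical except for skin colour.
    flags_blue  = rng.random(n) < 0.06
    flags_green = rng.random(n) < 0.04

    table = [[flags_blue.sum(),  n - flags_blue.sum()],
             [flags_green.sum(), n - flags_green.sum()]]
    chi2, p, _, _ = chi2_contingency(table)

    # Same behaviour, different flag rate: if p is small, the model treats
    # the groups differently and the audit fails.
    print(f"blue flag rate {flags_blue.mean():.3f}, green {flags_green.mean():.3f}, p = {p:.4f}")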
Systematic discrimination in employment, because the AI said so because the data was biased, can lead to ghettos and riots. Deepfakes are a storm in a teacup.
Remember how OpenAI's Dall-E lost to Stable Diffusion because they were too afraid that someone would make an image of a politician or something? Well, we had some fun with the Pope wearing Balenciaga and Trump being arrested, but nothing bad happened.
I don't know, my impression is that this is written by someone who doesn't understand how law works and is making hyperbolic assumptions through cherry-picking. I recall a similar panic and outrage when GDPR was introduced, claiming that it would bankrupt startups and small projects, etc.
The EU stuff is usually aimed at large corporations that pose systemic risks. Think about how TikTok makes the US freak out about Chinese spying and manipulation; for the EU it's the same thing, but including Facebook and all the other large social media too (because American corporations are foreign in Europe just like TikTok is foreign in the USA). The US considers banning TikTok; the EU's approach is to regulate how data is processed and used in order to keep the market open and mitigate risks.
So as a rule of thumb, Europol doesn't knock on your door when you train a model on data that doesn't meet the requirements. This is probably designed to make countries introduce laws which will hold Google/OpenAI etc. accountable for their services so they can't just shrug and say "ouppsy, it's not our fault the AI did it".
I'm sorry to pop the alarmist narrative, but this is not the legislation which is going to "get you".
> Europol doesn't knock on your door when you train a model on data that doesn't meet the requirements. This is probably designed to make countries introduce laws which will hold Google/OpenAI etc. accountable for their services so they can't just shrug and say "ouppsy, it's not our fault the AI did it".
Your use of “probably” is quite telling. Probably, as in: maybe not, maybe yes, who knows. As the act is phrased today, anyone who publishes a certain type of model, or a derivative of one, is subject to certain legal obligations. Do you want to risk that Europol or some other task force knocks on your door, or hope every time that you slip under the radar?
Those acts, together with the CRA, are so vague that a lot of people will operate in a grey area. So maybe nobody knocks on your door for a year. Or two. Or five. But when they knock, good luck defending yourself from legal action based on laws written by people who understood the subject so little that they left many vague points open to interpretation, depending on who doesn’t like you and to what extent.
We've been there with GDPR. It didn't help create the level playing field they officially wanted to achieve. BigTech has legal departments, whilst local tech startups have to deal with legal battles initiated by third-class legal firms crawling the web.
They just end up making their local tech entrepreneurs flee to US/UK.
Companies do these things all the time. I guess it's fun to imagine that opening offices in other countries is "fleeing", and that they are fleeing to the UK because they are having trouble processing personal data when developing drugs, but I don't see why that would be the case.
Fun fact: UK data protection laws are about the same as the EU's.
As an entrepreneur you already have to deal with finances, building a great product, and market competition. Why on earth would you additionally take on legal uncertainty, at least until, in a couple of years, there are enough court rulings putting the shallow definitions in EU regulations into context?
Fun fact: Why would UK miss the chance to attract tech companies and get rid of pre-Brexit laws after they went through all the unfortunate hassle Brexit was?
> Why would UK miss the chance to attract tech companies and get rid of pre-Brexit laws after they went through all the unfortunate hassle Brexit was?
Because the UK is not made up of libertarians who were under EU oppression; all those EU laws were made with the UK's contribution and input. The UK has not become a libertarian utopia and will not become one inside or outside of the EU, because the British public at large doesn't want it. If anything, the Labour party will probably win the next elections; the libertarians are very unpopular. The British are not Americans, and most don't trust corporations and want government involvement in protecting them against wrongdoing.
Brexit's biggest con was the idea that the EU is some third party and that the British public wants something else. In reality, Brits helped make the EU.
The benefit actually comes from forcing companies to keep good track of the personal data. Now personal data is something they need to think about how to handle and store; the downloading part, for me, has so far been mostly about curiosity.
The fact that you can download some data, presented as “all data”, doesn’t mean that it’s actually properly stored and handled. There is no way to verify or enforce it.
> The benefit actually comes from forcing companies to keep good track of the personal data
Who is checking for that? Can you give me an example of any public authority within the EU that has the operational excellence to even understand how BigTech is collecting and handling data? For Germany I only see Ulrich Kelber who certainly does not have the competence to even understand how BigTech is doing their things.
It's not about going around and checking, but about being responsible for it. Thanks to GDPR, companies do have someone designated to deal with these things, and if they screw up and something bad happens, they are liable.
And here we are at the core problem: implementing laws without being able to enforce them is not only ridiculous but harmful to the credibility of any governmental institution. This is the core problem of all EU tech regulation.
Instead of developing operational excellence in their enforcement authorities so they can track down and punish bad actors, they impose regulations on everyone and try to simulate competence, harming their own economy.
> Implementing laws without being able to enforce them is not only ridiculous but harmful [...]
Not all crimes are preventable, which doesn't mean the laws aren't enforceable. Example: even though murder is illegal, people still commit murder, but they are rightfully punished for it, as the law mandates.
> [...] harming their own economy.
Do you have any proof that tech regulation is detrimental to the economy?
Oh they don't flee, they just don't bother to start new ones in the EU.
Which of the new AI labs is set up in the EU?
Stability? Midjourney? Anthropic? Cohere? Every single one is in either the US or the UK. LLM companies deal with huge legal uncertainties, especially around data and privacy. Hence investors are reluctant to fund any in the EU, and potential founders are deterred. This is all a legacy of the GDPR and the 'regulatory superpower' mindset that the EU deluded itself into.
The only new wave AI company from the EU is DeepL, which is going to face really really intense competition from LLMs.
I have no idea why you would claim that with the introduction of GDPR Europe lost out on tech. By tech I mean the SV industry; there are many other technologies out there.
Anyway, why is all the "tech" in SV? Do they have GDPR in the other states?
I don't know the internals, but I would bet that it's an implementation of GDPR which happened shortly before Brexit. Do you really think that after all that Brexit hassle they will miss the chance to attract tech companies?
Well, the point is that Europe is helplessly missing globally competitive, innovative shops in important IT sectors. This is a big economic and political problem which is not solved by pointing to marginal copycats of existing US services.
As an employee of a European company which is chosen by European and US customers alike _specifically_ for our competence in European market regulation I see opportunity where you see shortage. Of course, we will never join the ranks of public celebrities such as Facebook or Google, but who cares, really. It's very likely you've interacted with our products in one way or another, we've made some revenue with it, and that counts. That the technology is coming from Europe, you'll never know.
Well, I am hearing of them for the first time. Yeah, I could be ignorant, or they are not as relevant as you make them sound. A quick wiki search mentions they are best known for providing datasets for Stable Diffusion and Imagen.
And if this is your counter-example, then it proves my point that the Americans are far more innovative currently.
I wonder if all those calling ChatGPT and the like “AI” when it’s nothing of the sort regret doing so now. AI is a scary word for certain groups, while machine learning (which is what this is) isn’t. Now you have a bunch of Luddites with pitchforks looking for a witch to burn.
What this act will do is severely stunt the European economy compared to the rest of the world, which will be racing ahead (as long as countries like the US don’t pass similar laws). By the time Europe realizes its mistake, it will be too late to catch up.
ChatGPT is most certainly AI: all of machine learning is a subset of the field of AI, as is much of symbolic logic and Bayesian inference. It's a massive and wide-ranging academic discipline.
Maybe you're thinking of "AGI", which is a term that appeared in the late 90's to make the distinction between the expert systems of the time and machines that could think like a human?
Do you think it’s coincidence that it came about during the middle of a massive hype wave with widespread talk of how AI will figuratively take over the world? Automated processes have existed for nearly 100 years.
Are you blaming them for pushing updates to existing regulations and reacting in a timely manner to the direction the markets are pushing their money into?
Consider, for example, GDPR. Its implementation in Europe essentially forced compliance worldwide (with some smaller companies choosing to just not support users in the EU). And that's a good thing.
It's a large enough market that it can and should lead the way in sensible protections.
This isn't about luddites looking to burn witches. There are very sensible and immediate risks that we need to get ahead of. As someone in the ML/AI space I'm glad that many of the risks are coming to light. We don't need AGI for there to be serious problems with abuse of language models.
1) cookie banners are unrelated to the GDPR. 2) implementations of cookie banners that don’t allow users to reject cookies as easily as accepting them are illegal. 3) it is widely regarded as good that companies are obligated to disclose which third parties they are sharing your data with.
GDPR affects data, which is the basis of all computing. AI isn't, and this will mainly lead the companies building AI to forgo offices and remote workers in the EU, not stop deployment in it.
when you call something “public action” people—in general—are all for it, but when you call it “government regulation” suddenly people start to get worried. when you say something is “literally” true, the literal and inflexible among us get annoyed when you don’t actually mean literally. words have fluid meaning and unusual connotations and if you can’t accept this you’re in trouble
Turing would argue that if the map resembles the territory so faithfully that you can't distinguish between them, the difference is moot. We're not there yet, but we're a hell of a lot closer than we were ten years ago.
> all training data be "relevant, representative, free of errors and complete."
This is especially interesting to me with regard to something like ChatGPT. As we know, ChatGPT occasionally gives factually incorrect information. Does this mean that, in its current form, it would be illegal in the EU? We know that Google is currently blocking access to Bard in the EU. Will ChatGPT be forced to follow suit?
ChatGPT is great and I love it. It would be a shame if I'm not even allowed to use it _at my own risk_ just because it might be wrong about some things. That may be a simplification, but it sounds like the EU is letting the perfect be the enemy of the good.
I have a feeling they might be looking for "equality" with this formulation. However, if it is representative of the real world, it will often not be in line with the norms prescribed by the notion of equality.
I am wondering what qualifies as "complete", though. Any reasonable definition I can come up with is redundant with "representative" and "free of errors".
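One possible non-redundant reading, purely as a sketch (made-up data and a hypothetical reference population, pandas assumed): treat "complete" as "no required fields are missing" and "representative" as "subgroup shares roughly match a reference population":

    import pandas as pd

    # Hypothetical training set and a census-style reference distribution.
    train = pd.DataFrame({"age_band": ["18-30", "18-30", "31-50", "51+"],
                          "label":    [1, 0, 1, None]})
    reference_share = {"18-30": 0.30, "31-50": 0.45, "51+": 0.25}

    # "Complete": no missing values in required fields.
    missing = train["label"].isna().mean()

    # "Representative": subgroup shares roughly match the reference population.
    shares = train["age_band"].value_counts(normalize=True)
    drift = {k: abs(shares.get(k, 0.0) - v) for k, v in reference_share.items()}

    print(f"missing labels: {missing:.0%}")
    print("share drift vs reference:", {k: round(v, 2) for k, v in drift.items()})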
From what I read earlier (I did not waste time on this article again), the EU rules in the proposal (which is not final) are about critical stuff.
I agree that it would be idiotic to let some greedy bastards sell us some MedicalGPT, or PoliceGPT, or SurveillanceGPT.
Imagine that MedicalGPT gives you a different treatment each time you ask, since it is not deterministic, or that if you change the patient's name from Bob to John it gives you some wild results because the test data had tons of John Smiths in it, and nobody can explain the AI's reasoning.
So IMO for critical systems we need good rules for safety reasons; for non-critical systems we need transparency, and if you sell an AI product you should also take responsibility if it performs worse than you advertise. Like, you can't SELL me a GPT for schools with a shit disclaimer "it might be wrong sometimes and teach the students wrong stuff, or it might sometimes be NSFW". IMO, fuck these ToS where the giants sell us stuff and take no responsibility for the quality of the product.
Can you explain where I am wrong? ChatGPT is non-deterministic; did OpenAI sabotage it intentionally?
I do not want this tech banned, but regulated for safety reasons in critical systems.
I already get daily spam emails from greedy fucks that want to sell me AI for X, where I am 100% sure these greedy fucks do not understand the science behind this stuff but just want to make money.
ChatGPT is intentionally nondeterministic for the same reason that GPT3 is nondeterministic by default. temperature>0 results in a better user experience. I'm having a hard time understanding why you'd think a neural network could unintentionally be nondeterministic. If you want inference to be deterministic, just use the same seed every time.
I also have no idea what your spam emails have to do with training models. The linked law prevents companies in the EU from releasing or deploying large models. It does not prevent grifters from spamming you. (Not that there are any companies in the EU training state-of-the-art models, but that's a separate issue.)
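For anyone unfamiliar with what temperature and seeding mean here, a minimal toy sketch (not the actual ChatGPT stack, just softmax sampling over made-up logits):

    import numpy as np

    def sample_token(logits, temperature, rng):
        # Temperature 0 is greedy decoding: always the highest-scoring token.
        if temperature == 0:
            return int(np.argmax(logits))
        # Temperature > 0 samples from the softened distribution, so repeated
        # calls differ unless the random generator is seeded identically.
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        return int(rng.choice(len(logits), p=probs))

    logits = np.array([2.0, 1.5, 0.3])  # toy next-token scores
    print(sample_token(logits, 0.0, np.random.default_rng()))    # deterministic
    print(sample_token(logits, 0.8, np.random.default_rng(42)))  # deterministic (fixed seed)
    print(sample_token(logits, 0.8, np.random.default_rng()))    # varies between runs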
I can't think of one example where someone was harmed by an LLM.
Besides "AI" is largely a marketing term, most software has "AI" elements and that has been the case for a while now, this thing has "unintended consequences" written all over it.
I don't think you should base regulations on these particular instances; the defamation one is describing a bug, which happens in every piece of software, and I don't even want to address the first one.
Power grab and preservation. This will be quickly approved to preserve existing players and power status quo (lawyers, managers and doctors don't want to be automated away).
If an AI comes along that is able to do their job better than them then they will not have a say in that. No matter how many regulations the government puts up.
It's like trying to regulate cars to save the horseshoe industry.
And if doctors can be automated away so can software developers. I guess in the long term we are all obsolete.
> It's like trying to regulate cars to save the horseshoe industry.
Which was quite common in the early days of automobiles. There also wasn’t quite the massive bureaucratic regulatory system back then.
The big difference is these AI doctors will need to be certified by, wait for it… current doctors. Or, at the very least, the training data (I haven’t RTFA, so that’s just the example being thrown around in the comments) will need to be certified.
Judging by how the EU countries deal with labor issues I think there’s little chance robots will replace human doctors for a long, long time.
No, it wasn't always like that. The EU has gone into overdrive lately, but that's not because EU citizens asked for it. This is mostly Brussels people regulating popular subjects because... they want to be associated with things that are popular. There is really little else to explain what's happening in the EU in the past 5 years. As a citizen I am concerned by this "tyranny from Brussels", because my opinion never seems to have mattered, nor were EU citizens informed before the fact.
Because there’s no better marketing for a probabilistic (dumb) language model than the idea that it’s so powerful it needs to be regulated.
Just as Steve Jobs used to say when the media was constantly stoking fears about the power of personal computers in the 1980s—“you can just throw it out the window.”
Yet, he simultaneously took advantage of that media hysteria when signing off on what is considered the best commercial of all time: “1984 won’t be like 1984, because Apple”.
Essentially what OpenAI is doing right now in Washington (ie. stoking the fear and also selling the solution).
The more things change, the more they stay the same.
At first glance this looks like official information, but in fact it's a campaign site from https://futureoflife.org and should be clearly marked as such.
The act is dated 21. 4. 2021. More than two years old.
1.) being a part of the team working on this has to be among the most exciting legal jobs in Brussels
2.) I did not have time to read the entire act, not even sure if I'd understand it, but I'd be curious how much of it is still relevant given the leaps in both tech and especially popularity in the last two years.
Incidentally, for the many who claimed on these pages that we would not "have a definition of AI" (actually we have several), well, this legislative text provides one:
software with the ability, for a given set of human-defined objectives, to generate outputs such as content, predictions, recommendations, or decisions which influence the environment with which the system interacts, employing techniques including (a) Machine learning approaches, including supervised, unsupervised and reinforcement learning, using a wide variety of methods including deep learning; (b) Logic- and knowledge-based approaches, including knowledge representation, inductive (logic) programming, knowledge bases, inference and deductive engines, (symbolic) reasoning and expert systems; (c) Statistical approaches, Bayesian estimation, search and optimization methods
“Symbolic reasoning” used to make “decisions which influence the environment with which the system interacts” could describe almost all real world computer systems, could it not?
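It arguably could. As a toy illustration (hypothetical business rules, nothing to do with the Act's actual annexes), even a few lines of ordinary rule-based logic could be argued to fall under clause (b) on a literal reading:

    # A toy "knowledge-based" rule engine: read literally, it produces
    # "decisions which influence the environment with which the system interacts".
    RULES = [
        (lambda order: order["total"] > 1000, "require_manual_review"),
        (lambda order: order["country"] not in {"DE", "FR", "NL"}, "charge_import_duty"),
    ]

    def decide(order):
        return [action for condition, action in RULES if condition(order)]

    print(decide({"total": 1500, "country": "US"}))
    # -> ['require_manual_review', 'charge_import_duty']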
I was trying to understand more about the AIA today after it was mentioned a few times in the oversight committee thing. Found this talk; it's pretty good. I thought it was going to be lame content marketing, but the guest is a real lawyer who seems to have a real understanding of AI and what is going on:
This is a very misleading website. It has an ".eu" domain name but it's nothing to do with the EU, rather it's from the Future of Life Institute.
This is bad because those people (the FLI) have weird political motivations that do not automatically align with the EU and human rights legislation that the new AI regulations try to protect. Whatever interpretation the FLI places on the EU act should be treated with suspicion because of that.
“applications, such as a CV-scanning tool that ranks job applicants, are subject to specific legal requirement”
This is a continuation of EU logic first seen in GDPR around what that law calls “automated decision making”.
All I can say is that GDPR hasn't had a good effect, partly because it's not well written from a technical perspective.
GDPR demands explainable and auditable automation. Non-deterministic AI systems make this difficult or impossible with current tech. So to be “compliant”, vendors dumb down their software to use explainable methods, and often inferior hiring decisions are made because users have to operate on untenable amounts of data using basic sorts. So the Talent Acquisition team ends up structuring the hiring process around “disqualifiers” such as resume gaps, education requirements, pre-interview qualification tests, etc.
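For concreteness, the kind of "explainable" screen described above often amounts to something like this sketch (hypothetical disqualifier rules and field names): every rejection traces back to a named rule, which is auditable but crude:

    # Hypothetical "explainable" screen: hard disqualifiers plus a basic sort.
    # Every rejection can be traced to a named rule: auditable, but crude.
    DISQUALIFIERS = {
        "employment_gap_months": lambda v: v > 12,
        "has_degree":            lambda v: not v,
        "screening_test_score":  lambda v: v < 60,
    }

    def screen(candidate):
        reasons = [name for name, rule in DISQUALIFIERS.items() if rule(candidate[name])]
        return ("rejected", reasons) if reasons else ("shortlisted", [])

    print(screen({"employment_gap_months": 18, "has_degree": True, "screening_test_score": 75}))
    # -> ('rejected', ['employment_gap_months'])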
It reminds me of an old recruiting joke:
“Recruiter: You said you only wanted to interview the 5 best applicants but we are getting so many applicants we don’t know where to start.
Hiring Manager: OK, first, I only hire lucky people. Print out all of the resumes and throw away every other one.”
Interestingly, if this process is done randomly without reviewing the resume, it’s considered legal.
It comes down to how you prove your blackbox is better than a biased but auditable (and fixable) process.
Even after doing a few dozen audits of AI runs and coming up with better results, how can you assume those results will be consistent across thousands of resumes that will be blindly scanned?
Usually stats could be applied, except since it's a black box, can we assume the behavior will be consistent? (A dumb example: if the AI is somewhat influenced by dates, decisions will drastically change as time goes by.)
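That kind of consistency question can at least be probed empirically; here is a sketch where `score_resume` is a hypothetical stand-in for the vendor's black box and the only thing varied is the graduation year:

    import random

    def score_resume(resume):
        # Hypothetical stand-in for the vendor's black-box scoring model.
        return 0.7 + 0.001 * (resume["graduation_year"] - 2010) + random.uniform(-0.02, 0.02)

    base = {"graduation_year": 2015, "years_experience": 6}
    scores = [score_resume({**base, "graduation_year": base["graduation_year"] + shift})
              for shift in range(6)]

    # If shifting the dates alone moves a candidate across the accept threshold,
    # the system's behaviour is not consistent over time.
    print([round(s, 3) for s in scores])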
“Unacceptable risk”, “high risk”, “force for good”. Terms as vague and broad as an interstellar gas cloud. It makes me wonder if this is a strawman argument against regulation.
If you ask me, that tiered risk approach is quite sensible.