Hacker News
Plundervolt: Software-Based Fault Injection Attacks Against Intel SGX [pdf] (plundervolt.com)
108 points by xucheng on Dec 11, 2019 | 63 comments


At this point, I really don't understand how people can still believe that SGX is ever going to work. The threat model is so incredibly hostile that it is basically impossible to create something that isn't vulnerable through some kind of side channel or physical manipulation such as this. When it is compromised at just one site, then the whole security model topples. In the end, you will only be able to "securely" deploy it in environments where you trust the parties to not tinker too much with it. But in that case, why not just go for something a lot simpler when you already rely on trust?


The idea behind SGX is that you don't need to trust your cloud provider.

There are already simpler solutions for local execution.


These kinds of brownout/undervolt attacks have been used in console cracking for decades.

Surely someone involved would have known about that. I wonder what chain of events led to the creation of a secure enclave with such well understood flaws.


> Surely someone involved would have known about that. I wonder what chain of events led to the creation of a secure enclave with such well understood flaws.

Don't you think that most engineers would just laugh the idea off, and do just enough to make their PMs happy?

Nebulous concepts like a "secure enclave" only exist in the imagination of MBA types. Even the name "enclave" sounds like it was picked by a seasoned talkshop person.


Naw. It's just the same age-old issue of someone writing the design specs under the illusion that they will be followed, instead of designing things so that they can't not be followed.


While the premise is similar, this is much different from the decades old brownout attacks. The researchers are causing the undervolt with software, not by manipulating the supply lines to the chip.


One of the major elements of a TEE is that I can send code to an untrusted party and expect it to be executed faithfully. Under that model I have to assume that they have physical access to their own hardware and could do hardware based attacks.

Which means the model is just as broken even without it being a software attack.


Yes, but not really? Intel just made it easier than old-school hardware attacks:

> adversary abuses an undocumented Intel Core voltage scaling interface

Instead of having to buy hardware to mess with the power rail in a controlled manner, you can now just do it with a code snippet!
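For illustration, the undocumented interface in question is MSR 0x150, the "overclocking mailbox" the paper describes. Below is a sketch of how a voltage-offset command could be encoded for it, based on the bit layout from the published Plundervolt PoC; treat the exact field positions as an assumption here, and note that actually issuing the write requires root and real hardware:

```python
def encode_undervolt(plane: int, offset_mv: float) -> int:
    """Encode a voltage-offset command for the 0x150 mailbox MSR.

    Field layout (as described in the public Plundervolt PoC; assumed here):
      bit 63       = 1 (start transaction)
      bits 42..40  = voltage plane (0 = CPU core, 2 = cache, ...)
      bits 36..32  = 0x11 (write-offset command)
      bits 31..21  = signed voltage offset in units of 1/1024 V
    """
    units = round(offset_mv * 1.024)          # mV -> 1/1024 V steps
    offset_bits = (units << 21) & 0xFFE00000  # 11-bit two's-complement field
    return (1 << 63) | (plane << 40) | (0x11 << 32) | offset_bits

# Undervolting the core plane by ~100 mV would then be a single MSR write
# (e.g. via /dev/cpu/0/msr on Linux -- NOT performed here):
value = encode_undervolt(plane=0, offset_mv=-100)
print(hex(value))  # -> 0x80000011f3400000
```

No soldering, no probes: one privileged register write per glitch attempt.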


And you can do it remotely. But they certainly made it easier, because this interface is (presumably) behind the brownout detectors and other circuitry usually attached to the supply lines to combat this sort of thing.


I understand how screwing with the voltage can cause execution errors. If a random flop somewhere in the design flips state (or fails to when it should), the resulting behavior could be anything, including hanging the chip.

How does such a random event get turned into an exploit?


There are a bunch of different methods, but the old-school approach was mainly to cause a control point to make an incorrect decision.
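A toy model of that idea (purely illustrative, not real firmware): if a glitch flips the one bit a control point tests, a "reject" decision becomes an "allow" decision, even though the rest of the logic ran correctly:

```python
def check_firmware(sig_ok: int) -> str:
    # Secure-boot-style control point: 0 = reject the image, 1 = boot it.
    return "boot" if sig_ok == 1 else "reject"

# Normal operation: an unsigned image is rejected.
sig_ok = 0
print(check_firmware(sig_ok))      # -> reject

# A well-timed glitch flips the flop holding the verification result:
glitched = sig_ok ^ 1              # model a single bit-flip fault
print(check_firmware(glitched))    # -> boot
```

The attacker doesn't need a controllable fault everywhere, just one fault at the right moment in the right place.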


Don't you think Intel's "fix" of disabling DVFS amounts to a post-release product downgrade?


All their security fixes are. My 2 year old laptop feels noticeably slower in day-to-day usage now. I haven't tested it, but I would bet a small sum of money that it would go back to its original 2-year-ago performance with mitigations=off specified to Linux.

If I can't undervolt anymore, then my laptop will be unable to run the CPU at 100% without thermal throttling. Not Intel's fault that Dell used a shitty cooling system and a too-high default voltage, but it's yet another penalty that I have to take because Intel failed.


Intel is playing a dangerous game if you ask me. AMD lost a class action lawsuit because someone was disappointed in the performance of an AMD chip, even though all the performance metrics were available for them to research[1]. Intel is still selling hyper-threaded processors. You cannot buy an Intel laptop that isn't pre-configured to be vulnerable to Zombieload[2], unless you find one of the 4 obscure Intel chips, like the Core i5-8500B, in a laptop.

[1] https://www.theverge.com/circuitbreaker/2019/8/28/20837336/a... [2] https://zombieloadattack.com/


I doubt they're disabling DVFS completely.


Most comments simply missed the point.

Power analysis attacks and power glitch attacks are well known in cryptography and electronics. The classic technique is to monitor the Vcc voltage and current on an oscilloscope and try to deduce the chip's internal operation and extract the secret, or to inject a glitch on the Vcc rail to induce a fault. A classic side-channel attack, but these attacks required complete physical control over the hardware: you needed a dozen probes on the motherboard in a lab with an elaborate setup. So it was typically not a concern unless a crypto key or something similar was at stake (and if so, the work would be done on a separate security chip with internal physical defenses).

But this attack showed that, since every CPU and SoC now has built-in dynamic voltage scaling and power management, you can use those features to make the CPU launch a power glitch attack against itself. You don't even need to touch a single trace on the PCB; the attack can be launched remotely, and all you need is root access!

This is frightening. Who knows what is going to be next.


Terrible for those of us who use undervolting to keep our laptops as cool as possible. I hope they leave it as an option in the BIOS settings.


"After carefully reviewing the CPU voltage setting modification, Intel is mitigating the issue in two parts, a BIOS patch to disable the overclocking mailbox interface configuration. Secondly, a microcode update will be released that reflects the mailbox enablement status as part of SGX TCB [Trusted Computing Base] attestation. The Intel Attestation Service (IAS) and the Platform Certificate Retrieval Service will be updated with new keys in due course. The IAS users will receive a ‘CONFIGURATION NEEDED’ message from platforms that do not disable the overclocking mailbox interface.”

Seems like you should be able to just avoid applying the bios patch. That said, you won't be able to keep your BIOS up to date.


But then again, the researchers state: "In the paper, we show that Plundervolt may affect SGX's attestation functionality" so they are apparently able to fake at least part of the attestation.

Also it will be interesting to see whether directly talking to the voltage regulation controller on the mobo will circumvent the microcode protection.


AFAICT, they were able to randomly flip bits in the enclave. Which isn't great, but doesn't yet break remote attestation as commonly used for DRM schemes, which involves online communication with both Intel and the enclave software provider. They can simply reject attestations if they detect too many authentication attempts from the same CPU, long before you manage to flip the correct bit.

IIRC, reading memory out of the enclave has already been accomplished with other side-channel attacks, so that aspect has been broken. What would be novel is if you could read the per-CPU secret key[1], but that doesn't appear to be what they've done.

[1] Which only Intel knows, which is why remote attestation requires a service contract with Intel for online verification of attestation responses.


> whether directly talking to the voltage regulation controller on the mobo

Is it even possible to do that without physical access?

Can you confirm that? Do modern motherboards expose an interface that allows direct control over voltage regulation from software, without physical access?! If so, that's terrifying...


I am talking about working around attestation for someone who operates the computer and wants to subvert SGX enclaves. Whether you need physical access is thus inconsequential.


Wait, I won't be able to undervolt my CPU? What about on Linux? Intel decided that disabling functionality is a solution? Time for another PS3-type lawsuit.


FWIW, SGX is in Skylake and later. Time for AMD to shine I guess


The paper mentions TrustZone has (had?) similar issues:

> In 2017, Tang et al. [65] discovered a software-based fault attack, dubbed CLK screw. They discovered that ARM processors allow configuration of the dynamic frequency scaling feature, i.e., overclocking, by system software. Tang et al. show that overclocking features may be abused to jeopardize the integrity of computations for privileged adversaries in a Trusted Execution Environment (TEE). Based on this observation, they were able to attack cryptographic code running in TrustZone. They used their attack to extract cryptographic keys from a custom AES software implementation and to overcome RSA signature checks and subsequently execute their own program in the TrustZone of the System-on-Chip (SoC) on a Nexus 6 device.

it further notes though:

> However, their attack is specific to TrustZone on a certain ARM SoC and not directly applicable to SGX on Intel processors.

(ARM TrustZone is what AMD has in their chips)


> (ARM TrustZone is what AMD has in their chips)

Well, I'll be damned: "The PSP itself is an ARM core inserted on the main CPU.[6]" (https://en.wikipedia.org/wiki/AMD_Platform_Security_Processo...)

I'd imagine that the ARM core on the AMD CPU doesn't support changing the clock rate or anything much though.


Even if it does, you cannot execute unattested code on it, and the voltage is pegged to the CPU IO voltage. Undervolting that risks a big fat immediate system crash, unlike undervolting a core.


I did a quick search of the paper and found neither "google" nor "titan" (the secure chip made by Google). Would it be vulnerable too?


Intel just keeps getting kicked in the nuts, huh

There is no way to get the most performance out of your Intel chip without undervolting - especially on mobile, they run really hot under constant load and often throttle. Manufacturers using barely capable heatsinks doesn't help.

Time to switch to AMD when the new Zen mobile chips arrive.


Honestly, I was already considering returning to AMD for my next system. This is the final nudge for switching back.

Thanks Intel, you really tried and succeeded.


There is no such thing as "trusted" computing.

The very same DVFS is also possible to exploit for side-channel attacks. Say one branch makes the processor kick into a higher gear; over millions of runs, you can reliably deduce the branch result for an operation behind the MCU barrier.


I wonder, with formal verification being a thing, could you formally verify that a chip would be resistant to all types of power attacks according to the current laws of physics?

Such a proof can never be final, because the laws of physics aren't either. But just because it's not perfect doesn't mean it wouldn't be good.


An Intel CPU is a pretty large design. I'm not sure it's feasible to run all formal verification benchmarks on a full RTL design of the entire chip, including caches, memory controllers, etc.

Not to mention, these sorts of fault injection attacks are way past issues in high-level RTL, and instead live in the post-synthesis, analog/metastable domain specific to a given target process. Simulating those effects sounds prohibitively expensive, and I'm not sure there is any existing formal verification suite that can even do that.

Unless someone modeled at a higher-level that modifying an MSR can cause undervolting, and then this undervolting can in turn cause bitflips to occur, this wouldn't have been caught. And if they did model it, they would have probably thought of this attack anyway :).


> An Intel CPU is a pretty large design. I'm not sure it's feasible to run all formal verification benchmarks on a full RTL design of the entire chip, including caches, memory controllers, etc.

Agreed. It's still possible to verify smaller components of the design, though.

> [...] These sort of fault injection attacks are way past issues in high-level RTL. [...] Simulating those effects sounds prohibitively expensive, and I'm not sure there is any existing formal verification suite that can even do that.

It's possible to have a "high-level RTL" design that inherently resists some types of fault injection attacks. TMR is a trivial example: https://en.wikipedia.org/wiki/Triple_modular_redundancy
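A minimal sketch of the TMR idea: triplicate the computation and majority-vote the outputs, so that a single injected fault in any one replica is outvoted by the other two:

```python
def tmr(compute, fault_in=None):
    """Run `compute` three times and majority-vote the results.

    `fault_in` optionally corrupts one replica (by index) to model a
    single injected bit-flip fault.
    """
    results = [compute(), compute(), compute()]
    if fault_in is not None:
        results[fault_in] ^= 1       # single bit-flip fault in one replica
    a, b, c = results
    return (a & b) | (a & c) | (b & c)  # bitwise majority vote

sig_valid = lambda: 1  # stands in for e.g. a signature-valid flag

print(tmr(sig_valid))              # fault-free run
print(tmr(sig_valid, fault_in=0))  # one faulted replica is outvoted
```

Because "one fault flips at most one replica" is a checkable property of the RTL itself, a design like this can be reasoned about formally without modeling any analog physics.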

In fact, there is a wide body of literature studying the application of formal verification to side-channel and fault-injection analysis. Some systems can even synthesize a fault-injection resistant design. Unfortunately it is not realistic to be resistant to "all types of [fault injection] attacks according to the current laws of physics". We can, however, make different models of fault attacks and then prove (or synthesize) that some design is resistant to attacks in that model.

If you're interested, look for publications in CHES, TCAD, FMCAD, DAC, DATE, etc. with keywords like "DFA", "DFIA", "SAT", "fault injection", etc.


I think formal methods are all using some kind of discrete logic. Implementing physics with these things sounds really, really, hard. I guess designing your model would require feedback from layout and fabrication techniques, and it would suffer this same maintenance burden. Once you have the simulation, it would take ages of computation power (much more than functional RTL simulations do).

I think this is the kind of thing that might happen in 2050 but for now it just sounds infeasible.


Probably not, because you can inject voltage in an effectively infinite number of physical locations. (For example, nondestructively, with RF.)


Undervolting glitches are a fundamental property of synchronous logic and can't be fixed. Asynchronous logic, OTOH...


Important caveats, for the lazy:

- SGX is disabled by default, it has to be enabled for this exploit to be relevant

- POC requires privileged execution, at which point you can safely assume all is already lost

Anyone who has spent time around digital logic circuits will know that messing with voltages will cause errors. If the power lines are too low some transistors will not be able to switch their load. Or too high and you will cause parasitic losses or capacitance in unexpected places. This is actually a really nice attack to show off to people with an interest in computer/electrical engineering because it demonstrates how a basic design constraint can cascade in unexpected ways.


> SGX is disabled by default, it has to be enabled for this exploit to be relevant

This is an attack on SGX. If you are not using it, it is irrelevant regardless of whether it is enabled.

> POC requires privileged execution, at which point you can safely assume all is already lost

For SGX, this is different. The threat model behind SGX is that anything outside SGX (including OS, BIOS, motherboard, etc.) is untrusted. The whole motivation behind SGX is to create a trusted environment in an untrusted host.


Those aren't caveats at all.

First, this is explicitly a vulnerability compromising SGX itself, so saying SGX has to be enabled is tautological. It's not a vulnerability attacking Intel processors generally, and it's not being billed that way either.

Second, securing memory against untrusted privileged execution is a defining characteristic of SGX, so it's likewise unsurprising that it would be required. That is quite literally the intended purpose of SGX.

The design thesis of SGX is to prevent "all hope is lost" from being true in the context of privileged execution.


Isn't Netflix using SGX for DRM?


Not on the client side, at least. My desktop doesn't have SGX capabilities, yet I watch Netflix.


They use it on the client side for higher resolutions; you will not be able to watch 4K content on a laptop/PC without Intel SGX.


Actually my plan is not UHD since my displays don't go up to 11. Do they warn me if they block UHD streaming due to hardware deficiencies (i.e. SGX not available)?


The option to choose 4K quality will not be available on that kind of platform (Widevine L3).


Does anyone have a ELI5 explanation?


SGX is Intel's TEE (Trusted Execution Environment) platform, where an isolated memory address space (aka enclave) can host applications with certain guarantees:

  * No outside code can gain access to any part of it - not even the host OS
  * Applications run precisely as written (binaries need to be cryptographically signed)

You could generate cryptographic keys inside the enclave and ensure that the private key is never seen in cleartext outside it.

Let's say you're building a SaaS product to host sensitive data and expose an API. Using SGX, the remote endpoint can be ensured of the above.

Root keys are signed by Intel (and recently delegated to cloud providers) - so there is some element of trust there.

This attack uses power management interfaces to the hardware to induce faults in computations in the enclave, leading to data leaks. In the example above, the machine operator (or someone who gained privileged access on their machine) could abuse this to get access to sensitive data that's supposedly only inside the enclave.

In any case and even in the absence of these threat vectors, SGX and similar solutions should be used as a layered security approach, not something to rely solely upon.


Yeah, this is exactly correct. SGX is not a silver bullet, at all. It's a part of a security story.



Intel SGX is a secure enclave embedded in Intel CPUs, which aims to provide a secure and trusted computation environment, for example for storing and processing cryptographic secret keys. It is similar to the enclave in Apple A-series chips used for Touch ID/Face ID. Currently there is no wide adoption of SGX in production environments, mainly due to attacks like this.

In this attack, the researchers found that by adjusting the voltage and clock frequency of the CPU, they are able to generate errors inside the SGX enclave and recover the secret keys inside.


IME lack of adoption is more due to lack of tooling and integration with popular languages and libraries.

Also, the domains that have the most to benefit from SGX (heavily regulated ones like healthcare) tend to be very slow adopters of new technology.

I guess there is definitely some level of concern about side-channel attacks, given Intel's track record, but I don't think that's what's been holding back adoption.


It also requires a very expensive license from intel, and to allow them partial control of your code via codesigning.


No it doesn't. Getting a whitelisted code signing key just requires you to agree that you won't distribute malware. You pay nothing for it and Intel don't see the code you sign. Please don't make things up because they "sound right".


Are there non-Intel remote attestation servers now?


Only Intel know what chips they've manufactured and what microcode patch levels are currently considered secure, so that wouldn't make much conceptual sense. But the new DCAP feature lets you run some of the RA infrastructure yourself, yes.


How could healthcare benefit from SGX?


Strengthened protection for patient data.


> Currently, there is no wide adoption of SGX in the production environments, mainly due to the attacks like this.

How long have these attacks been around? I only started hearing about them maybe in the past week or so. That hardly seems enough to explain why these aren't used in production.


There are other attacks (https://arxiv.org/pdf/1802.09085.pdf), but I doubt they’re the major driving force behind lagging adoption.


Intel SGX is an enclave, a hardware-supported mechanism to store keys and do protected operations (most often implemented as essentially a co-processor). For example, if you want to run some AES operation, you would send it off into the enclave.

More on faults-- A "fault attack" (I like to call them active attacks) is an attack where someone actively tampers with a device to "force" it into an "unreachable" state. An example of an "unreachable" state might be, for a single cycle, cutting off the voltage to a flip-flop to prevent a bit from being written. Contrived example, but enough to get the idea of "fault" in your head hopefully.

Intel SGX is heavily advertised as being fault-resistant (although in recent years, time and time again it has shown to not be the case!). This is especially crucial because some crypto algorithms (e.g. RSA-CRT and AES in general) heavily rely on zero/minimal faults due to the nature of how these algorithms are constructed. RSA-CRT, for example, only needs a single fault in its algorithm for the private key to be compromised.
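The RSA-CRT case is the classic Bellcore attack: one faulty half of a CRT signature lets you factor N with a single gcd. A sketch with toy textbook-RSA parameters (real attacks work identically on full-size keys; the tiny numbers are only for readability):

```python
from math import gcd

# Toy RSA key (illustrative sizes only).
p, q = 61, 53
n = p * q                          # 3233
e = 17
d = pow(e, -1, (p - 1) * (q - 1))  # 2753

m = 65                             # message to sign

def sign_crt(m, fault=False):
    """RSA-CRT signing: compute the signature mod p and mod q separately."""
    sp = pow(m, d % (p - 1), p)
    sq = pow(m, d % (q - 1), q)
    if fault:
        sq = (sq + 1) % q          # model a single glitch in the mod-q half
    h = (pow(q, -1, p) * (sp - sq)) % p   # Garner/CRT recombination
    return sq + q * h

good = sign_crt(m)
print(pow(good, e, n) == m)        # correct signature verifies

bad = sign_crt(m, fault=True)
# Bellcore: the faulty signature is still correct mod p but wrong mod q,
# so bad^e - m is divisible by p and not by q; gcd recovers the factor.
print(gcd(pow(bad, e, n) - m, n))  # -> 61, the secret prime p
```

That single-fault fragility is why fault resistance is not a nice-to-have but a hard requirement for any enclave that signs things.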

CPU design today will also typically employ frequency or voltage scaling based on workload as a power saving metric-- say, if the CPU is idle. They will also make this ability software-visible, as it's quite handy for general uses as a whole.

The researchers here exploit this ability to "cut the voltage" from software in the CPU to directly affect SGX as well, resulting in faults in the underlying operations.

While fault attacks (especially timing ones) on SGX are nothing new, the shocker frankly is how they did the fault attack. Typically, in my experience, whenever someone says "power fault attack" you might envision someone with physical access to the chip "glitching" the clock or VCC to bypass some verification step, maybe in a lab or using a ChipWhisperer. This is obviously not the case here-- this is the first case of software power glitching I've seen (note that software clock glitching was done a few years back, but not on Intel machines I believe).

I feel the need to point out that this isn't a fault (pun intended) of Intel's design. OK, maybe it is a little bit, since this does fall under the SGX-type threat model, but I feel like this is more an industry-wide issue for any system with an actual hardware enclave (looking at you, Apple), so Intel is hardly alone in the blame, in the same way it wasn't for speculative execution (and still isn't, in my opinion!).


It's not particularly novel, to be honest. Someone already did a very similar attack a few years ago on ARM TrustZone (see the paper "CLKSCREW: Exposing the Perils of Security-Oblivious Energy Management"). It was pretty well known that this type of attack was likely possible on SGX too.


When I say 'not novel', I mean it's not surprising that such an attack is possible. The paper itself is good work; the devil is in the details.


> note that software clock glitching was done a few years back



