> There is a transition period (the rightmost of the two shaded regions, in orange) of ~11 μs where the CPU is halted: no samples occur during this period. For fun, I’ll call this a frequency transition.
It stops executing for 35 thousand cycles. I call that a "lockup" "as it shifts".
This isn't the real problem, though. The problem with the existing AVX-512 implementations is that the "relaxation time" causes subsequent scalar code to be slow.
From later in the same post:
> Here, we have the worst case scenario of transitions packed as closely as possible, but we lose only ~20 μs (for 2 transitions) out of 760 μs, less than a 3% impact. The impact of running at the lower frequency is much higher: 2.8 vs 3.2 GHz: a 12.5% impact in the case that the lowered frequency was not useful (i.e., because the wide SIMD payload represents a vanishingly small part of the total work).
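To make the arithmetic in that quote concrete, here is a rough sketch in Python. The function names are made up for illustration, and the 3.2 GHz clock used to convert the ~11 μs halt into cycles is an assumption taken from the frequencies in the quote, not a measured value.

```python
def halt_overhead(halt_us: float, window_us: float) -> float:
    """Fraction of a time window lost to the CPU being halted."""
    return halt_us / window_us


def frequency_penalty(nominal_ghz: float, reduced_ghz: float) -> float:
    """Fractional throughput loss from running at the reduced clock,
    assuming work scales linearly with frequency."""
    return (nominal_ghz - reduced_ghz) / nominal_ghz


def halted_cycles(halt_us: float, freq_ghz: float) -> float:
    """Cycles that would have elapsed during the halt at the given clock.
    1 GHz is 1000 cycles per microsecond."""
    return halt_us * freq_ghz * 1e3


if __name__ == "__main__":
    # Two transitions (~20 us total) inside a 760 us window.
    print(f"halt overhead:     {halt_overhead(20.0, 760.0):.1%}")
    # 2.8 GHz instead of 3.2 GHz for work that gains nothing from wide SIMD.
    print(f"frequency penalty: {frequency_penalty(3.2, 2.8):.1%}")
    # One ~11 us halt, assuming a 3.2 GHz clock (assumption).
    print(f"halted cycles:     {halted_cycles(11.0, 3.2):,.0f}")
```

Under these assumptions the halt itself costs under 3% of the window while the lower clock costs 12.5%, which is the comparison the quote is making. The same conversion gives roughly the 35 thousand halted cycles mentioned earlier in the thread.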
Interestingly enough, this is another feature that is supposed to have been improved on server Icelake. The frequency transition halt time is now pretty much negligible. The "core frequency transition block time" goes from ~12 us on CLX (similar to the number quoted above) to ~0 us on ICX.
Either way, it's a big problem that certain single instructions can cause this transition. When the transition is based on a usage threshold of heavy instructions, it's not so bad. And with this revision, more of the transitions are based on threshold. But there are still some instructions that cause an immediate frequency change, if I'm reading the articles right.
No...the whole point is that the halt induced by a single instruction for downclocking isn't a real issue. Even in the pathological case where you insert single instructions spaced 760 us apart in order to induce the maximum number of clock shifts, the total performance degradation due to the clock halts is only 3% (the frequency drop induced by the use of these instructions has a much larger impact). Furthermore, on Icelake-SP, the halted time due to frequency transitions is supposed to go to 0, which makes this aspect of the problem entirely irrelevant.
Yes, if you insert a single 512-bit FMA that runs every so often in your code you will get a 15% performance hit from the lower frequency, but that's much less likely than the old case, where people trying to use AVX-512 for memcpy and the like would slow down scalar code.
But they fixed the old case, by having a minimum number of heavy instructions before changing clocks. If you have some instructions there just for the occasional memcpy, it will be a little slow during the memcpy but it won't downclock and the overall impact will be very small.
Now that the older and bigger case is fixed, this case remains the last sticking point. Because you still can't trust the CPU to do the right thing when there are a small number of heavy instructions. Even if they cut the halting time to 0, it's still bad for a single instruction to cause a prolonged downclock.
It is definitely a problem of defective hardware. But there are plenty of people out there with both Intel and Ryzen systems who disable every C-State and every Turbo setting because any change in power levels causes their system to crash. Lockup, bluescreen, kernel panic or just computation errors.
Some of that is from overclocking. Some is from old and/or defective power supplies. Some is from motherboard VRMs. And some, like the original Ryzen 1700X, is from bad SOC-internal power management.
At any rate, I have read forum posts reporting system failures caused by AVX. Either overcurrent or clockrate changes crashing it.
So it seems to have support. I wouldn't call that "misinformation."
> But there are plenty of people out there with both Intel and Ryzen systems who disable every C-State and every Turbo setting because any change in power levels causes their system to crash.
There are plenty of people with both Intel and Ryzen systems that are straight-up broken and don't power on at all.
Defects happen. Improperly designed systems happen. Misconfigurations happen. While those situations are unfortunate for the small percentage of people experiencing them, they shouldn't be used to judge the platform's capabilities as a whole.
> So it seems to have support. I wouldn't call that "misinformation."
Using anonymous, largely unverifiable anecdotes posted on web forums as evidence for a population-wide problem is a textbook case of selection bias.
And if you're OK with that, let me throw in my own anecdote.
All of my systems, which include:
* Skylake, Coffee Lake, Haswell/Broadwell, Sandy Bridge/Ivy Bridge, and Westmere/Nehalem Intel processors,
* Zen 2, K10, and K8 AMD processors,
...have been able to reliably execute supported vector instructions (SSE, AVX, etc.) for extended periods of time without any problems.
Google still owns Android - an OS the majority of the world uses, and is required to ship with Chrome (among other programs) pre-installed if the OEM also wants access to the Google Play Store.
Maybe there is a better word than “Nuke” to describe this? Perhaps a word not so strongly associated with death and destruction, unless you truly mean the physical locations of these orgs should be hit with atomic weapons?
I think a starting point for discussion is to get past equating antitrust issues with monopoly status. In the previous century, these things went together coincidentally, but the scale of our economy has grown to a point where it is no longer necessary for the largest companies to have monopoly power in their market to engage in damaging anti-competitive practices. We now see cartel-like cooperation between these companies in place of competition, such as the Apple-Google wage fixing that was revealed years ago. We also see their complete insulation from the consumer feedback necessary for healthy capitalist markets to function: their economies of scale make individual consumer choices insignificant signals lost in the noise.
Amazon should be forced to spin off AWS and release Twitch. Facebook should be forced to release Instagram and WhatsApp. Alphabet should be forced to release YouTube, Waymo, etc. That's about the right scale of antitrust required here.