Every time new extensions get added to x86 new patents and copyrights are issued...

jacquesm · 2025-10-16T18:51:07 1760640667

Or you could just limit your compiler to the subset that worked a while ago.

trenchpilgrim · 2025-10-16T18:52:43 1760640763

Sure, but that limits what code you can use. A lot of consumer software won't work without the SSE extensions, for example.

speed_spread · 2025-10-16T19:58:16 1760644696

Legally, could a CPU manufacturer implement the unencumbered ISA in hardware and have a separate corporate entity provide a low-level software compatibility trap for the missing instructions? The CPU could even have functional equivalent (but ISA-incompatible) instructions to make it almost as fast. Kind of like third-party microcode?

dogma1138 · 2025-10-16T20:20:58 1760646058

In theory yes, the problem is that even x86 emulation in hardware in order to run x86 code natively without recompiling can drag you into a legal mess which any western company will avoid.

NVIDIA got pinched for this over a decade ago.

I’m not entirely sure how Qualcomm and Apple didn’t.

But overall the more you try to make an x86 enabled alternative viable the more likely you’ll get served with papers and even if you’ll win it would take a decade and cost 100’s of millions to fight.

Qem · 2025-10-16T19:12:05 1760641925

SSE2 was released circa 2000[1]. Assuming a patent lasts for 20 years, it should be expired for several years now.

[1] https://en.wikipedia.org/wiki/SSE2

trenchpilgrim · 2025-10-16T19:29:43 1760642983

There are further versions of SSE (SSE4 is pretty much a hard requirement on Windows) and a follow-on series, AVX. AVX-512 is from 2016 and AVX10 is from 2023.

Qem · 2025-10-16T19:40:05 1760643605

That makes me wonder if all those vector extensions pilling on top of each other were really that necessary, or if they are mostly a means of keep churning out patents to delay expiration.

Is it possible to just improve the original SSE extensions in a logical backward compatible way? Similar to what AMD did to x86, widening it to x86-64, dooming Intel efforts to push the incompatible Itanium architecture?

dzaima · 2025-10-16T20:07:51 1760645271

SSE3+ & AVX{,2,-512} & co improve on SSE in pretty much the same way that x86-64 improves on x86 - the old thing still works just fine, but the new one is wider, adds new (very useful!) instructions, doesn't copy over others, and (at least partly) uses different encodings.

And an important thing to remember is that there is and never as a single "x86" before x86-64; both Intel and AMD added new instructions as was seen useful in new generations. AVX & co just continue the pattern that's been going on for four decades.

trenchpilgrim · 2025-10-16T19:52:50 1760644370

No, the newer extensions are different opcodes. It's like extending an API, you can't change old function signatures, you have to add new ones. The new ones are legitimately useful, most video games and media production software use them a lot.

bluGill · 2025-10-16T20:58:38 1760648318

Predicting what you will want in a few years is tricky at best. Some things that seem like a great idea are not worth it in the real world and so you pay the price for flexibility nobody uses. Some use case you didn't think of comes along that could really be helped with some tweak you didn't anticipate. thus your flexible architecture is both too flexible and not enough at the same time.

the above is a constant problem in engineering projects more than about 6 months old.

crote · 2025-10-16T20:15:03 1760645703

RISC-V uses a length-agnostic approach, so that would've at least bypassed the need for width-expansion upgrades. But it's something you have to take into account from the very start...

dzaima · 2025-10-16T21:08:25 1760648905

And even that only helps with the length problem, and doesn't help with doing new operations.

For SIMD, baseline x86-64 (i.e. SSE + SSE2) didn't have dynamic shuffles & shifts & blend, float floor/ceil, integer conversions & min/max & 64-bit comparisons & 32-bit mul, just to name things useful for even very boring SIMD; then in AVX2 we also get gather/masked load/store, FMA, and in AVX-512 we get a bunch of neat mask stuff, integer narrowing & rotates, compress.

(much of those things RVV has in its base extension, but RISC-V already has a good number of extensions on top of base RVV for things like like float16/bfloat16, expanded bitwise stuff (Zvbb - rotates/popcount/lzcnt/widening shift), clmul, and a bunch of crypto things; and presumably in a decade there'll be a bunch more things that people will want in their CPUs that'll have no choice but to be new extensions)

astrange · 2025-10-16T22:13:57 1760652837

That's not necessarily a good idea. Small vector uses rely on having nearly no overhead to be faster, so they can't use a generic system.

dzaima · 2025-10-16T23:09:56 1760656196

If the scalable VLEN is the same as the ALU width, which should generally be the target, small vectors would still perform optimally.

Of course if you need less than a VLEN-sized vector you're wasting throughput, but that applies just as much when using 128-bit vectors on AVX-capable hardware, and even worse so on AVX-512-capable (which, while double-pumped or equivalent to some extent on most impls, still has 512-bit-exclusive throughput on most).

jacquesm · 2025-10-16T18:53:36 1760640816

You'd expect some kind of fall-back in place for older CPUs, no?

gary_0 · 2025-10-16T19:06:44 1760641604

Some of SSE is required as part of the x86_64 ABI, and also new versions of Windows (infamously, now) add required CPU extensions so software will often base its requirements on that. And SSE4x is ubiquitous enough (99% of PCs) that some software/games will just require it and simply crash if it can't use those instructions.

wmf · 2025-10-16T19:56:25 1760644585

It looks like many Linux distros require x86-64-v2 from 2008 and they're preparing to move to v3 from 2013. At this rate they'll never support a level with expired patents. https://en.wikipedia.org/wiki/X86-64#Microarchitecture_level...

crote · 2025-10-16T20:20:13 1760646013

Considering there are no meaningful patent-free x86 CPUs in the wild, why should they?

It's just the default optimization level for those distros. If patent-free x86 CPUs become relevant, compiling another set of binaries would be trivial. Until then it doesn't make any sense to kneecap the >99% of x86 deployments by deliberately refusing to use faster and more efficient instructions.

jacquesm · 2025-10-16T20:54:54 1760648094

> Considering there are no meaningful patent-free x86 CPUs in the wild, why should they?

Open core; no ME.

wmf · 2025-10-16T21:00:19 1760648419

That's a fair point. I guess a bigger problem is that a patent-free x86 processor couldn't run any supported version of Windows.

jacquesm · 2025-10-16T21:24:01 1760649841

That's the last thing I would want to do.

trenchpilgrim · 2025-10-16T18:55:55 1760640955

No, often any fallback would be unusuably slow anyway.

monocasa · 2025-10-16T22:04:16 1760652256

Then you lose cmpxchg16b, which is pretty much required for all x86-64 binaries shipping today.

mrpippy · 2025-10-16T20:21:33 1760646093

Do those only apply to hardware implementations? Apple and Microsoft are both shipping x86_64 emulators that support SSE/AVX/AVX2

trenchpilgrim · 2025-10-16T20:27:11 1760646431

They both probably have licenses; Intel stated in 2017 they intended to require licenses for emulators: https://www.forbes.com/sites/tiriasresearch/2017/06/16/intel...

Presumably Apple and Mocrosoft both have counter-leverage of requiring app developers to ship native binaries at some point in the future.

nomel · 2025-10-16T21:54:01 1760651641

Wait, you can patent an operation? Is it not considered an API? I assumed the Java case would meant you couldn't. I would think it would be limited to the hardware implementation, or maybe some specifics of the alg.

trenchpilgrim · 2025-10-17T01:19:23 1760663963

Are you willing to fight Intel's lawyers about it, or are you gonna quietly pay them a fee and move on?

anthk · 2025-10-17T08:17:48 1760689068

Then how does QEMU handle that? Or Bochs?

monocasa · 2025-10-16T22:05:47 1760652347

They said that, but my understanding was that they were really trying to scare apple back on to x86-64. It didn't work, and it was pretty specious anyway.

throwaway81523 · 2025-10-16T19:55:43 1760644543

What happened with Transmeta?

wmf · 2025-10-16T19:59:46 1760644786

I think they were bought by Nvidia and the Denver/Carmel ARM cores were based on Transmeta tech.

monocasa · 2025-10-16T22:06:23 1760652383

And particularly Nvidia had intended to make an x86 core, but the licensing fell through.