One thing I don't understand about Nvidia’s valuation: right now a small number of algorithms, such as Transformers, have 'won,' and the data matters more than the code. In the past, customized code was much more common (modeling code, HPC), so the ecosystem mattered enormously and it was nearly impossible to reimplement all of CUDA and the libraries built around it.
Competitors now only need to optimize for a narrow set of algorithms. If a vendor can run vLLM and Transformers efficiently, a massive market becomes available. Consequently, companies like AMD or Huawei should be able to catch up easily. What, then, is Nvidia’s moat? Is InfiniBand enough?
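To make the "narrow set of algorithms" point concrete, here's a minimal sketch (the model name and device checks are just illustrative assumptions): the same Hugging Face Transformers inference code runs unchanged whether the backend is Nvidia CUDA, AMD's ROCm build of PyTorch (which also reports through torch.cuda), or Apple's MPS.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Pick whatever accelerator the local PyTorch build exposes.
    if torch.cuda.is_available():            # true for both CUDA and ROCm builds
        device = "cuda"
    elif torch.backends.mps.is_available():  # Apple M-series
        device = "mps"
    else:
        device = "cpu"

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

    prompt = tok("Nvidia's moat is", return_tensors="pt").to(device)
    out = model.generate(**prompt, max_new_tokens=20)
    print(tok.decode(out[0], skip_special_tokens=True))

None of this code mentions a vendor except through the device string, which is exactly why the target surface for a competitor is so much narrower than it used to be.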
You are right to question their moat. My view is that there's a lot of pressure from essentially all the other trillion-dollar companies (Microsoft, Google, Amazon, Apple, etc.) not to get locked into an Nvidia-only ecosystem. Each of them builds its own chips. They also use Nvidia, but not exclusively. An Android or iOS phone has no Nvidia silicon whatsoever, and neither do most laptops. Apple's M-series chips don't support CUDA at all. And with the exception of some gaming or workstation-class machines, most Windows/Linux laptops ship with either AMD or Intel GPUs, or lately Qualcomm ARM-based SoCs with custom GPUs.
Nvidia's valuation and moat are centered on data-center-class GPUs used for training. I don't think they'll effectively have that space to themselves for much longer. Google is already using its own TPUs at scale for both training and inference; they still use some Nvidia hardware, but they seem able to keep it off the critical path for anything that needs to run at "Google scale." OpenAI just ordered a bunch of AMD hardware. And a lot of AI engineers use Apple laptops that run on M-series hardware.
In short, the CUDA moat is shrinking. It's still relevant, of course, and a lot of tooling and frameworks depend on it; that's why everybody still uses it. But not exclusively. And there's a lot of extremely well-funded, active development aimed at cutting loose from it. AMD of course wants in. So does Intel, and so does everybody else. This HipKittens thing looks like a big step toward a more neutral software ecosystem.
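I haven't used HipKittens myself, but for a flavor of what vendor-neutral kernel programming looks like today, here's a minimal sketch in Triton (a different project, but one that compiles the same Python-level kernel for both NVIDIA and AMD backends). It's the standard vector-add example, purely illustrative:

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        mask = offsets < n_elements
        x = tl.load(x_ptr + offsets, mask=mask)
        y = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)

    def add(x, y):
        out = torch.empty_like(x)
        n = out.numel()
        grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
        add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
        return out

    # Same Python source, compiled for whichever GPU backend is present.
    a = torch.rand(4096, device="cuda")  # "cuda" also covers ROCm builds
    print(torch.allclose(add(a, a), a + a))

The point isn't that Triton replaces CUDA everywhere; it's that the kernel author never touches a vendor-specific API, which is the direction projects like HipKittens are pushing in.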
InfiniBand is being replaced with UEC (and it isn't needed for inference). For inference there is no moat, and smart players are buying or renting AMD hardware or Google TPUs.
What you really want to buy is a Ming Mecca chip. The original model came out around 2003, but they've been iterating. These things are bigger than AMD or Nvidia silicon, actually much larger even than a gigantic Cerebras wafer, and typically run $500-900 million in price. As you could guess, Ming Mecca is not broadly publicized; it was historically used for NSA crypto cracking, though it has since been adapted to AI and used for crunching data from gathered messages. More recently, all those gathered messages have been used to train strategic/tactical intelligence systems that oversee and deploy resources optimally via a cluster of, at least last I heard, 18 Ming Mecca v7 chips.
The Coral TPUs are closer if anything to what's in Pixel phones. In particular they're limited to iirc 8-bit integer types, which puts them in a very different category of applications compared to the kind of TPUs being talked about here.
AI is going to be so ubiquitous that something principled and open is going to supersede CUDA at some point, the way HTML5 did to Flash. CUDA isn't like the x86 vs ARM situation, where hardware dominance can be milked for decades; it's a higher-level language, and a standard compatible with a wide range of systems benefits both NVIDIA and its competitors. They're riding out their relative superiority for now, but we're going to see a standards and interoperability correction sometime soon, imo. NVIDIA will drive it, and it will gain them a few more years of dominance, but afaik nothing in their hardware IP means CUDA compatibility sacrifices performance or efficiency. They're also going to want to compete in the Chinese market, so being flexible about interoperability with Chinese systems gains them a bit of market access that might otherwise be lost.
There's a ton of pressure on the market to decouple nvidia's proprietary software from literally everything important to AI, and they will either gracefully transition and control it, or it will reach a breaking point and someone else will do it for (and to) them. I'm sure they've got finance nerds and quants informing and minmaxing their strategy, so they probably know to the quarter when they'll pivot and launch their FOSS, industry leading standards narrative (or whatever the strategy is.)
> but we're going to see a standards and interoperability correction sometime soon, imo.
I thought this too, in 2015. OpenCL looked really promising, but Apple bailed and neither AMD nor Intel had the funding to keep up with Nvidia's research. It sorta floundered, even though Nvidia GPUs smugly ran OpenCL code with benchmark-leading performance.
Nvidia won the datacenter because of hardware. You could release a perfect CUDA-to-Vulkan translator tomorrow, and they still wouldn't be dethroned until better hardware replaced it. Intel is swirling the drain, Qualcomm is hedging their bets on mobile, AMD is (still) too underfunded - Apple is the only company with the design chops and TSMC inroads to be a serious threat, and they can't release a datacenter product to save their life. It's understandable why people think Nvidia is a monopoly, Team Green is pulling a full-on "Luigi wins by doing nothing" in 2025: https://knowyourmeme.com/memes/luigi-wins-by-doing-absolutel...
The market has almost no pressure to decouple from Nvidia - nobody else has mature solutions. It requires a preestablished player to make a similarly risky play, which might rule out everyone who's sitting at the table.
Uh, Flash died because Apple refused to support it on mobile Safari. Perhaps Flash would have died anyway, but that is the proximate cause. And Apple's competitors were falling over themselves to market Flash support as a competitive advantage vs. iPhone.
To rephrase the OP's point: transformers et al are worth trillions. All the other CUDA uses are worth tens or hundreds of billions. They've totally got those locked up, but researchers are a smaller market than video games.
I don’t think NVDA will have anything like a real moat; it'll be more like whatever the difference was between iOS and Android. The gist of it is, the big bang of AI has happened and that universe is rapidly expanding, just as it once did for smartphones. There is the Apple of AI, which is NVDA, and then there is the Android (AMD). Moats are irrelevant here because the universe has only just started rapidly expanding for them.
Apple didn’t really “win” out against Android, and "winning" would be the wrong way to measure what actually happened. Yes, Apple could be seen as more premium at various points on that timeline. But the truth of the matter is that it was never a swimming race at any point in the smartphone era. It was simply a flood that you could convince yourself was an orderly race.
I believe the same is happening now, and it's in Nvidia's interest to maintain the narrative that there is a race and that they are winning it. Believing something like that during the smartphone era would have been foolish.
By far the easiest way to implement that "small number of algorithms" is with universal number-grinding hardware. Which also protects you against any architectural developments. Hardware takes a damn long time to make.
They also don't actually have a moat in the sense that they have patented technology keeping others out of the game. The other chip makers are coming for their lunch eventually.
It’s all about the deeply entrenched ecosystem NVIDIA has been building around CUDA for decades. It’s super hard to replicate this hardware-software platform.
If your competitor has a 5-year lead, and is working as hard as you are, or harder, then you are not gonna catch up any time soon. Also yes networking.
That's only true if future improvements are as easy to create as past ones, if customers care as much about those improvements, and if there are no other differentiators.
For example, many companies do well by selling a less capable but more affordable and available product.
The thing the "just optimize AI" crowd misses is that this isn't like optimizing a programming language implementation, where even the worst implementation is likely only 100x slower than a good implementation.
AI is millions of times slower than optimal algorithms for most things.
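For a rough sense of scale, here's a back-of-the-envelope sketch where every number is a loose assumption (a 7B-parameter dense model at roughly 2 x parameters FLOPs per generated token, and sorting 1,000 numbers spelled out over a few thousand tokens):

    # Rough, assumption-heavy comparison: sorting 1,000 numbers with an LLM
    # versus calling a sort directly.
    params = 7e9                   # assume a 7B-parameter dense model
    flops_per_token = 2 * params   # ~2 * params FLOPs per generated token
    tokens_for_task = 4_000        # prompt plus the sorted list spelled out
    llm_flops = flops_per_token * tokens_for_task

    n = 1_000
    direct_ops = n * 10            # ~n * log2(n) comparisons

    print(f"LLM:  ~{llm_flops:.1e} FLOPs")
    print(f"sort: ~{direct_ops:.0e} comparisons")
    print(f"ratio: ~{llm_flops / direct_ops:.0e}x")  # billions, not just millions

Even if you quibble with any individual number, the gap is many orders of magnitude.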