That's why I run my server on 7100 chips made for me by Sam Zeloof in his garage on a software stack hand coded by me, on copper I ran personally to everyone's house.
You're joking, but working on making decentralization more viable would indeed be healthier than throwing our hands up and accepting Cloudflare as the only option.
One thing I don't understand about Nvidia's valuation: right now a small number of algorithms have 'won,' Transformers above all, and the data matters more than the code. In the past, customized code was much more common (custom modeling code, HPC kernels, and so on), so the ecosystem was hugely important and it was practically impossible to reimplement all of CUDA and the code built around it.
Competitors now only need to optimize for a narrow set of algorithms: if a vendor can run vLLM and Transformers efficiently, a massive market becomes available. Consequently, companies like AMD or Huawei should be able to catch up fairly easily. What, then, is Nvidia's moat? Is InfiniBand enough?
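To make the "narrow workload" point concrete, a typical serving job today looks roughly like the sketch below: load a transformer, generate tokens. Nothing in the user-facing code is CUDA-specific; the backend (NVIDIA, AMD ROCm, etc.) is the vendor's problem to make fast. The model name and sampling settings here are only examples, not anything from the thread.

    # A minimal vLLM sketch; model name and parameters are illustrative.
    from vllm import LLM, SamplingParams

    llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
    params = SamplingParams(temperature=0.7, max_tokens=128)
    outputs = llm.generate(["Explain KV caching in one paragraph."], params)
    print(outputs[0].outputs[0].text)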
You are right to question their moat. My view is that there's a lot of pressure from essentially all the other trillion-dollar companies (MS, Google, Amazon, Apple, etc.) to not get locked into an Nvidia-only ecosystem. Each of them does their own chips. They also use Nvidia, but not exclusively. An Android or iOS phone has no Nvidia chips whatsoever. Neither do most laptops: Apple's M-series chips don't support CUDA at all, and with the exception of some gaming or workstation-class machines, most Windows/Linux laptops come with either AMD or Intel GPUs, or lately Qualcomm ARM-based chips with custom GPUs.
Nvidia's valuation and moat are centered around data-center-class GPUs used for training. I don't think they'll effectively have that space to themselves for much longer. Google is already using its own TPUs at scale for both training and inference; it still uses some Nvidia hardware, but seems able to keep it off the critical path for anything that needs to run at "Google scale." OpenAI just ordered a bunch of AMD hardware. And a lot of AI engineers use Apple laptops that rely on M-series hardware.
In short, the CUDA moat is shrinking. It's still relevant, of course, and a lot of tooling and frameworks depend on it; that's why everybody still uses it, but not exclusively. And there's a lot of extremely well-funded, active development aimed at cutting loose from it. AMD wants in, of course. So does Intel. And so does everybody else. HipKittens looks like a big step toward a more vendor-neutral software ecosystem.
InfiniBand is being replaced with UEC (and it isn't needed for inference anyway). For inference there is no moat, and smart players are already buying or renting AMD hardware or Google TPUs.
What you really want to buy is a Ming Mecca chip. The original model came out around 2003, but they've been iterating. These things are bigger than AMD or Nvidia silicon, much larger even than a gigantic Cerebras wafer, and typically run 500-900 million USD. As you might guess, Ming Mecca is not broadly publicized; historically it was used for NSA crypto cracking, though it has since been adapted to AI and used for crunching data from gathered messages. More recently, all those gathered messages have been used to train strategic/tactical intelligence systems that oversee and deploy resources optimally via a cluster of, at least last I heard, 18 Ming Mecca v7 chips.
The Coral TPUs are closer, if anything, to what's in Pixel phones. In particular, they're limited to (IIRC) 8-bit integer types, which puts them in a very different category of applications compared to the kind of TPUs being talked about here.
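For a sense of what "int8-only" means in practice, a model has to be fully integer-quantized before it can run on a Coral Edge TPU at all. The sketch below shows the rough shape of that with the TensorFlow Lite converter; the saved-model path and the random calibration data are placeholders, not anything from the thread.

    # Rough sketch of full-integer quantization for int8-only accelerators.
    import numpy as np
    import tensorflow as tf

    def representative_data():
        # a handful of real input samples would go here for calibration
        for _ in range(100):
            yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_saved_model("my_saved_model")
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8   # integer I/O end to end
    converter.inference_output_type = tf.uint8
    tflite_model = converter.convert()          # then compile with edgetpu_compiler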
AI is going to be so ubiquitous that something principled and open will supersede CUDA at some point, the way HTML5 did Flash. CUDA isn't an x86-vs-ARM situation where hardware dominance can be leveraged for decades; it's a higher-level language, and being compatible with a wide range of systems benefits both NVIDIA and its competitors. They're riding out their relative superiority for now, but we're going to see a standards and interoperability correction sometime soon, imo. NVIDIA will drive it, and it will gain them a few more years of dominance, but afaik nothing in their hardware IP means CUDA compatibility has to sacrifice performance or efficiency. They're also going to want to compete in the Chinese market, so being flexible about interoperability gains them a bit of market access that might otherwise be lost.
There's a ton of pressure on the market to decouple Nvidia's proprietary software from literally everything important to AI, and they will either gracefully manage the transition and control it, or it will reach a breaking point and someone else will do it for (and to) them. I'm sure they've got finance nerds and quants informing and min-maxing their strategy, so they probably know to the quarter when they'll pivot and launch their FOSS, industry-leading-standards narrative (or whatever the strategy is).
> but we're going to see a standards and interoperability correction sometime soon, imo.
I thought this too, in 2015. OpenCL looked really promising, but Apple bailed and neither AMD nor Intel had the funding to keep up with Nvidia's research. It sorta floundered, even though Nvidia GPUs smugly ran OpenCL code with benchmark-leading performance.
Nvidia won the datacenter because of hardware. You could release a perfect CUDA-to-Vulkan translator tomorrow and they still wouldn't be dethroned until better hardware replaced theirs. Intel is circling the drain, Qualcomm is hedging its bets on mobile, and AMD is (still) too underfunded. Apple is the only company with the design chops and TSMC inroads to be a serious threat, and they can't release a datacenter product to save their life. It's understandable why people think Nvidia is a monopoly; Team Green is pulling a full-on "Luigi wins by doing nothing" in 2025: https://knowyourmeme.com/memes/luigi-wins-by-doing-absolutel...
The market has almost no pressure to decouple from Nvidia - nobody else has mature solutions. It requires a preestablished player to make a similarly risky play, which might rule out everyone who's sitting at the table.
Uh, Flash died because Apple refused to support it on mobile Safari. Perhaps Flash would have died anyway, but that is the proximate cause. And Apple's competitors were falling over themselves to market Flash support as a competitive advantage vs. iPhone.
To rephrase the OP's point: transformers et al. are worth trillions. All the other CUDA uses are worth tens or hundreds of billions. Nvidia has those totally locked up, but research is a smaller market than video games.
I don't think NVDA will have anything like a real moat; it'll be more like whatever the difference was between iOS and Android. The gist of it is, the big bang of AI has happened and that universe is rapidly expanding, just as it once did for smartphones. There is the Apple of AI, which is NVDA, and then there is the Android (AMD). Moats are irrelevant here because the universe has only just started rapidly expanding for them.
Apple didn’t really “win” out against Android, and it would be a very wrong way of measuring what actually happened. Yet, Apple could have been seen as more premium during various points of that timeline. The truth of the matter was, it was never a swimming race at any point in that smartphone timeline. It was simply a flood that you could convince yourself was an orderly race.
I believe the same is happening now, and it's in Nvidia's interest to maintain the narrative that there is a race and that they are winning it. Believing something like this during the smartphone era would have been foolish.
By far the easiest way to implement that "small number of algorithms" is with universal number-grinding hardware. Which also protects you against any architectural developments. Hardware takes a damn long time to make.
They also don't actually have a moat in the sense that they have patented technology keeping others out of the game. The other chip makers are coming for their lunch eventually.
It's all about the deeply entrenched ecosystem NVIDIA has been building around CUDA for decades. It'd be super hard to replicate this hardware-software platform.
If your competitor has a 5-year lead and is working as hard as you are, or harder, then you are not going to catch up any time soon. Also, yes, networking.
That's only true if future improvements are as easy to create as past ones, if customers care as much about those improvements, and if there are no other differentiators.
For example, many companies do well by selling a less capable but more affordable and available product.
The thing the "just optimize AI" crowd misses is that this isn't like optimizing a programming language implementation, where even the worst implementation is likely only 100x slower than a good implementation.
AI is millions of times slower than optimal algorithms for most things.
I use Rancher for a hosted Kubernetes cluster on top of dozens of dedicated servers, and so far, it has been super nice. What are the alternatives for CI/CD for a small team (30)?
I'm really curious about the history of spaCy. From my PoV, it grew a lot during the pandemic era, hiring a lot of employees; I remember something about raising money for the first time, and it was very competitive on NLP tasks. Now it seems to have scaled back considerably, with a dramatic reduction in employees and a total slowdown of the project. The v4 release looks postponed, it isn't competitive on many tasks anymore (for tasks such as NER, I get better results by fine-tuning a BERT model), and the transformer integration is confusing.
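For context, the kind of BERT fine-tune for NER being compared here is roughly the sketch below, using Hugging Face Transformers. The dataset ("conll2003"), base model, and hyperparameters are placeholder assumptions, not anything the commenter specified.

    # Minimal sketch of fine-tuning a BERT model for NER with Hugging Face.
    from datasets import load_dataset
    from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                              DataCollatorForTokenClassification,
                              TrainingArguments, Trainer)

    ds = load_dataset("conll2003")                     # example dataset
    label_names = ds["train"].features["ner_tags"].feature.names

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    model = AutoModelForTokenClassification.from_pretrained(
        "bert-base-cased", num_labels=len(label_names))

    def tokenize_and_align(batch):
        enc = tokenizer(batch["tokens"], truncation=True,
                        is_split_into_words=True)
        labels = []
        for i, tags in enumerate(batch["ner_tags"]):
            prev, row = None, []
            for wid in enc.word_ids(batch_index=i):
                # label only the first sub-token of each word, mask the rest
                row.append(-100 if wid is None or wid == prev else tags[wid])
                prev = wid
            labels.append(row)
        enc["labels"] = labels
        return enc

    tokenized = ds.map(tokenize_and_align, batched=True,
                       remove_columns=ds["train"].column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments("bert-ner", num_train_epochs=3,
                               per_device_train_batch_size=16),
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["validation"],
        data_collator=DataCollatorForTokenClassification(tokenizer),
    )
    trainer.train()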
I've had success fine-tuning their transformer model. The issue was that there was only one of them per language, compared to Hugging Face, where you have a choice of many quality variants that best align with your domain and data.
The spaCy API is just so nice. I love the ease of iterating over sentences, spans, and tokens and having the enrichment right there. Pipelines are super easy, and patterns are fantastic. It's just a different use case than BERT.
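A small illustration of the ergonomics described above; "en_core_web_sm" is just an example pipeline, and the sample text and pattern are made up.

    # Sketch of the spaCy iteration and pattern-matching ergonomics.
    import spacy
    from spacy.matcher import Matcher

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("spaCy pipelines attach enrichment to every token. "
              "Iterating over sentences and spans is easy.")

    for sent in doc.sents:              # sentence iteration
        for token in sent:              # tokens with annotations attached
            print(token.text, token.pos_, token.dep_)

    for ent in doc.ents:                # named-entity spans
        print(ent.text, ent.label_)

    # rule-based patterns sit right next to the statistical pipeline
    matcher = Matcher(nlp.vocab)
    matcher.add("ITERATE", [[{"LOWER": "iterating"}, {"LOWER": "over"}]])
    for _, start, end in matcher(doc):
        print(doc[start:end].text)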
I feel sad and disappointed in Microsoft for letting the entire Faster CPython team go. I was a big supporter, always leaving positive comments and sharing news about their work. I'd have figured the team paid for itself in goodwill alone. What a letdown, Microsoft. You ought to do better.
My concern is that it doesn't just change war, but security in general. I don't think that we have realized the real implications of this technology, especially the fiber optic drones.
> I don't think that we have realized the real implications of this technology
Define “we.” The defence community has been deeply engaged with what’s going on in Ukraine since ‘22. (And the supremacy of sensor fusion in India’s air battle with Pakistan.)
We as a society. I don't want to write down my detailed thoughts on this, but anyone with a red team mind can imagine the implications for personal security.
Kim Stanley Robinson wrote down pretty bluntly what society might do with drones against the vicious, nasty foes of the world in The Ministry for the Future, a book very well reviewed by, for example, Bill Gates. https://www.gatesnotes.com/books/science-fiction/reader/the-...
Alas, it feels optimistic to hope that the asymmetric confrontation would be the downtrodden people of the earth against the world-damaging, take-take-take pests. Merely science fiction. The world having powerful forces working strongly for the world rather than for self-interest: hardly believable science fiction.
It's cheaper now, it's easier to pull off remotely, but most airports already were vulnerable to terrorist attacks. It feels like the primary mechanism that protected civilian airports is that the weapons you'd use aren't easy to get, and states didn't want to supply their sponsored terror groups with that kind of weaponry because it'd be dangerously close to an act of war and very hard to deny.
Individually, you were never safe by default. Your safety depends on not being an interesting target.
So, you know, what if instead of being one guy he were a substantial portion of the intelligence operatives of a nation-state, with significant industrial resources backing him?
Ukraine isn't wealthy, but it's still an entire country.
Bluntly: nothing is safe from drones + a determined operator. No airfield, no aircraft on the ground, no government institution. Drones have changed warfare forever and Ukraine is writing the manual for future operations. What happened today was unthinkable 10 years ago. As one side effect I predict that at least in some places private drone ownership will become illegal. Think about it: for a few hundred K you get to take out a good chunk of a nuclear power's strike capability.
We’re in a strategic imbalance. Cold War air defences were trained on high-value targets, like strategic bombers and spy planes. So currently our air defences are overspecced for something like this.
Nothing about drones makes them inherently undetectable. You just need a different model of defence. I suspect those will be commonplace within 20 years, potentially within a decade.
> at least in some places private drone ownership will become illegal
I could see ownership being restricted in wartime. More likely is eager air defences shredding birds on perimeters.
Won't the cat-and-mouse game ultimately tilt toward defense? I imagine automated rifles are basically impossible to dodge. Automated rifles sound much scarier to me. Plant a rifle and wait a year; it works on people and drones.
> Won't the cat and mouse game ultimately tilt to the side of defense?
Probably not. Most of the history of war is weapons getting stronger and stronger and defence getting harder and harder. E.g. in ancient times a shield or simple palisade could protect you, now even tanks and trenches are not safe. The days of being able to build a wall along a border and hold it against a peer adversary are long gone and not coming back.
I feel like this correlates with nations getting bigger over time and the square-cube law (or line-square law for national borders?), but I'm not smart enough at military stuff to figure it out.
I've read that it's kind of the converse - as military technology advances the size of a "minimum viable nation" increases. E.g. as gunpowder technology developed, anywhere that couldn't afford to field a gunpowder military got absorbed into somewhere that could.
On the other hand defensive alliances like NATO and the like pretty much work. A couple of centuries ago war was all over the place. These days most people never see it unless they deliberately go to a war zone.
You need a globe (an old-school, physical one), a map of the black soils and population density, and to remember how long it took Prigozhin to get to the outskirts of Moscow, with all the stops, interviews, and scuffles with the VVS.
To be fair, these planes were out in the open, protected by tires on the wings. If they were in simple hangars, this operation would have already been way harder.
I doubt they're claiming to have anything novel in their heads. It's like WWII, where the people militarily engaged probably had a pretty good idea of what was about to happen as Europe descended into war. The citizens didn't really understand, and there wasn't the level of diplomacy and panic in the early stages that the eventual crisis would have justified.
If the average citizen had had a good understanding of what an industrial war looked like and what was possible, they'd (taking an optimistically charitable view) have spent the '20s and '30s being a lot more vigorous in trying to keep the peace. Like the efforts we saw from the '40s to around the 2010s, where people who remembered WWII put huge amounts of effort into not letting it happen again.
The real implication is that, once again, you don't want your personal shit being public, which will still take a while for the general audience to understand about social media and all sorts of corporate surveillance.
This is a common mistake and very badly communicated. The GIL does not make Python code thread-safe. It only protects the internal CPython state. Multi-threaded Python code is not thread-safe today.
Internal CPython state also includes, say, a dictionary's internal state, so for practical purposes that is safe. Of course, TOCTOU, stale reads, and various race conditions are not (and can never be) protected by the GIL.
It's memory safe, but it's not necessarily free of race conditions! It's not only C extensions that release the GIL, the Python interpreter itself releases the GIL after a certain number of instructions so that other threads can make progress. See https://docs.python.org/3/library/sys.html#sys.getswitchinte....
Certain operations that look atomic to the user are actually made up of multiple bytecode instructions. If you're unlucky, the interpreter decides to release the GIL and yield to another thread right in the middle of such a sequence. You won't get a segfault, but you might get unexpected results.
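A toy way to see this (just a sketch, not from the thread): the explicit temporary below widens the window in which the interpreter can switch threads between the read and the write-back, so updates get lost even though the GIL is held for every individual bytecode.

    # Demonstration that the GIL does not make read-modify-write atomic.
    import threading

    counter = 0

    def work(n=100_000):
        global counter
        for _ in range(n):
            tmp = counter      # read
            tmp += 1           # another thread may run here
            counter = tmp      # write back, possibly clobbering its update

    threads = [threading.Thread(target=work) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(counter)  # usually well below the expected 800000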
This should not have been downvoted. It's true that the GIL does not make Python code implicitly thread-safe: you have to either construct your code carefully to be atomic (based on knowledge of how the GIL works) or make use of mutexes, semaphores, etc. Your code is merely memory-safe and can still have races and so on.
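For completeness, the usual fix sketched out: a threading.Lock turns the read-modify-write into a single critical section, so it no longer matters where the interpreter switches threads.

    # Same counter example as above, now guarded by a lock.
    import threading

    counter = 0
    lock = threading.Lock()

    def work(n=100_000):
        global counter
        for _ in range(n):
            with lock:
                counter += 1

    threads = [threading.Thread(target=work) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(counter)  # always 800000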