
I was surprised by how poorly positioned Intel was to act on the "Cambrian explosion" of AI late last year. After the release of their Arc GPUs, it took almost two quarters for the Intel Extension for PyTorch/TensorFlow to ship, to middling support and interest, and that hasn't changed much today.

How many of us learned ML using Compute Sticks, OpenVINO, oneAPI, or another of their libraries or frameworks, or their great documentation? It's like they didn't really believe in it outside of research.

What irony is it when a bedrock of "AI" fails to dream?



Maybe I'm thinking about it too simply but yeah I agree.

Language models in particular share very similar architectures and are effectively a lot of dot products, and running them on GPUs is arguably overkill. Look at llama.cpp for where the industry is going. I want a fast parallel quantized dot product instruction on a CPU, and I want the memory bandwidth to keep it loaded up. Intel should be able to deliver that, with none of the horrible baggage that comes with CUDA and the nvidia drivers.
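As a rough illustration (not llama.cpp's actual kernel, and assuming a CPU with AVX-512 VNNI), a quantized dot product of the kind described above can look something like this in C++:

    // Sketch of an int8 dot product using AVX-512 VNNI.
    // _mm512_dpbusd_epi32 multiplies 64 unsigned 8-bit values by 64 signed
    // 8-bit values and accumulates the products into 16 int32 lanes in a
    // single instruction. Build with e.g. g++ -O2 -mavx512f -mavx512vnni.
    #include <immintrin.h>
    #include <cstddef>
    #include <cstdint>

    // n must be a multiple of 64 for this sketch.
    int32_t dot_q8(const uint8_t *a, const int8_t *b, std::size_t n) {
        __m512i acc = _mm512_setzero_si512();
        for (std::size_t i = 0; i < n; i += 64) {
            __m512i va = _mm512_loadu_si512(a + i);  // 64 unsigned 8-bit activations
            __m512i vb = _mm512_loadu_si512(b + i);  // 64 signed 8-bit weights
            acc = _mm512_dpbusd_epi32(acc, va, vb);  // acc[k] += sums of 4 u8*s8 products
        }
        return _mm512_reduce_add_epi32(acc);         // horizontal sum of the 16 lanes
    }

The instruction itself is the easy part; as the parent says, the limiting factor is the memory bandwidth to keep it fed, since the full set of quantized weights has to be streamed from RAM for every generated token.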


Does Intel have a credibility problem w.r.t. ISA extensions to support deep learning?

I'm thinking about the widespread confusion they caused by having different CPUs support different subsets of the AVX-512 ISA.


This reads like parody: from llama.cpp, to it supposedly being a beacon of where the industry is going (!?), to the claim that GPUs are overkill for what is effectively a lot of dot products.


Yeah, using CPUs for inference or training is ridiculous. We're talking 1/20th the performance for 1/4th the energy.


The reason CUDA has won is precisely because it isn't horribly stuck in a C dialect: it has embraced polyglot workloads since 2010, it offers a great developer experience where GPUs can be debugged like regular CPUs, and it has the library ecosystem.

Now, while NVidia is making standard C++ run on CUDA, Intel is still pushing SYCL and oneAPI extensions.

Similarly with Python and the RAPIDS framework.

Intel and AMD have to up their game to deliver the same kind of developer experience.
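To make "standard C++ run on CUDA" concrete, this is the sort of thing meant: a plain parallel STL call with no kernel syntax, which NVIDIA's nvc++ can offload to the GPU via -stdpar. A minimal sketch, not vendor sample code:

    // Element-wise 2*x + y over a large vector using only ISO C++17.
    // Built with nvc++ -stdpar=gpu this runs on the GPU; with any other
    // conforming compiler it runs in parallel on the CPU.
    #include <algorithm>
    #include <execution>
    #include <vector>
    #include <cstdio>

    int main() {
        std::vector<float> x(1 << 20, 1.0f), y(1 << 20, 2.0f), out(1 << 20);

        std::transform(std::execution::par_unseq,
                       x.begin(), x.end(), y.begin(), out.begin(),
                       [](float a, float b) { return 2.0f * a + b; });

        std::printf("out[0] = %f\n", out[0]);  // prints 4.000000
        return 0;
    }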


Err... last time I checked, CUDA was the one with the partially compliant C++ implementation, while SYCL, on the contrary, is based on pure C++17.


Time to check again: CUDA has supported C++20 for a while now (minus modules), and NVidia is the one driving the senders/receivers work for C++26, based on their CUDA libraries.

SYCL isn't pure C++ in the sense of letting you write STL code that runs on the GPU, the way CUDA does, nor does it require the hardware to follow the C++ memory model.



