I was of the opinion that Intel would only face real competition when ARM reaches parity with desktop-class CPUs. Nice to see AMD rise back up again. I hope Intel is not able to strong-arm the OEMs like last time, when AMD had design wins but Intel used anti-competitive practices to stall its growth until it was able to catch up and later pull ahead.
As with most things the devil is in the details. In particular the ARM I/O and memory bus infrastructure has not had as long to develop as PC systems have.
The trick to understanding that is to look at the history of the personal computer and the demands placed on the CPU and supporting chipset that evolved with the changing demands of the platform over time.
The ARM systems evolved from 'system on chip' or SoC designs where everything was on one chip. And even the latest ARM server designs don't have anything like the PC notion of choosing from a variety of CPU SKUs and dropping them into a baseboard socket.
That isn't to say that they won't eventually get to something like that; it is only to stress that ARM systems design is still 3-6 years behind personal computer systems design. And to date it isn't being pushed very hard to close that gap.
Interesting, but hardly an adequate/relevant comparison since there are probably a million custom solutions involved, especially in the interconnect/IO parts.
I doubt we'll be seeing desktop ARM systems any time soon, especially high end workstation or enthusiast systems.
The problem is that the ARM64 architecture is completely different from x86-64, which almost every desktop application is compiled for exclusively.
Remember when Microsoft launched an ARM Surface? It had a special version of Windows, and almost nothing would run on it, as very few applications are compiled for Windows on ARM.
> It had a special version of Windows, and almost nothing would run on it, as very few applications are compiled for Windows on ARM.
That wasn't the reason. It was because Windows RT was artificially locked-down to only allow sandboxed "apps" from the Windows App Store to run. There's not much of a use-case for only glorified web-service clients on a machine with a desktop UI regardless of the ISA they're compiled for.
Had MS allowed arbitrary ARM code to run on the Surface RT, free of sandbox restrictions, I think the platform would still be alive today - probably as a successor to Windows CE for kiosk applications.
(Though there is now finally a non-locked-down version of Windows 10 for ARM... so let's see...)
Yeah, it required the Windows Store, and additionally at that time it could only have WinRT (unrelated to Windows RT; basically the old name for UWP) apps. These days you can put Win32 apps on the store. If Microsoft allowed Win32 ARM apps on the store, maybe Windows RT would have succeeded. Maybe. Software publishers probably wouldn't have wanted to lose that control.
Isn't the main problem with ARM the lack of a standard hardware architecture that supports off-the-shelf PC components?
Every motherboard has its own size and its own closed BIOS supporting only some kinds of memory (if any at all, given the on-die memory of most SoCs), which is fine for embedded use but leaves a lot to be desired if you try to use it as a "desktop PC" replacement.
As for "never reaching the same performance" as desktop Intel/AMD chips, I would say that depends. It's not that they can't do it with ARM chips, it's that nobody has bothered to do it until now, because there was no point with so few UWP apps. The market would be way too small. What would be the point in making a Core i7-level ARM chip just for the Surface RT? The risk would be too big.
But now that Microsoft opened up the PC market to ARM -- truly -- ARM chip makers can begin to make some real desktop-class ARM chips. Chips that require 15, 30, 45, or even 90 watts of power.
But I do think this will happen gradually over several years, because first they need to see how ARM chips work under Windows 10 this time around and how people accept them. Then if that goes well, they need to start designing those 15-90W ARM chips, and then start selling them.
Someone like Qualcomm needs to be sure that if it does design a 90W ARM chip that's highly competitive with a desktop class Intel or AMD chip (at least in terms of performance/dollar), more than 10 million people will buy it.
Doesn't that make this an argument against closed source, not against ARM?
I agree the issues are related, but if we in open source can switch in an afternoon and you in closed-source land can't ever switch, that highlights a problem.
Apple switched processors several times. They provided an emulator for old programs, so they could run without the user noticing anything (except worse performance). You could probably do the same for ARM.
It's worth understanding the Mac software ecosystem is mostly made up of people who love the platform and OS.
Windows just isn't the same: lots of devs making software the same way they made it 15+ years ago, not wanting or even caring enough to change no matter what the benefits.
Also, Apple does a really good job on compilers etc. to make it easier to write code and systems that jump between architectures quite well. Lots of people dev on Windows without using the latest tools from Microsoft; that's just not how it works with Apple.
I wonder if it's possible to use some kind of translator, like Apple did with Rosetta to run PowerPC code on x86. This is of course not interesting for the Surface but it might be a nice solution for a desktop transition to ARM.
It's not really the actual limit. Nobody is writing architecture-specific code anymore. Being able to binary-translate x86 won't change the fact that the program you are trying to run invokes a thousand libraries for a proprietary OS that will never open up or otherwise become portable, and is compiled against a system call interface specific to that one proprietary, non-standard OS.
If Windows on ARM allowed generic binaries, almost everything in the Windows ecosystem would be on ARM. For Windows.
What made the RISC or ARM processor popular in the past was the ability to do graphics rendering and complex math. GPUs put a dent in that, letting an i386 CISC system with the addition of a GPU reach parity with the old SGI systems, but substantially cheaper.
That's why an ARM64 Windows desktop is unlikely (unless Microsoft's x86 emulator is magical). But ARM64 Chromebooks or Android books are practical; I have a 5-year-old ARM32 Chromebook and it is still useful for browsing and email.
Why restrict it to those limited OSes? Seems like an ARM64 laptop running Linux ought to work just as well as the AMD64 laptops running Linux I've been using for the last few years.
I've been using Linux on my desktop (without windows even in a VM) since 2004 and I totally agree - however until Linux is considered a viable option by non-enthusiasts, it's not relevant to this discussion. (I agree it is already relevant and more useful than Windows for the vast majority - but they refuse to even try)
Before you get excited about unleashing the power of 16 cores on your desktop... I recently upgraded from a 4-core to a 6-core i7 with roughly the same frequency/turbo and noticed that compile times actually increased.
Disappointing, I thought, especially because in addition to the two extra cores I now sport quad-channel DDR RAM. Well, it turns out that compiling my Go projects does not generate enough "hot" threads, and the Linux scheduler keeps moving them around between the 6 cores, letting the cores "cool down" and drop to their 1.2 GHz idle frequency. Meanwhile the 4-core box happily spins at 3.4-3.8 GHz during the compile cycle.
The only way to make the 6-core box perform was to enable the "performance" policy for intel_pstate, which leads to the CPU running at its full speed even at idle (not great). Once I did that I saw performance increase by 50%.
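For anyone wanting to check what their own box is doing, here is a small sketch that reads the cpufreq governor out of sysfs (standard Linux cpufreq paths; the helper name is made up, and the base path is a parameter so it can be pointed at a test tree):

```python
# Sketch: report the cpufreq scaling governor for each CPU via sysfs.
# On intel_pstate systems this typically reads "powersave" or "performance".
from pathlib import Path

def read_governors(base="/sys/devices/system/cpu"):
    """Return a {cpu_name: governor} mapping for every CPU exposing cpufreq."""
    govs = {}
    for f in sorted(Path(base).glob("cpu[0-9]*/cpufreq/scaling_governor")):
        govs[f.parent.parent.name] = f.read_text().strip()
    return govs

if __name__ == "__main__":
    for cpu, gov in read_governors().items():
        print(cpu, gov)
```

Switching to the "performance" policy is then a matter of writing that string back to the same files (as root), which is what tools like cpupower do under the hood.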
Here are the compile times of the same project on both machines with different CPU frequency scaling policies:
As far as I know, AMD does not have an equivalent of intel_pstate in the Linux kernel and relies on the legacy ACPI governor, which isn't as sophisticated as Intel's (where the CPU is more self-regulating), and I wonder what effects to expect in terms of performance / power consumption if I go for Ryzen.
The bigger question, at least for me, is whether high-core-count machines improve your desktop experience if you're mostly running low-threaded workloads, for example a web browser.
One humble piece of evidence I gathered says no: more cores make your low-threaded applications run slower, because the Linux scheduler keeps placing them onto "cold" cores running at idle frequency, and once a core "spins up" the thread gets moved to another cold core.
The huge number of PCIe lanes is pretty exciting to me. For consumers, the limiting factor in PCs these days tends to be I/O. With this, you could have 4 PCIe M.2 SSDs (that's a bit under 16 GB/s of bandwidth, assuming RAID 0), and still have 44 lanes (the number in the top-end Intel CPU) to spare for GPUs and other peripherals.
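To put rough numbers on that, a back-of-the-envelope check (assuming PCIe 3.0 at 8 GT/s per lane with the standard 128b/130b encoding, x4 NVMe drives, and decimal GB):

```python
# Back-of-the-envelope PCIe 3.0 bandwidth check.
GT_PER_S = 8e9                                # transfers/s per lane
ENCODING = 128 / 130                          # 128b/130b line-code efficiency
lane_GBps = GT_PER_S * ENCODING / 8 / 1e9     # usable bytes/s per lane

ssd_GBps   = 4 * lane_GBps                    # one x4 NVMe SSD
array_GBps = 4 * ssd_GBps                     # four such SSDs striped (RAID 0)

print(f"per lane: {lane_GBps:.2f} GB/s")      # ~0.98
print(f"x4 SSD:   {ssd_GBps:.2f} GB/s")       # ~3.94
print(f"4x RAID0: {array_GBps:.2f} GB/s")     # ~15.75
```

So four striped x4 drives land just under 16 gigabytes (not gigabits) per second of theoretical link bandwidth, before controller and filesystem overheads.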
Enthusiast here - can confirm, an array of NVMe drives makes a huge difference over a single standard SSD. I set this up last year; check the benchmark results, it's night and day.
Do games actually load any faster? I went from a Samsung SSD (I think the 850 Pro) to an Intel NVMe SSD, and load times in games didn't seem much faster. I definitely notice it when booting up VS though.
Gamers might be the most vocal high-end PC users out there, but they are not the people with the most money.
The threadripper is not targeted at gamers. It is targeted at people who make money from using their machines, these people can, therefore, afford to upgrade.
Consider a user who does video work. (Some modern film and TV is shot in 8K now.) If this is in raw format as it comes off the camera, a 90-minute episode is on the order of 6000 GB. Working on this data (tone mapping, cutting, etc.) needs all of it fed in and out of the CPU and/or GPUs, so all these PCIe lanes are there so you can have very fast access to the raw data and still have room for 2 or 4 full-speed GPUs to do that work for you.
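The data rates back that up. A rough calculation, assuming uncompressed 16-bit 8K Bayer raw at 24 fps (all of these figures are assumptions; lightly compressed camera raw would land closer to the ~6000 GB figure above):

```python
# Rough size of 90 minutes of uncompressed 8K raw footage.
# Assumed: 7680x4320, 16-bit single-channel Bayer raw, 24 fps, decimal units.
width, height = 7680, 4320
bytes_per_px  = 2            # 16-bit raw
fps           = 24
seconds       = 90 * 60

frame_bytes = width * height * bytes_per_px
rate_GBps   = frame_bytes * fps / 1e9
total_GB    = rate_GBps * seconds

print(f"per frame: {frame_bytes/1e6:.1f} MB")   # ~66.4
print(f"data rate: {rate_GBps:.2f} GB/s")       # ~1.59
print(f"90 min:    {total_GB:.0f} GB")          # ~8600
```

A sustained ~1.6 GB/s stream is already half a x4 NVMe drive's bandwidth, before any scrubbing or multi-stream editing, which is exactly where spare PCIe lanes start to matter.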
Forget games; no game developer optimizes their games for 4 GPUs these days. Even Nvidia has said they don't really want people to run more than 2 GPUs in SLI.
I do a lot of compiles of a codebase which is pretty big (Netflix version of FreeBSD). A full recompile takes me ~1 hr on my box (Xeon E5-2630 v3, 8 cores, 16 threads, 32GB DDR4). Most of this is a parallel make that is 100% CPU bound. So I'm probably going to update to thread ripper when it comes out. Doubling my core count should come close to halving my compile times.
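Whether doubling cores really comes close to halving wall time depends on how parallel the build is; a quick Amdahl's-law sanity check (the 60-minute baseline and the parallel fractions below are made-up illustrative numbers, not measurements of the FreeBSD build):

```python
# Amdahl's law: wall time on n cores given single-core time t1 and
# parallel fraction p of the workload.
def amdahl_time(t1, p, n):
    return t1 * ((1 - p) + p / n)

for p in (0.95, 0.99):
    t16 = amdahl_time(60.0, p, 16)
    t32 = amdahl_time(60.0, p, 32)
    print(f"p={p}: 16 cores {t16:.2f} min -> 32 cores {t32:.2f} min "
          f"(ratio {t32/t16:.2f})")
```

With a 95%-parallel build, doubling 16 cores to 32 only cuts time by about 27%; the build needs to be ~99% parallel before the ratio approaches the hoped-for halving, which a `make -j` that is genuinely 100% CPU bound can plausibly get close to.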
I probably will do storage via a ZFS array of small-ish SSDs or NVME drives. I started with a ZFS array of spinning drives when I built this machine ~2 years ago, with an SSD for L2 ARC. I was occasionally IO bound, so I moved most of the build stuff onto its own SSD. That's probably enough, but I want redundancy (without spinning drives to back up to), and I suck at cabling, so I'm leaning towards just using NVME drives in PCIe slots in order to avoid the cables.
I work remotely, and it is easier to develop and compile on the same box. I normally compile just the kernel, but occasionally have to rebuild the entire thing.
If you had access to a hosted build system that is 10x faster (let's even assume some sort of transparent, gradual sync of your source tree, with the binaries synced back to your system), would you find value in such a setup, or is it not worth the hassle?
(Let's assume the output is verifiably binary-identical to your local build.)
I use the system more for work than games but one game it did make a very noticeable difference in was Minecraft believe it or not. Long-view distance + pre-generated map (world border fill) = move/fly fast without chunks taking friggin forever to load :-)
A lot of processing happens when games load. Once you have sufficient storage bandwidth and latency, load times flatten out. It seems that "sufficient" is SATA; above that, there is little difference in game load times:
I wonder if game developers are actually disincentivized from qualitatively improving load times past a certain acceptable threshold, lest their AAA game be perceived as lightweight.
I doubt it. Gamers have been complaining about long load times since almost the dawn of time, and I think a game that loads as fast as technically possible could make it a selling point.
The poor market penetration of PCIe SSDs and their high price per GB is a strong disincentive for game developers to store data in lightly-compressed formats that require minimal CPU processing to load and use. We might see some improvement in using multiple threads to load game data now that CPU core counts are on the rise at the high end.
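A minimal sketch of that multi-threaded loading idea (the function and loader names here are hypothetical, not any engine's real API; the point is just that independent decompress-and-parse work overlaps across threads):

```python
# Sketch: load many game assets concurrently on a thread pool.
from concurrent.futures import ThreadPoolExecutor

def load_assets(paths, load_one, workers=8):
    """Apply load_one to each path on worker threads; order is preserved."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(load_one, paths))

if __name__ == "__main__":
    # Dummy loader standing in for read + decompress + upload.
    fake_assets = [f"asset_{i}.pak" for i in range(16)]
    loaded = load_assets(fake_assets, lambda p: (p, len(p)))
    print(len(loaded), "assets loaded")
```

For CPU-heavy decompression a process pool (or, in a real engine, a native job system) would sidestep the GIL; the structure is the same.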
Reminds me of the first time I played King's Quest from a RAM drive. I had first played the game from a floppy disk, and it was really nice to have everything load instantly!
That sounds frivolous versus a single high-end PCIe SSD. Wouldn't be surprised if the improvement in real-world game load times wasn't even perceptible.
Maybe, but that's what being an enthusiast is all about :) also assets (models & textures) need to be loaded from disk during game-play which can result in stutter when gaming. Faster IO would help here.
What is really nice about the full Zen line is that they are supporting ECC memory. It always felt like Intel removed this on lower-end chips just to force users to buy a Xeon even if they did not need the rest of the Xeon features.
Yeah, the Xeon E3 line's lack of PCIe lanes is really underwhelming. I've got an LSI 9200-8e and a Mellanox ConnectX-2 in my ProLiant ML10 used as my FreeNAS box, leaving a total of 4 PCIe lanes available. Thankfully I don't see any need for more AIBs in this machine, but if I used it as a workstation that needed a GPU + another board or two it'd be a really awful situation to be in.
Considering how hard Intel is pushing Thunderbolt, it makes little sense how hard they're gimping their consumer and workstation CPUs when it comes to PCIe support.
Intel X99 mainboards (socket 2011-3) ranged from 200-600 US dollars. AMD's offerings typically go for less than Intel's, but Threadripper has a gigantic 4094-pin LGA socket, so I guess this will negate the typical AMD price advantage on motherboards and we will see prices in the 200-600 dollar range as well.
EDIT - and since Threadripper will have 20 more PCIe lanes than Intel's i9 series, motherboard makers will hopefully take advantage of this to offer more IO, which probably also increases the price.
I don't fully understand. So one SSD needs one lane? How many lanes does something like an Nvidia 1080 Ti need? What use are the remaining lanes? If I want to build something like a 1 CPU / 1 GPU / 1 SSD configuration without additional peripherals, will 62 lanes be effectively wasted?
This is for high-end workstations, after all. Many people are struggling with video production. New cameras this year output 8K video raw (or slightly compressed). Even the guys at RED couldn't produce hardware to properly feed the stream into a computer in real time, or at least without having to wait.
These comments baffle me. Doesn't the majority of HN work in the industry? Clearly the driving factor regarding IO at this point is machine learning and networking for cloud applications, not gaming.
Desktop/workstation (e.g. gaming) performance hasn't been driving the industry for easily 5 years now. Intel and AMD both simply use their high performance desktop offerings as proofs for their server solutions, which typically come a few months later.
"CUDA" is not a particular application. You could very well be doing something with CUDA that does extremely little data transfer to/from the host, but a lot of computation on the GPU. Like mining :D
Also, host to device can be bottlenecked by memory speed on either side.
CUDA meaning GPGPU. All I was pointing out is that there is definitely a bandwidth doubling going from 8 to 16 lanes. It's used extremely often by people using the cards for compute. Mining may not have that issue, but deep learning and other applications do.
And even for games, the only reason it's not a difference is because game developers target less bandwidth between CPU and GPU so they don't use it all.
If we move to a future where every GPU has a lot more bandwidth then games would most likely start taking advantage of it.
My current “gaming machine” is still using PCIe 1.0 and it doesn't look like it's proving much at all of a bottleneck, because why would it? The game loads most things into memory once and then the rest of the time it's just sending draw calls and maybe geometry updates.
As already said, funding a company is not fair competition. If governments (US, EU, etc.) cared about competition in this market (or any other, tbh) they could have simply enforced the fair-market regulations. Instead they chose to ignore the unfair practices of Intel during the Opteron days. In a similar way, if they cared about fair competition and protecting consumers they wouldn't have let Microsoft enforce their Windows tax on everybody.
Some years ago I read a moderately well fleshed out proposal to create a European semiconductor consortium as a sort of an Airbus for CPUs. My recollection is that, while it was somewhat flawed, it was an idea with significant merit.
This is coming from an area where Microsoft has a vicegrip on all government software to the tune of 99% of the market, and then runs smear campaigns and tries to bribe politicians in that last 1% to adopt them.
Most politicians are technologically illiterate, and since those with the most money get closest to them, it's going to be big corporate interests that get their ear.
These large open source code bases represent huge value stores for modern civil societies. We're all better off because they exist and we'd do well to promote them.
IANAL, but I don't think this would be legal under WTO rules. It would be seen as an unfair advantage to AMD, and Intel could complain to the WTO about it.
Something similar happened recently when Brazil accused the Province of Quebec of funding Bombardier (which competes with Embraer). [0]
But would they have the same incentive to take risks? I hear you -- I want them to do well too, and keep making cool things, but giving them too much help may kill the golden eggs. Or something.
We are (or were) rapidly approaching a point where Intel would be the only PC, laptop, and server processor worth considering. If that happened, their only competition would have been themselves, the remnants of AMD, and start-ups who would be behind by hundreds of billions of dollars of infrastructure.
In that situation, would they be likely to keep developing new processors, making them more efficient and more powerful, investing heavily into R&D, and aggressively pricing their products? Or would they give themselves a bonus, their stockholders dividends, and rest on their laurels while taking in the cash for a product that has become essential to the global economy and really the future of humanity?
Or were you speaking of AMD being the good that is working hard, taking big risks, and laying golden eggs because it is fighting an uphill battle?
If so, consider that risks are risks because they don't always work out. Sometimes they fail. And while that works if you are an angel investor risking a few tens of thousands on dozens of start-ups, where any one success can make up for the loss of all the others, it is not as smart when you have a single company you're risking and success means you just get to keep playing a little longer. And having a handful of processor companies takes more than a few tens of thousands each.
> all 64 lanes being enabled for all ThreadRipper SKUs. This will be broken up into 60+4: 60 lanes directly from the CPU for feeding PCIe and M.2 slots, and then another 4 lanes going to the chipset
And
> the 16 core processor will for most purposes be half of an Epyc processor
Epyc (formerly Naples) is the server-grade arch with 8 memory channels, 32 cores and 128 PCIe lanes per socket.
So AMD have done something really clever: it is not one big die but rather two smaller (Ryzen 7) dies linked together.
The issue with yields is that the larger and more complex your die, the more failures you have. By breaking it down into two smaller dies that are combined, they reduce this, since they are already getting really good yields on the Ryzen 7 dies.
Epyc is just four Ryzen 7 dies. So no, these are not worse bins; the 16-core Threadripper is two Ryzen 7 dies.
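The yield argument can be put in rough numbers with the classic Poisson defect model (the defect density and die areas below are made-up illustrative values, not AMD's actual figures):

```python
# Toy Poisson defect-yield model: probability a die has zero defects
# falls off exponentially with die area.
import math

def die_yield(defects_per_cm2, area_cm2):
    """Fraction of dies of the given area expected to be defect-free."""
    return math.exp(-defects_per_cm2 * area_cm2)

D = 0.5                        # assumed defects per cm^2
small = die_yield(D, 2.0)      # a ~2 cm^2 Ryzen-class die
big   = die_yield(D, 4.0)      # a monolithic die of twice the area

print(f"small die yield: {small:.2f}")   # ~0.37
print(f"big die yield:   {big:.2f}")     # ~0.14
```

Since any two good small dies can be paired onto one package, usable product yield tracks the small-die figure rather than the much worse monolithic one, which is exactly the advantage being described.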
Maybe even better bins if the rumors are true, which put the 1998X at 3.5 GHz for 16 cores (with a single-core boost to 3.9 GHz).
But I believe it only when I see it. Especially since the rumors said Threadripper will only have 44 PCIe lanes like Intel, which as we now know was false (and was weird previously because Naples has 128 PCIe lanes).
Ah, so the high-end ThreadRipper would contain the same chips as a high-end Epyc, just twice as many, and likewise for the lower-end variants, I guess? That makes sense.
Intel lowered their prices with the release of the 18 core for 2000$. Before AMD became competitive again you had to pay 1700$ for only 10 cores or go for a 3000$ or 4000$ 18 core Xeon.
EDIT - of course there are also Xeons between 10 and 18 cores, but it should show that you had to pay much more to get 18 cores, or only slightly less for far fewer cores.
I think AMD have been really clever in how they have built the Zen lineup: unlike Intel, these higher-end CPUs are basically just multiple lower-end ones put together. This means they don't need different fabrication (they just build Ryzen 7 dies), and given they have such good yields on those chips, they can bring the price down. I would expect this chip to cost at most 2x a Ryzen 1800.
However, it is not just the chip cost for these systems; it's also the mainboard. If AMD can get the motherboards to be cheaper, that will make a big difference in the overall price.
These aren't your average consumer/gamer/enthusiast CPUs though. For what it is, the price doesn't look insane at all. And I'm concerned that AMD is not going to be much of a winner here because most people looking at these top-of-the-line workhorse CPUs are probably going to pick the better performer over the cheaper part -- they aren't exactly doing budget shopping.
Basically Intel has been greedy / hasn't had the competition. The result of Moore's law is still quite possible it seems. Even with a slowdown in chip density; computing prices can fall on similar historical curves now that AMD is competing.
Threadripper & EPYC are a truly epic move to jump into the dense GPU workstation and server market! With a Threadripper you get four x16 GPUs without PLX or three + a couple of SSDs.
Time for @nvidia to switch the DGX boxes to @AMD CPUs? :)
Yeah, depending on the configuration it would maybe even be enough for 7 GPUs at x8 PCIe 3.0 each (60 of the 64 lanes feed PCIe and M.2 slots, 4 go to the chipset). Seems like it will be a nice CPU for machine learning.
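The lane budget works out exactly if you give each GPU x8 (the 60/4 split is from the article; treating the leftover as one x4 NVMe slot is an assumption):

```python
# Lane-budget sketch for Threadripper's 60 CPU-attached PCIe lanes.
CPU_LANES = 60            # 64 total, minus 4 reserved for the chipset link

gpu_lanes  = 7 * 8        # seven GPUs at x8 each
nvme_lanes = 1 * 4        # one x4 NVMe SSD in the remainder
used = gpu_lanes + nvme_lanes

print(f"used {used} of {CPU_LANES} lanes, {CPU_LANES - used} spare")
```

So seven x8 GPUs consume 56 lanes and the last 4 fit one NVMe drive, with nothing to spare; a more conventional build would be four GPUs at x8 plus several SSDs.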
Other sites also report that it will support 2 TByte of RAM like the single socket Naples/Epyc, but LRDIMM and official/validated ECC support was not mentioned in the stream.
Intel already supports a significant number of PCIe 3.0 lanes with their Xeon E5 chips. If I recall correctly, Supermicro sells a dual-socket Xeon board with ten PCIe 3.0 x8 slots.
Yes, massively. Deep learning is all about learning from data. With the work I have done, I have seen that currently, unless you buy a dual-Xeon server, you can't get enough data through to justify having 4 GPUs in one machine. So it's commonly cheaper to break out to multiple machines, but this then has an impact on how you code your training, since communication between the GPUs is much slower.
Think of training a NN to learn cats' expressions and the meme text that goes along with pictures of cats. So 1) you collect as many cat memes as you can (easy... 40k memes of cats later) and you start training... You need to normalise all the images, extract and normalise the meme text, and feed all of this through TensorFlow (or another system) to train it. This means pushing your small data sample of 40k images through the GPU and back out, possibly many times; there is no way they will all fit on your GPU, and even if they did, you need to get them on there in the first place.
Does anyone remember the DEC Alpha? Windows NT 4.0 was ported to it. The Alpha was a 64-bit RISC (Reduced Instruction Set Computer) processor. ARM is just an Advanced RISC Machine. RISC architecture has an instruction-execution advantage over the CISC (Complex Instruction Set Computer) designs Intel is famous for, although Intel did have the i860 and i960 RISC chips, many of which went into embedded military systems.
Other RISC systems came from Silicon Graphics, whose MIPS processors were used for 3D graphics processing. RISC has shown that it has its place in the computing world time and time again.
Their personal appeal to me is their low power requirements and compact designs. Obviously cell phone manufacturers like them too, for the same reasons.
Yes, I actually managed a SQL Server box running on a DEC Alpha machine. I don't remember anything specifically wonderful about the DEC other than that it was one of the biggest servers I had seen at the time and it had a large ducted fan in the front for cooling. Neato!
No, there was also no word on official/validated ECC support, only that Threadripper will launch this summer. But people hope for something between 999$ (since it is two Ryzen R7) and 1499$ for the 16 core CPU. 999$ would be awesome, but 1499$ sounds much more realistic, maybe even more, since it has a much larger socket, high end platform bonus etc.
> there was also no word on official/validated ECC support
Given that all the other Zen CPUs support it, I think that is a given.
> 999$ (since it is two Ryzen R7) and 1499$
I think it will be more around the 999$ mark, but the motherboard will be quite expensive; it is, after all, basically a dual-socket-class mainboard. Don't expect to pick up a board for under 500$.
I think that in the end this will be down to the mainboard makers to get validation, so it possibly will not be validated on non-server-style boards. But if there is demand, given Epyc is basically the same thing just bigger, I'm sure they will do it.