Not a CPU designer, but my guess is that they will move the cache management logic from the MMU to the µOP scheduler, which will commit to cache on retirement of the speculatively executed instruction. They would then need to introduce some sort of L0 cache, accessible only at the microarchitectural level, bound to a speculative flow, and flushed at retirement.
How does this work for two instructions in the pipeline at the same time that refer to the same cache line? If the second instruction executes the read phase before the first is retired/committed to cache, you would be hit by two memory fetch latencies, significantly hurting performance.
I guess compilers could pad that out with no-ops to postpone the read until the previous commit is done, if they know the design of the pipeline they are targeting. But generically optimized code would take a terrible hit from this.
Firstly, thanks for the question. As mentioned, not a CPU designer or trying to teach Intel what to do. More like relying on the hive mind to see if I have the right idea.
A second instruction in the pipeline would read from the above-mentioned L0 cache (let us call it a load buffer), much like it would read tentative memory stores from the store buffer.
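To make that concrete, here is a toy sketch (plain C, every name hypothetical, no relation to any real microarchitecture) of how such a load buffer could serve a second in-flight load to the same line without a second memory fetch, and then be discarded wholesale on a squash:

```c
/* Toy model of the hypothetical "L0 / load buffer": speculative loads land in
 * a private buffer, later speculative uOPs that hit the same line read from
 * that buffer, and the buffer is drained into the real cache only at
 * retirement -- or thrown away on a squash. Purely illustrative. */
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define L0_ENTRIES 8

struct l0_entry {
    bool     valid;
    uint64_t line_addr;   /* cache-line address of the speculative fill */
};

static struct l0_entry l0[L0_ENTRIES];

/* A speculative load: hit the L0 if a prior in-flight load already fetched
 * this line, otherwise "fetch" it into the L0 (not into the real cache). */
static bool spec_load(uint64_t line_addr)
{
    for (int i = 0; i < L0_ENTRIES; i++)
        if (l0[i].valid && l0[i].line_addr == line_addr)
            return true;                      /* L0 hit: no second fetch */
    for (int i = 0; i < L0_ENTRIES; i++)
        if (!l0[i].valid) {
            l0[i].valid = true;
            l0[i].line_addr = line_addr;      /* speculative fill */
            return false;                     /* miss: one memory fetch */
        }
    return false;                             /* buffer full: refetch */
}

/* On retirement the fills would be committed to the real cache; on a
 * mis-speculation squash they are simply discarded, so no architecturally
 * visible cache state changes. */
static void squash(void)
{
    for (int i = 0; i < L0_ENTRIES; i++)
        l0[i].valid = false;
}

int main(void)
{
    /* Two in-flight loads to the same line: the second one hits the L0,
     * so it does not pay a second full memory latency. */
    printf("first load L0 hit?  %d\n", spec_load(0x100));  /* 0: miss */
    printf("second load L0 hit? %d\n", spec_load(0x100));  /* 1: served by L0 */
    squash();                                              /* mis-speculation */
    printf("after squash:       %d\n", spec_load(0x100));  /* 0: state discarded */
    return 0;
}
```

(A real implementation would live next to the miss-handling machinery and track far more state; this only illustrates the hit-then-squash behaviour being discussed.)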
Also, two memory fetches in parallel do not take twice as long as a single fetch, if that were the solution (which I guess it would not be, as I imagine race conditions appearing).
I don't think you can allow two speculatively executing instructions to read from the same L0 cache.
For example, say the memory address you want to check for being cached is either 0x100 or 0x200 (not realistic addresses, but they work for the example), depending on some kernel memory bit. Then run instructions in userspace that try to fetch 0x100 (with flushes in between). If you notice one that completes quickly, it must have used the value at 0x100 cached in the L0 by the kernel? (And also probe 0x200, to check when that one is cached in the L0.)
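For reference, the measurement itself is the classic flush+reload-style probe. A hedged user-space sketch (x86 with GCC/Clang intrinsics; the buffers stand in for the hypothetical 0x100/0x200 lines, and the "kernel touch" is simulated locally) might look like this:

```c
/* Sketch of the timing probe described above: flush both candidate lines,
 * let "someone" touch one of them, then time a reload of each. The faster
 * reload reveals which line got cached. Demonstrates the measurement only,
 * not an exploit. */
#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>   /* _mm_clflush, _mm_mfence, __rdtscp */

static uint64_t time_access(volatile uint8_t *p)
{
    unsigned int aux;
    uint64_t start = __rdtscp(&aux);  /* waits for earlier instructions */
    (void)*p;                         /* the probed load */
    uint64_t end = __rdtscp(&aux);
    return end - start;
}

int main(void)
{
    static uint8_t probe_a[4096], probe_b[4096];  /* stand-ins for 0x100 / 0x200 */

    /* Flush both candidate lines so neither starts out cached. */
    _mm_clflush(&probe_a[0]);
    _mm_clflush(&probe_b[0]);
    _mm_mfence();

    /* Pretend "the kernel" (here: just this process) touched one of them. */
    (void)*(volatile uint8_t *)&probe_a[0];
    _mm_mfence();

    /* A noticeably faster reload tells you which line was brought into the cache. */
    printf("probe_a reload: %llu cycles\n", (unsigned long long)time_access(&probe_a[0]));
    printf("probe_b reload: %llu cycles\n", (unsigned long long)time_access(&probe_b[0]));
    return 0;
}
```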
L0 is only used by speculatively executed uOPs, before they are actually committed. Therefore anything that reads from L0 has to be speculatively executed too.
So if the uOP that populated the L0 was reading from kernel memory, it won't be committed, and any subsequent uOP that reads from the L0 won't be committed either. So you can't get timing information from them.
Disclosure: I work at Microsoft; I have worked for Microsoft before, quit, and come back. I love the company, but I am no zealot (I tried to standardize a company on Macs during my six years away, because it made sense). I never worked on Windows Phone, but I know the company and the tech well.
Here is where we fucked up:
1. We were, for a long time, a company where every product/business group had to pay for its own right to exist. Everyone had their own P&L, contribution-margin targets, and marketing. You had to make money by yourself to stay alive. KT made sure we all understood this.
2. We had a history of "fast follower" successes - Windows, Word, Windows Server, SQL Server, Exchange, IE, even Intune nowadays, and many, many others got successful not by disrupting the current market leader or by hardcore innovation, but by leveraging either an open or standard platform and always getting better, without trying to rewrite the rules of the game. OK, maybe Office rewrote them when it came out, but that was packaging.
3. Ballmer (whom I love as a leader) got trolled by Apple's and Google's success, and by Microsoft graduating from not really cool to quite uncool. So he decided to tackle them the way it had worked before (point 2). Simultaneously, he tried to correct point 1, but, as radical as his 2014 reorganization to break org barriers was, he did not get rid of KT (Kevin Turner). KT brought in the money, KT defined the culture. Everyone had to keep making their own money.
We could have:
Offered the mobile OS for free from day one.
Given Office on Mobile for free from day one.
Bought or OEMed Xamarin a lot sooner.
Returned 100% of app revenue to app devs who sell through the Windows Store.
Made dev tools (Studio CE) free earlier.
Guaranteed no data collection (remember the Scroogled campaign…?)
All of those have either been done or are irrelevant now, while the stock is still at a record high, after we lost the game... We could have done all of the above and fared better than we have, and we have fared well.
Instead, we comp hardware sellers on MARGIN, as if it makes a bloody difference. We monetize the post-install experience. All bullshit for pennies. Everyone had to make money on their own, so we missed the bigger picture.
Satya fixed this, and it hurt, as it was the only way left to go. I gave up on a phone I really liked, as I saw no future.
I don't know if I should hope for us bringing new phones out, but I sure hope we never again let our Operating Mechanisms kill our ability to see the big picture and disrupt the market.
> We could have: Offered the mobile OS for free from day one. Given Office on Mobile for free from day one. Bought or OEMed Xamarin a lot sooner. [...] Made dev tools (Studio CE) free earlier.
Quoted for emphasis. It's remarkable in hindsight that Microsoft didn't try to leverage its own already successful products, with their inherent network effects, to buy market share.
This is not a SKU to play with. If $20/hr is indeed the price (I don't know), that's the hourly cost of a couple of waiters. You get to run SAP on someone else's infra, and someone to support it.