
I just wanted to clarify something about flushing caches: fences do not flush the caches in any way. Inside the CPU there is a data structure called the load store queue. It keeps track of pending loads and stores, of which there can be many. This lets the processor run ahead and request data from the caches, or request that lines be brought into the caches, without having to stop dead the moment it has to wait on any one access. Memory fencing influences how entries in the load store queue are allowed to provide values to the rest of the CPU's execution units. On weakly ordered processors like ARM, the load store queue is allowed to forward values to the execution pipelines as soon as they are available from the caches, except when a store and a load are to the same address. x86 only allows values to go from loads to the pipeline in program order. It can start operations early, but if it detects that a store comes in for a load that's not the oldest, it has to throw away the work done based on the speculated load.

Stores are a little special in that the CPU can declare a store complete without actually writing the data to the cache system. So stores go into a store buffer while the target cache line is still being acquired. Loads have to check the store buffer. On x86 the store buffer releases values to the cache in order, while on ARM the store buffer can drain in any order. However, both architectures allow a load to read a value from the store buffer before it is in the cache, bypassing the normal load queue ordering. They also allow a load to complete before an earlier store to a different address. So on x86 a store followed by a load can execute as the load first, then the store.
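To make the store buffer behaviour concrete, here is a toy Python model. It is only a sketch: `Core`, `drain_one` and `store_buffer_litmus` are names made up for this illustration, the shared dict stands in for the coherent cache system, and the interleaving is fixed by hand rather than produced by real hardware.

```python
from collections import deque


class Core:
    """Toy core with a FIFO store buffer in front of a shared memory dict."""

    def __init__(self, memory):
        self.memory = memory          # stands in for the coherent cache system
        self.store_buffer = deque()   # pending (address, value) pairs, oldest first

    def store(self, addr, value):
        # The store is "complete" from this core's point of view right away,
        # but other cores cannot see it until it drains to memory.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # Store-to-load forwarding: the youngest buffered store to this
        # address wins over whatever is in memory.
        for buffered_addr, value in reversed(self.store_buffer):
            if buffered_addr == addr:
                return value
        return self.memory.get(addr, 0)

    def drain_one(self):
        # x86-like behaviour: the oldest store leaves the buffer first.
        if self.store_buffer:
            addr, value = self.store_buffer.popleft()
            self.memory[addr] = value


def store_buffer_litmus():
    """Each core stores to its own location, then loads the other's."""
    memory = {}
    core0, core1 = Core(memory), Core(memory)
    core0.store("x", 1)
    core1.store("y", 1)
    r0 = core0.load("y")    # core1's store is still buffered
    r1 = core1.load("x")    # core0's store is still buffered
    own = core0.load("x")   # forwarded from core0's own store buffer
    return r0, r1, own


print(store_buffer_litmus())  # (0, 0, 1)
```

Both cores read 0 for the other core's location even though each store comes first in program order, which is the store followed by a load reordering described above. Core 0 still sees its own store because the load is satisfied from the store buffer.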

Fences logically force the store buffer to flush and the load queue to resolve values from the cache. So every store before the fence is in the caching subsystem, where standard coherency ensures it is visible when requested. New operations then start filling the load store queue, but they are known to be later than the operations before the fence.
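A rough way to picture the fence is as a "drain the store buffer before going any further" step. The self-contained toy model below sketches that idea in Python. `FencedCore` and `run` are hypothetical names for this illustration, the dict stands in for the coherent cache system, and only one hand-picked interleaving is shown, so it says nothing about real hardware timing.

```python
from collections import deque


class FencedCore:
    """Toy core: buffered stores plus a fence that drains them."""

    def __init__(self, memory):
        self.memory = memory      # stands in for the coherent cache system
        self.pending = deque()    # buffered (address, value) stores, oldest first

    def store(self, addr, value):
        # Buffered only; invisible to other cores until drained.
        self.pending.append((addr, value))

    def fence(self):
        # Logically flush the store buffer: every earlier store reaches
        # shared memory before any later operation runs.
        while self.pending:
            addr, value = self.pending.popleft()
            self.memory[addr] = value

    def load(self, addr):
        # Check this core's own buffered stores first, then shared memory.
        for buffered_addr, value in reversed(self.pending):
            if buffered_addr == addr:
                return value
        return self.memory.get(addr, 0)


def run(use_fence):
    """Store to one location, optionally fence, then load the other location."""
    memory = {}
    core0, core1 = FencedCore(memory), FencedCore(memory)
    core0.store("x", 1)
    if use_fence:
        core0.fence()
    core1.store("y", 1)
    if use_fence:
        core1.fence()
    return core0.load("y"), core1.load("x")


print(run(use_fence=False))  # (0, 0)
print(run(use_fence=True))   # (1, 1)
```

Without the fence both loads miss the other core's buffered store. With the fence, each store is in shared memory before the following load runs, so in this interleaving both loads see 1. Note that nothing resembling a cache flush happens in either case; only the store buffer is drained.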



That clarifies fences a little bit more for me. Thanks for the insight.



