Does it make any sense to use the LFENCE instruction on x86/x86_64 processors?

Bottom line (TL;DR): LFENCE alone indeed seems useless for memory ordering, however it does not make SFENCE a substitute for MFENCE. The “arithmetic” logic in the question is not applicable. Here is an excerpt from Intel’s Software Developers Manual, volume 3, section 8.2.2 (the edition 325384-052US of September 2014), the same that I used in … Read more

How do I Understand Read Memory Barriers and Volatile

There are read barriers and write barriers; acquire barriers and release barriers. And more (io vs memory, etc). The barriers are not there to control “latest” value or “freshness” of the values. They are there to control the relative ordering of memory accesses. Write barriers control the order of writes. Because writes to memory are … Read more

How does a mutex lock and unlock functions prevents CPU reordering?

The short answer is that the body of the pthread_mutex_lock and pthread_mutex_unlock calls will include the necessary platform-specific memory barriers which will prevent the CPU from moving memory accesses within the critical section outside of it. The instruction flow will move from the calling code into the lock and unlock functions via a call instruction, … Read more

Which is a better write barrier on x86: lock+addl or xchgl?

Quoting from the IA32 manuals (Vol 3A, Chapter 8.2: Memory Ordering): In a single-processor system for memory regions defined as write-back cacheable, the memory-ordering model respects the following principles [..] Reads are not reordered with other reads Writes are not reordered with older reads Writes to memory are not reordered with other writes, with the … Read more