stdatomic – Make Me Engineer

Why does a std::atomic store with sequential consistency use XCHG?

June 25, 2023 by Tarik

mov-store + mfence and xchg are both valid ways to implement a sequential-consistency store on x86. The implicit lock prefix on an xchg with memory makes it a full memory barrier, like all atomic RMW operations on x86. (x86’s memory-ordering rules essentially make that full-barrier effect the only option for any atomic RMW: it’s both … Read more
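
As a minimal sketch of the two store flavours being compared (the variable and function names are hypothetical, chosen only for illustration): on x86-64, mainstream compilers typically compile the first function to `xchg` or to `mov` + `mfence`, and the second to a plain `mov`. The exact instructions depend on the compiler and its version.

```cpp
#include <atomic>

std::atomic<int> shared_value{0};  // hypothetical variable, for illustration only

// Sequentially consistent store (also the default ordering for store()).
// This is the store that needs a full barrier on x86.
void store_seq_cst(int v) {
    shared_value.store(v, std::memory_order_seq_cst);
}

// Release store: x86's hardware ordering already provides this,
// so no full barrier is required.
void store_release(int v) {
    shared_value.store(v, std::memory_order_release);
}
```

Comparing the generated assembly of the two functions (for example with `-O2 -S`) is a quick way to see the difference on a given compiler.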

Do I have to use atomic for “exit” bool variable?

June 13, 2023 by Tarik

Yes. Either use atomic<bool>, or use manual synchronization through (for instance) a std::mutex. Your program currently contains a data race, with one thread potentially reading a variable while another thread is writing it. This is Undefined Behavior. Per Paragraph 1.10/21 of the C++11 Standard: The … Read more
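
A minimal sketch of the atomic<bool> option, assuming a single worker thread that polls the flag (the flag and function names are hypothetical, not taken from the original question):

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> exit_requested{false};  // hypothetical flag name

// Starts a worker that spins until the flag is set, then sets the flag
// from the calling thread and returns how many iterations the worker ran.
unsigned long run_worker_until_exit() {
    unsigned long iterations = 0;
    std::thread worker([&iterations] {
        // The atomic load makes the concurrent read well-defined.
        while (!exit_requested.load()) {
            ++iterations;
        }
    });
    exit_requested.store(true);  // atomic write: no data race with the load
    worker.join();               // join() makes reading iterations safe
    return iterations;
}
```

With a plain `bool` in place of `std::atomic<bool>`, the same code would contain the data race described above.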

Why set the stop flag using `memory_order_seq_cst`, if you check it with `memory_order_relaxed`?

June 1, 2023 by Tarik

mo_relaxed is fine for both load and store of a stop flag. There’s also no meaningful latency benefit to stronger memory orders, even if latency of seeing a change to a keep_running or exit_now flag was important. IDK why Herb thinks stop.store shouldn’t be relaxed; in his talk, his slides have a comment that says … Read more
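
A rough sketch of a stop flag that uses relaxed ordering on both sides. The names `stop` and `work_done` and the function itself are illustrative assumptions; the flag carries no other data, so nothing needs to be ordered relative to it.

```cpp
#include <atomic>
#include <thread>

// Runs a worker until it is told to stop; returns the amount of work done.
unsigned long run_until_stopped() {
    std::atomic<bool> stop{false};
    std::atomic<unsigned long> work_done{0};

    std::thread worker([&] {
        // Relaxed load: only the flag's own value matters here.
        while (!stop.load(std::memory_order_relaxed)) {
            work_done.fetch_add(1, std::memory_order_relaxed);
        }
    });

    // Wait until the worker has made some progress before stopping it.
    while (work_done.load(std::memory_order_relaxed) == 0) {
        std::this_thread::yield();
    }

    stop.store(true, std::memory_order_relaxed);  // relaxed store of the flag
    worker.join();
    return work_done.load(std::memory_order_relaxed);
}
```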

Acquire/Release versus Sequentially Consistent memory order

May 25, 2023 by Tarik

The C++11 memory ordering parameters for atomic operations specify constraints on the ordering. If you do a store with std::memory_order_release, and a load from another thread reads the value with std::memory_order_acquire, then subsequent read operations from the second thread will see any values stored to any memory location by the first thread that were prior … Read more
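
The release/acquire pairing described here can be sketched as a small message-passing example. The names `payload` and `ready` are hypothetical; the point is that the non-atomic write before the release store is visible after the acquire load observes it.

```cpp
#include <atomic>
#include <thread>

// Returns the payload value as seen by the consumer thread.
int message_passing_demo() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                   // written before the release store
        ready.store(true, std::memory_order_release);   // publishes the write above
    });

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            // spin until the release store is observed
        }
        observed = payload;  // guaranteed to read 42
    });

    producer.join();
    consumer.join();
    return observed;
}
```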

Acquire/release semantics with 4 threads

May 14, 2023 by Tarik

You are thinking in terms of sequential consistency, the strongest (and default) memory order. If this memory order is used, all accesses to atomic variables constitute a total order, and the assertion indeed cannot be triggered. However, in this program, a weaker memory order is used (release stores and acquire loads). This means, by definition … Read more
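
The program from the question is not reproduced in this excerpt, so the following is only a sketch, assuming it resembles the classic four-thread test: two writers and two readers that check the flags in opposite order. The memory orders are passed in as parameters so both variants can be tried.

```cpp
#include <atomic>
#include <thread>

// Returns how many reader threads saw both flags set.
// With seq_cst for every operation the result cannot be 0.
// With release stores and acquire loads, 0 is permitted by the standard,
// although some hardware (x86, for example) will not produce it.
int run_four_threads(std::memory_order store_mo, std::memory_order load_mo) {
    std::atomic<bool> x{false}, y{false};
    std::atomic<int> z{0};

    std::thread write_x([&] { x.store(true, store_mo); });
    std::thread write_y([&] { y.store(true, store_mo); });
    std::thread read_x_then_y([&] {
        while (!x.load(load_mo)) {}
        if (y.load(load_mo)) ++z;
    });
    std::thread read_y_then_x([&] {
        while (!y.load(load_mo)) {}
        if (x.load(load_mo)) ++z;
    });

    write_x.join();
    write_y.join();
    read_x_then_y.join();
    read_y_then_x.join();
    return z.load();
}
```

Calling it with `std::memory_order_release` and `std::memory_order_acquire` gives the weaker variant, where the two readers are allowed to disagree about the order of the two stores.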

C++11: the difference between memory_order_relaxed and memory_order_consume

April 28, 2023 by Tarik

Question 1: No. memory_order_relaxed imposes no memory order at all: “Relaxed operation: there are no synchronization or ordering constraints, only atomicity is required of this operation.” memory_order_consume, by contrast, imposes memory ordering on data-dependent reads (on the current thread): A load operation with this memory order performs a consume operation on the affected memory location: … Read more
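
To illustrate the "only atomicity is required" part, here is a small sketch of a shared counter incremented with relaxed operations. The function name and parameters are hypothetical. No increments are lost, even though no ordering with respect to other memory is imposed.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Increments a shared counter from several threads using relaxed RMWs.
long count_with_relaxed(int threads, int per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;

    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic, but with no synchronization or ordering constraints.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) {
        th.join();  // join() provides the ordering needed for the final read
    }
    return counter.load(std::memory_order_relaxed);
}
```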

Memory order consume usage in C11

April 25, 2023 by Tarik

consume is cheaper than acquire. All CPUs (except DEC Alpha AXP’s famously weak memory model) do it for free, unlike acquire. (Except on x86 and SPARC-TSO, where the hardware has acq/rel memory ordering without extra barriers or special instructions.) On weakly-ordered ISAs such as ARM, AArch64, PowerPC, and MIPS, consume and relaxed are the only orderings that don’t require any extra … Read more
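
A sketch of the pointer-publication pattern that consume was designed for, written in C++ rather than C11 and using hypothetical names (`Node`, `published`). Note the assumptions: current mainstream compilers are generally understood to treat consume as acquire, and newer C++ standards move toward deprecating it, so this shows the intended usage rather than a guaranteed performance win.

```cpp
#include <atomic>
#include <thread>

struct Node { int value; };  // hypothetical payload type

// Returns the value the reader saw through the published pointer.
int consume_demo() {
    Node node{0};
    std::atomic<Node*> published{nullptr};
    int seen = -1;

    std::thread writer([&] {
        node.value = 7;                                      // initialise the payload
        published.store(&node, std::memory_order_release);   // then publish the pointer
    });

    std::thread reader([&] {
        Node* p = nullptr;
        while (!(p = published.load(std::memory_order_consume))) {
            // spin until a pointer is published
        }
        seen = p->value;  // data-dependent on the loaded pointer, so ordered after it
    });

    writer.join();
    reader.join();
    return seen;
}
```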

For purposes of ordering, is atomic read-modify-write one operation or two?

April 22, 2023 by Tarik

Not an answer at the level of the language standard, but some evidence that in practice, the answer can be “two”. And as I guessed in the question, this can happen even if the RMW is seq_cst. I haven’t been able to observe stores being reordered as in the original question, but here is an … Read more

C++ How is release-and-acquire achieved on x86 only using MOV?

November 1, 2022 by Tarik

Implementing 64 bit atomic counter with 32 bit atomics

August 8, 2022 by Tarik

This is a known pattern, called a SeqLock (https://en.wikipedia.org/wiki/Seqlock), with the simplification that there’s only one writer, so no extra support for excluding simultaneous writers is needed. You don’t need or want the increment of the counter variable itself to use atomic RMW operations. (Unless you’re on a system that can do that cheaply with … Read more
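
A minimal single-writer SeqLock sketch for a 64-bit counter built from 32-bit atomics. The type and member names are hypothetical, and the fence placement follows the commonly described C++ formulation of a seqlock; treat it as an illustration under those assumptions, not as the code from the full answer.

```cpp
#include <atomic>
#include <cstdint>

struct Counter64 {
    std::atomic<std::uint32_t> seq{0};  // even: stable, odd: write in progress
    std::atomic<std::uint32_t> lo{0};   // low half of the counter
    std::atomic<std::uint32_t> hi{0};   // high half of the counter

    // Must only ever be called from one writer thread.
    void increment() {
        std::uint32_t s = seq.load(std::memory_order_relaxed);
        seq.store(s + 1, std::memory_order_relaxed);          // mark write in progress
        std::atomic_thread_fence(std::memory_order_release);  // keep data stores after it

        // Plain load + store instead of an atomic RMW: there is only one writer.
        std::uint32_t new_lo = lo.load(std::memory_order_relaxed) + 1;
        lo.store(new_lo, std::memory_order_relaxed);
        if (new_lo == 0) {  // carry into the high half
            hi.store(hi.load(std::memory_order_relaxed) + 1, std::memory_order_relaxed);
        }

        seq.store(s + 2, std::memory_order_release);          // publish the new value
    }

    // Safe to call from any number of reader threads.
    std::uint64_t read() const {
        for (;;) {
            std::uint32_t s1 = seq.load(std::memory_order_acquire);
            std::uint32_t l = lo.load(std::memory_order_relaxed);
            std::uint32_t h = hi.load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_acquire);
            std::uint32_t s2 = seq.load(std::memory_order_relaxed);
            // Accept only if no write overlapped the two data loads.
            if (s1 == s2 && (s1 & 1u) == 0) {
                return (static_cast<std::uint64_t>(h) << 32) | l;
            }
        }
    }
};
```

Readers retry whenever the sequence number is odd or changes between the two checks, so they never return a torn mix of an old low half and a new high half.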