How to trap unaligned memory access?

On ARM, Linux can do the fixup for you or warn about the access. You can enable the behavior in /proc/cpu/alignment; see http://www.mjmwired.net/kernel/Documentation/arm/mem_alignment for an explanation of the different values.

0 – Do nothing (default behavior)
1 – Warning in kernel-log with PC and Memory-Address printed.
2 – Fixup error
3 – Warn and Fixup
4 – …
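
As a rough illustration of selecting one of the modes listed above, the sketch below writes a mode number to the control file. The helper name `write_alignment_mode` is hypothetical, and the sketch assumes the file accepts a plain decimal value; on a real system the write would need root privileges and a kernel that exposes /proc/cpu/alignment.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not part of any kernel or libc API): writes an
// alignment mode such as 3 ("Warn and Fixup") to the control file.
// The path is a parameter so the function can be tried on an ordinary file.
bool write_alignment_mode(int mode,
                          const std::string& path = "/proc/cpu/alignment") {
    std::ofstream out(path);
    if (!out) {
        return false;  // file missing or insufficient permissions
    }
    out << mode << '\n';
    out.flush();
    return static_cast<bool>(out);  // false if the write itself failed
}
```

Calling `write_alignment_mode(3)` as root would be roughly equivalent to `echo 3 > /proc/cpu/alignment` from a shell.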

Questions about Hinnant’s stack allocator

“I’ve been using Howard Hinnant’s stack allocator and it works like a charm, but some details of the implementation are a little unclear to me.”

Glad it’s been working for you.

“1. Why are global operators new and delete used?”

The allocate() and deallocate() member functions use ::operator new and ::operator delete respectively. Similarly, the …
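
To make the role of ::operator new and ::operator delete concrete, here is a simplified sketch of a fixed buffer that falls back to the global operators when it runs out of room. This is not Hinnant's code: the class name `arena_sketch`, its members, and the rounding policy are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <functional>
#include <new>

// Simplified, hypothetical arena: serves requests from an in-object buffer
// and falls back to the global allocation functions when the buffer is full.
template <std::size_t N>
class arena_sketch {
    alignas(std::max_align_t) char buf_[N];
    char* ptr_ = buf_;

    // Round every request up so that returned pointers stay maximally aligned.
    static std::size_t align_up(std::size_t n) {
        constexpr std::size_t a = alignof(std::max_align_t);
        return (n + a - 1) & ~(a - 1);
    }

public:
    arena_sketch() = default;
    arena_sketch(const arena_sketch&) = delete;
    arena_sketch& operator=(const arena_sketch&) = delete;

    // True if p points into (or one past the end of) the internal buffer.
    bool owns(const void* p) const {
        const char* c = static_cast<const char*>(p);
        std::less_equal<const char*> le;  // well-defined for unrelated pointers
        return le(buf_, c) && le(c, buf_ + N);
    }

    void* allocate(std::size_t n) {
        n = align_up(n);
        if (static_cast<std::size_t>(buf_ + N - ptr_) >= n) {
            char* r = ptr_;
            ptr_ += n;
            return r;
        }
        return ::operator new(n);  // buffer exhausted: use the heap
    }

    void deallocate(void* p, std::size_t n) {
        if (owns(p)) {
            // Only the most recent buffer allocation can be rolled back.
            if (static_cast<char*>(p) + align_up(n) == ptr_) {
                ptr_ = static_cast<char*>(p);
            }
        } else {
            ::operator delete(p);  // came from the heap fallback
        }
    }

    std::size_t used() const { return static_cast<std::size_t>(ptr_ - buf_); }
};
```

The global operators deal in raw, untyped bytes, which matches what an allocator's allocate() and deallocate() are expected to hand out and take back.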

When extending a padded struct, why can’t extra fields be placed in the tail padding?

Short answer (for the C++ part of the question): The Itanium ABI for C++ prohibits, for historical reasons, using the tail padding of a base subobject of POD type. Note that C++11 does not have such a prohibition. The relevant rule 3.9/2 that allows trivially-copyable types to be copied via their underlying representation explicitly excludes …
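
A small experiment can make the rule visible. The struct names below are made up for illustration, and the sizes in the comments are what one would typically expect from GCC or Clang targeting the Itanium ABI on x86-64; other ABIs may lay things out differently.

```cpp
#include <cstddef>

// POD base: an int and a char, followed by tail padding.
struct PodBase {
    int i;
    char c;
};
// The ABI does not place d in PodBase's tail padding,
// so the derived class typically grows (often to 12 bytes).
struct FromPod : PodBase {
    char d;
};

// Same members, but the user-provided constructor makes the base
// non-POD in the C++03 sense that the ABI uses for layout.
struct NonPodBase {
    NonPodBase() : i(0), c(0) {}
    int i;
    char c;
};
// Here d may be placed in the base's tail padding,
// so the derived class can stay the same size as the base (often 8 bytes).
struct FromNonPod : NonPodBase {
    char d;
};

constexpr std::size_t pod_derived_size = sizeof(FromPod);
constexpr std::size_t nonpod_derived_size = sizeof(FromNonPod);
```

Printing `pod_derived_size` and `nonpod_derived_size` on your own toolchain shows whether the tail padding was reused.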

Why is GCC pushing an extra return address on the stack?

Update: GCC 8 simplifies this, at least for normal use cases (-fomit-frame-pointer, and no alloca or C99 VLAs that require variable-size allocation), perhaps motivated by increasing usage of AVX leading to more functions wanting a 32-byte aligned local or array. The exception is main in 32-bit code, where it still does the full return-address + frame-pointer backtrace-friendly version even …
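
For context, the kind of function that makes the compiler realign the stack is one with an over-aligned local. The sketch below is an assumed minimal example (the function name is hypothetical); compiling it with something like `g++ -O2 -S` lets you inspect the prologue the compiler generates for the realignment.

```cpp
#include <cstdint>

// A local that needs 32-byte alignment, which is more than the usual
// ABI-guaranteed stack alignment, so the compiler has to realign the
// stack pointer in the prologue.
bool local_is_32_byte_aligned() {
    alignas(32) volatile unsigned char buf[32] = {};
    // volatile keeps the array from being optimized away entirely.
    return reinterpret_cast<std::uintptr_t>(&buf[0]) % 32 == 0;
}
```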

Alignment requirements for atomic x86 instructions vs. MS’s InterlockedCompareExchange documentation?

x86 does not require alignment for a lock cmpxchg instruction to be atomic. However, alignment is necessary for good performance. This should be no surprise: backward compatibility means that software written with a manual from 14 years ago will still run on today’s processors. Modern CPUs even have a performance counter specifically for split-lock detection …
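
In portable C++ the usual way to stay on the fast path is to let std::atomic provide natural alignment, and optionally pad the object to a cache line. The sketch below assumes a 64-byte cache line; `PaddedCounter`, `try_bump`, and `crosses_cache_line` are illustrative names. On x86, compare_exchange_strong is typically compiled to a lock cmpxchg.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. alignas(64) keeps the atomic from
// ever straddling two lines, which is the slow split-lock case.
struct alignas(64) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

// Increments the counter only if it currently holds `expected`.
bool try_bump(PaddedCounter& c, std::uint64_t expected) {
    return c.value.compare_exchange_strong(expected, expected + 1);
}

// True if the object [p, p + size) spans more than one cache line.
bool crosses_cache_line(const void* p, std::size_t size,
                        std::size_t line = 64) {
    const auto a = reinterpret_cast<std::uintptr_t>(p);
    return a / line != (a + size - 1) / line;
}
```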

MARS MIPS simulator’s built-in assembler aligns more than requested?

TL;DR: MARS tooltips are misleading; you need to disable auto-alignment for the rest of the section using .align 0. You can’t just under-align the next word. .align 1 does align by 2; that’s not the problem (try it between .byte or .ascii pseudo-instructions). For example, this source produces 0x00110062 as the first word of the …

Is there a GCC keyword to allow structure-reordering?

Earlier GCC versions had the -fipa-struct-reorg option to allow structure reordering in -fwhole-program + -combine mode. -fipa-struct-reorg: Perform structure reorganization optimization, which changes the layout of C-like structures in order to better utilize spatial locality. This transformation is effective for programs containing arrays of structures. Available in two compilation modes: profile-based (enabled with -fprofile-generate) or static (which …
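
Without such an option, the reordering has to be done by hand. The sketch below uses made-up struct names to show how member order changes the size of a struct, which matters most for arrays of structures; the exact sizes depend on the target ABI.

```cpp
#include <cstddef>

// Declaration order forces padding after each char so that the double
// stays aligned (commonly 24 bytes on x86-64).
struct Original {
    char tag;
    double value;
    char flag;
};

// Same members, largest first: the two chars share one padded tail
// (commonly 16 bytes on x86-64).
struct Reordered {
    double value;
    char tag;
    char flag;
};

// Bytes saved per element by the manual reordering.
constexpr std::size_t padding_saved() {
    return sizeof(Original) - sizeof(Reordered);
}
```

For an array of a million elements, every byte saved per element is a megabyte less memory to stream through the cache.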

What’s the actual effect of successful unaligned accesses on x86?

It depends on the instruction(s). For most x86 SSE load/store instructions (excluding the unaligned variants), a misaligned address causes a fault, which means it will probably crash your program or lead to lots of round trips to your exception handler (which means almost all, or all, performance is lost). The unaligned load/store variants run at double the amount of …
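
The difference between the two kinds of load can be sketched with the SSE intrinsics. This example assumes an x86 target with SSE enabled; the function name `sum4_unaligned` is hypothetical.

```cpp
#include <immintrin.h>

// Sums four consecutive floats starting at p.
// p does not need to be 16-byte aligned.
float sum4_unaligned(const float* p) {
    // _mm_loadu_ps is the unaligned variant (movups).
    // The aligned variant, _mm_load_ps (movaps), may fault if p is
    // not a multiple of 16.
    __m128 v = _mm_loadu_ps(p);
    float out[4];
    _mm_storeu_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
}
```

Passing a pointer such as `data + 1` into a 16-byte aligned float array exercises the misaligned case safely, because only the unaligned instructions are used.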