micro-optimization
Divide by 10 using bit shifts?
Editor’s note: this is not actually what compilers do, and gives the wrong answer for large positive integers ending with 9, starting with div10(1073741829) = 107374183 not 107374182. It is exact for smaller inputs, though, which may be sufficient for some uses. Compilers (including MSVC) do use fixed-point multiplicative inverses for constant divisors, but they …
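The excerpt does not show the routine it criticizes, so the following is only a sketch. It assumes the routine is the familiar "multiply by 0x1999999A, shift right by 32" idiom. The second function, udiv10_exact, is an addition for contrast and does not come from the excerpt: it uses the wider constant 0xCCCCCCCD with a shift of 35, which is exact for every unsigned 32-bit input.

```c
#include <stdint.h>

/* Assumed form of the routine the note refers to (not shown in the excerpt):
   multiply by ceil(2^32 / 10) = 0x1999999A and keep the high 32 bits.
   Intended for non-negative inputs only; right-shifting a negative
   value is implementation-defined and rounds the wrong way here. */
static int32_t div10(int32_t dividend) {
    int64_t inv_divisor = 0x1999999A;
    return (int32_t)((inv_divisor * dividend) >> 32);
}

/* Hypothetical helper for comparison: ceil(2^35 / 10) = 0xCCCCCCCD.
   The extra 3 bits of precision keep the rounding error below 1/10
   for all 32-bit unsigned inputs, so the result matches x / 10. */
static uint32_t udiv10_exact(uint32_t x) {
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

With these definitions, div10(1073741819) still returns 107374181, but div10(1073741829) returns 107374183, the off-by-one result quoted in the note. udiv10_exact(1073741829) returns 107374182.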
Floating point division vs floating point multiplication
Yes, many CPUs can perform multiplication in 1 or 2 clock cycles but division always takes longer (although FP division is sometimes faster than integer division). If you look at this answer you will see that division can exceed 24 cycles. Why does division take so much longer than multiplication? If you remember back to …
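One common way to act on this cost difference is to compute the reciprocal once and multiply inside the loop. The sketch below is illustrative only, and the function names are hypothetical. Note that x * (1.0 / d) is not always bit-identical to x / d, which is why compilers will not make this substitution for you unless you relax floating-point rules (for example with -ffast-math).

```c
#include <stddef.h>

/* Baseline: one floating-point division per element. */
static void scale_by_division(double *v, size_t n, double d) {
    for (size_t i = 0; i < n; i++) {
        v[i] /= d;
    }
}

/* One division up front, then only multiplications in the loop.
   Results may differ from the baseline in the last bit unless
   1.0 / d is exactly representable (e.g. d is a power of two). */
static void scale_by_reciprocal(double *v, size_t n, double d) {
    const double inv = 1.0 / d;
    for (size_t i = 0; i < n; i++) {
        v[i] *= inv;
    }
}
```

Whether the rounding difference is acceptable depends on the application, so this is a trade-off to choose deliberately.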
Why are loops always compiled into “do…while” style (tail jump)?
Related: asm loop basics: While, Do While, For loops in Assembly Language (emu8086). Fewer instructions / uops inside the loop = better. Structuring the code outside the loop to achieve this is very often a good idea. Sometimes this requires “loop rotation” (peeling part of the first iteration so the actual loop body has the …
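The transformation can be sketched at the C level. This is a rough source-level analogue of what a compiler typically does, not actual compiler output, and the function names are hypothetical. The loop condition is checked once before entry and then moved to the bottom, so each iteration executes a single conditional branch instead of a top-of-loop test plus an unconditional jump back.

```c
/* Source form: the condition is tested at the top of every iteration. */
static int sum_while(const int *a, int n) {
    int s = 0;
    int i = 0;
    while (i < n) {
        s += a[i];
        i++;
    }
    return s;
}

/* Rotated form: one guard check up front, then a do...while
   whose only branch is the conditional jump at the bottom. */
static int sum_rotated(const int *a, int n) {
    int s = 0;
    int i = 0;
    if (i < n) {
        do {
            s += a[i];
            i++;
        } while (i < n);
    }
    return s;
}
```

Both functions return the same result for every n, including n = 0, where the guard skips the loop entirely.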