It is pretty well known that
static inline int is_odd_A(int x) { return x & 1; }
is more efficient than
static inline int is_odd_B(int x) { return x % 2; }
But with the optimizer on, will is_odd_B
be no different from is_odd_A
? No — with gcc-4.2 -O2
, we get, (in ARM assembly):
_is_odd_A:
and r0, r0, #1
bx lr
_is_odd_B:
mov r3, r0, lsr #31
add r0, r0, r3
and r0, r0, #1
rsb r0, r3, r0
bx lr
We see that is_odd_B
takes 3 more instructions than is_odd_A
, the main reason is because
((-1) % 2) == -1
((-1) & 1) == 1
However, all the following versions will generate the same code as is_odd_A
:
#include <stdbool.h>
static inline bool is_odd_D(int x) { return x % 2; } // note the bool
static inline int is_odd_E(int x) { return x % 2 != 0; } // note the !=
What does this mean? The optimizer is usually sophisticated enough that, for these simple stuff, the clearest code is enough to guarantee best efficiency.