It depends on what the native hardware does.
- If the hardware is (or is like) x86 with legacy x87 math, `float` and `double` are both extended (for free) to an internal 80-bit format, so both have the same performance (except for cache footprint / memory bandwidth).
- If the hardware implements both natively, like most modern ISAs (including x86-64, where SSE2 is the default for scalar FP math), then most FPU operations usually run at the same speed for both. `double` division and `sqrt` can be slower than the `float` versions, as well as of course being significantly slower than multiply or add. (`float` being smaller can mean fewer cache misses, and with SIMD, twice as many elements per vector for loops that vectorize.)
- If the hardware implements only `double`, then `float` will be slower if conversion to/from the native `double` format isn't free as part of float-load and float-store instructions.
- If the hardware implements only `float`, then emulating `double` with it will cost even more time. In this case, `float` will be faster.
- If the hardware implements neither, both have to be emulated in software. In this case both will be slow, but `double` will be slightly slower (more load and store operations at the least).
The quote you mention is probably referring to the x86 platform, where the first case applied. But this doesn't hold true in general.
Also beware that `x * 3.3 + y` for `float x, y` will trigger promotion to `double` for both variables. This is not the hardware's fault; avoid it by writing `3.3f` to let your compiler emit efficient asm that actually keeps numbers as `float`, if that's what you want.