128-bit integer on CUDA?

For best performance, one would want to map the 128-bit type on top of a suitable CUDA vector type, such as uint4, and implement the functionality using PTX inline assembly. The addition would look something like this:

    typedef uint4 my_uint128_t;
    __device__ my_uint128_t add_uint128 (my_uint128_t addend, my_uint128_t augend)
    {
        my_uint128_t res;
        asm ("add.cc.u32 %0, %4, %8;\n\t" …
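To make the carry chain behind that approach concrete, here is a minimal host-side sketch in portable C++ rather than PTX. It is a reference model, not the excerpt's device code: `Limbs128`, `add_limb`, and `add_limbs128` are hypothetical names, and the limb order (`x` least significant, `w` most significant) is an assumption about how the value is laid out in the vector type.

```cpp
#include <cstdint>

// Hypothetical stand-in for a four-component 32-bit vector type such as uint4.
// Assumption: x holds the least significant 32 bits, w the most significant.
struct Limbs128 {
    std::uint32_t x, y, z, w;
};

// Adds one pair of limbs plus the incoming carry (0 or 1).
// The carry out of bit 31 is written back for the next, more significant limb,
// which is the job an add-with-carry instruction performs in hardware.
inline std::uint32_t add_limb(std::uint32_t a, std::uint32_t b, std::uint32_t& carry) {
    std::uint64_t sum = static_cast<std::uint64_t>(a) + b + carry;
    carry = static_cast<std::uint32_t>(sum >> 32);
    return static_cast<std::uint32_t>(sum);
}

// 128-bit addition modulo 2^128, processed from the lowest limb to the highest.
inline Limbs128 add_limbs128(Limbs128 addend, Limbs128 augend) {
    std::uint32_t carry = 0;
    Limbs128 res;
    res.x = add_limb(addend.x, augend.x, carry);
    res.y = add_limb(addend.y, augend.y, carry);
    res.z = add_limb(addend.z, augend.z, carry);
    res.w = add_limb(addend.w, augend.w, carry);  // final carry is discarded
    return res;
}
```

A model like this can serve as a CPU-side check for a device implementation: both should agree on every input, including cases where a carry ripples through all four limbs.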

How can I add and subtract 128-bit integers in C or C++ if my compiler does not support them?

If all you need is addition and subtraction, and you already have your 128-bit values in binary form, a library might be handy but isn’t strictly necessary. This math is trivial to do yourself. I don’t know what your compiler uses for 64-bit types, so I’ll use INT64 and UINT64 for signed and unsigned 64-bit …
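As an illustration of how little code this takes, the following is a minimal sketch that stores a 128-bit value as two 64-bit halves. It uses `std::uint64_t` from `<cstdint>` for both halves instead of the INT64 and UINT64 names in the excerpt, and `U128`, `add128`, and `sub128` are hypothetical names. Keeping both halves unsigned is a deliberate choice here: unsigned arithmetic wraps modulo 2^64 without undefined behavior, and the resulting bit pattern is also the correct two's-complement result for signed values.

```cpp
#include <cstdint>

// Hypothetical 128-bit value split into two 64-bit halves.
struct U128 {
    std::uint64_t hi;  // upper 64 bits
    std::uint64_t lo;  // lower 64 bits
};

// Addition modulo 2^128.
// If the low halves overflow, the wrapped sum is smaller than either operand,
// which is how the carry into the high half is detected.
inline U128 add128(U128 a, U128 b) {
    U128 r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo ? 1u : 0u);
    return r;
}

// Subtraction modulo 2^128.
// A borrow from the high half is needed exactly when a.lo is less than b.lo.
inline U128 sub128(U128 a, U128 b) {
    U128 r;
    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - (a.lo < b.lo ? 1u : 0u);
    return r;
}
```

For example, adding 1 to a value whose low half is all ones carries into the high half, and subtracting 1 from zero yields all ones in both halves, which is -1 when interpreted as a signed 128-bit number.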