Fast counting of the number of set bits in an __m128i register
Here is some code I used in an old project (there is a research paper about it). The function popcnt8 below computes the number of bits set in each byte.

SSE2-only version (based on Algorithm 3 in the book Hacker's Delight):

static const __m128i popcount_mask1 = _mm_set1_epi8(0x77);
static const __m128i popcount_mask2 = _mm_set1_epi8(0x0F);
static inline __m128i …
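Since the excerpt cuts off before the function body, here is a minimal sketch of how an SSE2-only per-byte popcount of this kind can be written. It is not the original popcnt8: the names popcnt8_sketch, popcnt128_sketch and popcnt_sketch_self_check are hypothetical, and the body is a reconstruction of the nibble-subtraction technique, assuming the 0x77 and 0x0F masks play the roles described in the comments. The horizontal sum via _mm_sad_epu8 is likewise an assumption about how one might get a single count for the whole register.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical sketch: per-byte popcount using only SSE2.
 * For each 4-bit nibble v, v - (v>>1) - (v>>2) - (v>>3) equals the
 * number of set bits in v. SSE2 has no 8-bit shift, so wider shifts
 * are used and the 0x77 mask discards bits that crossed a nibble
 * (and therefore also a byte) boundary. */
static inline __m128i popcnt8_sketch(__m128i x)
{
    const __m128i mask1 = _mm_set1_epi8(0x77);
    const __m128i mask2 = _mm_set1_epi8(0x0F);
    __m128i n;

    n = _mm_and_si128(_mm_srli_epi64(x, 1), mask1);  /* v >> 1 per nibble */
    x = _mm_sub_epi8(x, n);
    n = _mm_and_si128(_mm_srli_epi64(n, 1), mask1);  /* v >> 2 per nibble */
    x = _mm_sub_epi8(x, n);
    n = _mm_and_si128(_mm_srli_epi64(n, 1), mask1);  /* v >> 3 per nibble */
    x = _mm_sub_epi8(x, n);

    /* Each nibble now holds its own bit count (0..4); add the high
     * nibble onto the low one and keep only the low nibble (0..8). */
    x = _mm_add_epi8(x, _mm_srli_epi16(x, 4));
    return _mm_and_si128(x, mask2);
}

/* Hypothetical helper: total number of set bits in the register.
 * _mm_sad_epu8 against zero sums the 8 bytes of each 64-bit half. */
static inline int popcnt128_sketch(__m128i x)
{
    __m128i sums = _mm_sad_epu8(popcnt8_sketch(x), _mm_setzero_si128());
    return _mm_cvtsi128_si32(sums) + _mm_extract_epi16(sums, 4);
}

/* Compares both sketches with a scalar bit loop for every byte value.
 * Returns 1 if everything matches, 0 otherwise. */
static int popcnt_sketch_self_check(void)
{
    uint8_t in[16], out[16];
    for (int base = 0; base < 256; base += 16) {
        int total = 0;
        for (int i = 0; i < 16; i++)
            in[i] = (uint8_t)(base + i);
        __m128i v = _mm_loadu_si128((const __m128i *)in);
        _mm_storeu_si128((__m128i *)out, popcnt8_sketch(v));
        for (int i = 0; i < 16; i++) {
            int ref = 0;
            for (uint8_t b = in[i]; b; b >>= 1)
                ref += b & 1;
            if (out[i] != ref)
                return 0;
            total += ref;
        }
        if (popcnt128_sketch(v) != total)
            return 0;
    }
    return 1;
}
```

The masks are built inside the function rather than as file-scope static const objects, because in C a call such as _mm_set1_epi8(0x77) is not a constant initializer; compilers typically hoist the constants anyway.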