
simd

Transpose an 8×8 float matrix using AVX/AVX2

May 3, 2023 by Tarik

I already answered this question in "Fast memory transpose with SSE, AVX, and OpenMP". Let me repeat the solution for transposing an 8×8 float matrix with AVX. Let me know if this is any faster than using 4×4 blocks and _MM_TRANSPOSE4_PS. I used it as a kernel in a larger matrix transpose, which was memory bound …
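The excerpt is cut off before any code appears, so here is a minimal sketch of one widely used way to transpose an 8×8 float block with AVX: unpack adjacent rows, shuffle pairs of the results, then swap 128-bit halves. It may differ from the code in the linked answer. The function name `transpose8x8_avx`, the `AVX_TARGET` macro, and the contiguous row-major layout are assumptions made for illustration.

```c
#include <immintrin.h>

/* Assumption: GCC or Clang. The target attribute lets this one function use
   AVX without building the whole file with -mavx. With other compilers,
   enable AVX through the compiler's own option instead. */
#if defined(__GNUC__) || defined(__clang__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

/* Hypothetical helper: transposes a contiguous row-major 8x8 float matrix
   from src into dst. Unaligned loads and stores are used, so neither pointer
   needs 32-byte alignment. */
AVX_TARGET void transpose8x8_avx(const float *src, float *dst)
{
    __m256 r0 = _mm256_loadu_ps(src + 0 * 8);
    __m256 r1 = _mm256_loadu_ps(src + 1 * 8);
    __m256 r2 = _mm256_loadu_ps(src + 2 * 8);
    __m256 r3 = _mm256_loadu_ps(src + 3 * 8);
    __m256 r4 = _mm256_loadu_ps(src + 4 * 8);
    __m256 r5 = _mm256_loadu_ps(src + 5 * 8);
    __m256 r6 = _mm256_loadu_ps(src + 6 * 8);
    __m256 r7 = _mm256_loadu_ps(src + 7 * 8);

    /* Step 1: interleave elements of adjacent rows within each 128-bit lane. */
    __m256 t0 = _mm256_unpacklo_ps(r0, r1);
    __m256 t1 = _mm256_unpackhi_ps(r0, r1);
    __m256 t2 = _mm256_unpacklo_ps(r2, r3);
    __m256 t3 = _mm256_unpackhi_ps(r2, r3);
    __m256 t4 = _mm256_unpacklo_ps(r4, r5);
    __m256 t5 = _mm256_unpackhi_ps(r4, r5);
    __m256 t6 = _mm256_unpacklo_ps(r6, r7);
    __m256 t7 = _mm256_unpackhi_ps(r6, r7);

    /* Step 2: combine pairs so each lane holds four elements of one column. */
    __m256 u0 = _mm256_shuffle_ps(t0, t2, _MM_SHUFFLE(1, 0, 1, 0));
    __m256 u1 = _mm256_shuffle_ps(t0, t2, _MM_SHUFFLE(3, 2, 3, 2));
    __m256 u2 = _mm256_shuffle_ps(t1, t3, _MM_SHUFFLE(1, 0, 1, 0));
    __m256 u3 = _mm256_shuffle_ps(t1, t3, _MM_SHUFFLE(3, 2, 3, 2));
    __m256 u4 = _mm256_shuffle_ps(t4, t6, _MM_SHUFFLE(1, 0, 1, 0));
    __m256 u5 = _mm256_shuffle_ps(t4, t6, _MM_SHUFFLE(3, 2, 3, 2));
    __m256 u6 = _mm256_shuffle_ps(t5, t7, _MM_SHUFFLE(1, 0, 1, 0));
    __m256 u7 = _mm256_shuffle_ps(t5, t7, _MM_SHUFFLE(3, 2, 3, 2));

    /* Step 3: join matching 128-bit halves to form full columns and store
       them as the rows of the result. 0x20 takes both low halves, 0x31
       takes both high halves. */
    _mm256_storeu_ps(dst + 0 * 8, _mm256_permute2f128_ps(u0, u4, 0x20));
    _mm256_storeu_ps(dst + 1 * 8, _mm256_permute2f128_ps(u1, u5, 0x20));
    _mm256_storeu_ps(dst + 2 * 8, _mm256_permute2f128_ps(u2, u6, 0x20));
    _mm256_storeu_ps(dst + 3 * 8, _mm256_permute2f128_ps(u3, u7, 0x20));
    _mm256_storeu_ps(dst + 4 * 8, _mm256_permute2f128_ps(u0, u4, 0x31));
    _mm256_storeu_ps(dst + 5 * 8, _mm256_permute2f128_ps(u1, u5, 0x31));
    _mm256_storeu_ps(dst + 6 * 8, _mm256_permute2f128_ps(u2, u6, 0x31));
    _mm256_storeu_ps(dst + 7 * 8, _mm256_permute2f128_ps(u3, u7, 0x31));
}
```

To try it, fill a 64-element `float` array, call `transpose8x8_avx(a, b)`, and check that `b[c * 8 + r] == a[r * 8 + c]` for every row `r` and column `c`. The code needs a CPU with AVX support; whether it beats four 4×4 blocks with _MM_TRANSPOSE4_PS is something to measure rather than assume, particularly when the surrounding transpose is memory bound.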

Categories: simd. Tags: avx, avx2, simd
